AI and machine learning on the IC?

Has anyone tried TinyLlama yet? 1.1B parameters, Llama2 architecture and tokenizer, trained on 3 trillion tokens.

Current checkpoint: https://huggingface.co/PY007/TinyLlama-1.1B-step-50K-105b
GitHub: https://github.com/jzhang38/TinyLlama

Since the 4-bit quantized weights of TinyLlama consume only ~550 MB of RAM, I'd imagine the llama.cpp 4-bit quantized version would run nicely inside a canister.
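
For reference, here's a rough back-of-envelope check on that ~550 MB figure, just 1.1B parameters × 4 bits; actual GGUF Q4 files come out slightly larger because each block of weights also stores a scale factor:

```rust
// Back-of-envelope estimate of 4-bit TinyLlama weight size (illustrative only).
fn main() {
    let params: f64 = 1.1e9;     // 1.1B parameters
    let bits_per_weight = 4.0;   // plain 4-bit quantization; GGUF Q4 formats add a small per-block scale overhead
    let bytes = params * bits_per_weight / 8.0;
    println!("~{:.0} MB", bytes / 1e6); // prints ~550 MB
}
```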
