Llama2.c LLM running in a canister!

I'd like to use this thread to share the high-level roadmap I have in mind for icpp-llm, and to keep you posted on progress each time I reach a milestone or hit a blocker.

Milestone 1: Remove the limitations of the current tinystories canister, so it can generate stories longer than 20 words.
This means finding a way to work around the IC's limit on instructions per message.
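One way to stay under a per-message instruction limit is to generate the story in bounded chunks: each update call produces at most a fixed number of tokens and returns, and the client keeps calling until the story is complete. Here is a minimal sketch of that idea; the names (`Generator`, `generate_chunk`, `next_token`) are hypothetical, and a real canister would persist the generator state between calls instead of a plain struct.

```cpp
#include <vector>
#include <cstddef>

// Hypothetical state persisted between update calls.
struct Generator {
  std::vector<int> tokens;  // tokens generated so far
  int target_len = 0;       // total tokens requested
  bool done() const { return static_cast<int>(tokens.size()) >= target_len; }
};

// Stand-in for one forward pass of the model.
int next_token(const Generator &g) { return static_cast<int>(g.tokens.size()); }

// One update call: emit at most `chunk` tokens, then return so the
// message stays within the instruction limit.
void generate_chunk(Generator &g, int chunk) {
  for (int i = 0; i < chunk && !g.done(); ++i)
    g.tokens.push_back(next_token(g));
}
```

The client-side loop is then just "call until done": for a 50-token story with a 20-token chunk size, three calls complete the generation.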

Milestone 2: run inference with memory and matrix calculations distributed across multiple canisters.
For this, I plan to use an HPC-type approach, treating the IC as a massively parallel compute cluster.
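The core operation to distribute is the matrix-vector product. A simple sketch, assuming a row-partitioned scheme: each "worker" (in the real design, a canister reached via inter-canister calls) owns a band of rows and computes its slice of the output, which the orchestrator concatenates. All names here are illustrative, not the actual icpp-llm API.

```cpp
#include <vector>
#include <cstddef>

using Vec = std::vector<float>;
using Mat = std::vector<Vec>;

// Each worker computes the output rows in [row_begin, row_end).
Vec worker_matvec(const Mat &W, const Vec &x,
                  std::size_t row_begin, std::size_t row_end) {
  Vec out(row_end - row_begin, 0.0f);
  for (std::size_t r = row_begin; r < row_end; ++r)
    for (std::size_t c = 0; c < x.size(); ++c)
      out[r - row_begin] += W[r][c] * x[c];
  return out;
}

// The orchestrator fans rows out to n_workers and stitches the
// partial results back together in row order.
Vec distributed_matvec(const Mat &W, const Vec &x, std::size_t n_workers) {
  Vec y;
  const std::size_t rows = W.size();
  for (std::size_t w = 0; w < n_workers; ++w) {
    std::size_t begin = rows * w / n_workers;
    std::size_t end = rows * (w + 1) / n_workers;
    Vec part = worker_matvec(W, x, begin, end);
    y.insert(y.end(), part.begin(), part.end());
  }
  return y;
}
```

Because the partition is by rows, the concatenated result is identical to the single-machine product regardless of how many workers are used.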

Milestone 3: Run inference with the llama2_7b_chat model. Not worrying about speed, just the ability to load it and talk to the LLM.

Milestone 4: Optimize and scale.

This is going to be a fun challenge.