Try it
Deployment to mainnet went smoothly.
icpp_llama2 with the stories15M.bin model is now running on-chain in canister 4c4bn-daaaa-aaaag-abvcq-cai.
You can call its inference endpoint with:
dfx canister call --network ic 4c4bn-daaaa-aaaag-abvcq-cai inference '(record {prompt = "" : text; steps = 20 : nat64; temperature = 0.8 : float32; topp = 1.0 : float32;})'
(
  variant {
    ok = "Once upon a time, there was a little boat named Bob. Bob loved to float on the water"
  },
)
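To try your own prompt, you pass it in the prompt field of the same record. For example (the prompt text below is just illustrative; the other parameter values match the call above):

dfx canister call --network ic 4c4bn-daaaa-aaaag-abvcq-cai inference '(record {prompt = "Once upon a time, there was a dragon" : text; steps = 20 : nat64; temperature = 0.8 : float32; topp = 1.0 : float32;})'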
If you play with the parameters, you will quickly run into the instruction limit. That is an area of investigation right now.