Llama2.c LLM running in a canister!

Try it

Deployment to mainnet went smoothly.

icpp_llama2 with the stories15M.bin model is now running on-chain in canister 4c4bn-daaaa-aaaag-abvcq-cai.

You can call its inference endpoint with:

dfx canister call --network ic 4c4bn-daaaa-aaaag-abvcq-cai inference '(record {prompt = "" : text; steps = 20 : nat64; temperature = 0.8 : float32; topp = 1.0 : float32;})'
(
  variant {
    ok = "Once upon a time, there was a little boat named Bob. Bob loved to float on the water"
  },
)
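
You can also seed the story by passing a non-empty prompt (the prompt text below is just an illustration; with temperature = 0.8 the continuation will vary from call to call):

dfx canister call --network ic 4c4bn-daaaa-aaaag-abvcq-cai inference '(record {prompt = "Once upon a time, there was a dragon" : text; steps = 20 : nat64; temperature = 0.8 : float32; topp = 1.0 : float32;})'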

If you play with the parameters, you will quickly run into the per-message instruction limit. That is an area of investigation right now.
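
For example, raising steps well beyond the default (the exact threshold is part of what is being investigated) is likely to make the call trap on the instruction limit instead of returning a story:

dfx canister call --network ic 4c4bn-daaaa-aaaag-abvcq-cai inference '(record {prompt = "" : text; steps = 256 : nat64; temperature = 0.8 : float32; topp = 1.0 : float32;})'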
