@dfinity/llm on mainnet

I found this package on npm, but I only understand how it works locally: I have to install Ollama and pull a model, and then I can use it. But how the heck does it actually work on mainnet? Does it have, or connect to, an AI model that is already deployed in a canister? There is no explanation there.

Hi @AliSci, if you’d like to try the LLM canister with no setup, check out the example on ICP Ninja here: https://icp.ninja/projects/llm-chatbot. Note that we only have Rust and Motoko examples on ICP Ninja for now.
If you need the JavaScript example, you can find it here.

2 Likes

No, I want to deploy it in my own canister on mainnet.

The LLM canister is already deployed on mainnet under the canister ID w36hm-eqaaa-aaaal-qr76a-cai. Do you want to deploy your own canister as a client of the LLM canister?

1 Like

Yes, how do I do that?

You can check out the examples I shared above to see how to make your canister talk to the LLM canister. You can then download the code from ICP Ninja, or clone the repo to deploy the client to your own canister. Do you have a more specific question, or does that answer your question?
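For reference, here’s a minimal sketch of what such a client canister looks like in Rust, based on the `ic-llm` crate used in those examples. Treat it as illustrative rather than authoritative; the exact signatures may differ between crate versions:

```rust
// A minimal client canister that forwards a question to the LLM
// canister on mainnet via the ic-llm crate. Sketch based on the
// dfinity/llm examples; check the crate docs for your version's API.
use ic_cdk::update;
use ic_llm::Model;

#[update]
async fn ask(question: String) -> String {
    // The crate routes this call to the mainnet LLM canister
    // (w36hm-eqaaa-aaaal-qr76a-cai) under the hood.
    ic_llm::prompt(Model::Llama3_1_8B, question).await
}
```

Since the LLM canister is already live on mainnet, deploying a client like this with dfx should be all that’s needed.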

1 Like

As for using the LLM canister on mainnet, it should “just work” when you deploy the available examples. If you’re curious how it works behind the scenes on mainnet, check out the “How Does it Work?” section in the README here. Note that the canister used for local development is slightly different from the binary actually used on mainnet.

4 Likes

I just tried it, and this is the worst AI I have ever used.

  1. Too slow.
  2. Can’t generate JSON from text.
  3. The task is easy:
    prompt: "I will have a 15-minute call tomorrow at 9 am with David"
    the response I want:
    {
      "start": "2025-06-28T09:00",
      "end": "2025-06-28T09:15",
      "title": "call between ali and david",
      "attendees": ["ali", "david"]
    }
    It can’t do that at all.
    I also have more complex things in mind, like:
  • give it a CV and return JSON with title, description, skills, and a cover letter
  • compare a CV with a job offer

A lot of it is just wrapping components with the goal of reducing counterparty risk. Whether an LLM can do particular things is down to the LLM and how it works. For edge cases, people fine-tune existing models, train new models altogether, or prompt engineer models to produce whatever result they’re looking for.

The LLM canister provides an interface for you to interact with the models it supports.

You get the results of the models it supports out of the box, and you prompt engineer to get whatever results you’re after.

Most large LLMs have broad scope. Even with a very broad model, people can and have prompt engineered great results for different use cases, and people can and have solved prompt-engineering limitations by fine-tuning or even training new models.
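To make that concrete with the failing JSON example above: one common prompt-engineering trick is to constrain the output format explicitly in the prompt. A rough sketch, again using the `ic-llm` crate’s `prompt` call; the schema and instruction wording here are just illustrative assumptions:

```rust
use ic_llm::Model;

/// Asks the model to extract a calendar event as JSON only.
/// Constraining the output format in the prompt often helps
/// smaller models like Llama 3.1 8B produce parseable JSON
/// instead of free-form text.
async fn extract_event(text: String) -> String {
    let prompt = format!(
        "Extract a calendar event from the text below. \
         Respond with ONLY a JSON object, no prose, using exactly \
         these keys: start, end, title, attendees. \
         Use ISO 8601 timestamps.\n\nText: {}",
        text
    );
    ic_llm::prompt(Model::Llama3_1_8B, prompt).await
}
```

Validating the response with a JSON parser in the canister and retrying on failure is another common mitigation when working with smaller models.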

Bro said all the obvious things.

1 Like

I think they released a guide on how to implement the LLM integration and connect to a worker. Earlier we had to do HTTP outcalls to get data transformed for semantic RAG. With this implementation, you can use all the models you need.

2 Likes

It sounds like the bad experience you’re describing relates to the model you’re using, which is likely Llama 3.1 8B. Have you tried some of the other models, like Qwen 32B or Llama 4?
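Switching models is typically a one-line change in the client code. A sketch, assuming the `ic-llm` crate exposes the models above as enum variants; the Qwen variant name here is an assumption, so verify it against the crate’s `Model` enum:

```rust
use ic_llm::Model;

/// Sends the same question to two models for comparison.
async fn compare_models(question: String) -> (String, String) {
    // Model::Llama3_1_8B is the variant shown in the repo README;
    // Model::Qwen3_32B is an assumed variant name based on the
    // models mentioned above -- check the crate docs to confirm.
    let small = ic_llm::prompt(Model::Llama3_1_8B, question.clone()).await;
    let large = ic_llm::prompt(Model::Qwen3_32B, question).await;
    (small, large)
}
```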

Is the LLM on-chain or off-chain?

I want Phi-4, but it’s not available in your canister.

The LLM is currently hosted off-chain. You can read more about this in the README here.

1 Like