The Current State of the Art of AI and Blockchains

About a week before the elections, @diegop Diego Pratt published what must be the only, or one of the very few, honest analyses of how blockchains and AI can collaborate.

If you have not read it, I encourage you to do so here:

Diego focuses on actual blockchains, not just any cloud with AI powers, which makes the discussion especially valuable. He makes three crucial points:

  • On-chain AI is limited by compute power (how many operations per second are available).
  • It is also limited by the on-chain memory available to run current open-source LLMs such as Llama.
  • GPUs are not coming to any blockchain any time soon because of the huge challenge of achieving determinism with GPUs, which are inherently non-deterministic.

I particularly love how he highlighted that the IC can actually run some small LLMs, provided they fit in 4 GB of RAM and do not need much more than 2 billion operations per second; otherwise they will still run, but responses to a prompt will be quite slow.

The following are the LLMs that should run acceptably within those limits, according to ChatGPT:


| Model               | Parameters (approx.) | Context Length | Quantization | Notes                                        | Source |
|---------------------|----------------------|----------------|--------------|----------------------------------------------|--------|
| **BTLM-3B-8K**      | 3 billion           | 8,000 tokens   | 4-bit        | Designed for efficiency, fits within 4 GB RAM | [Sci Fi Logic](https://scifilogic.com) |
| **StableLM-3B-4E1T**| 3 billion           | Variable       | 4-bit        | General NLP, works on low-end hardware       | [Sci Fi Logic](https://scifilogic.com) |
| **TinyLlama-1.1B**  | 1.1 billion         | Variable       | 4-bit        | Optimized for conversation on low-end devices | [Sci Fi Logic](https://scifilogic.com) |
| **phi-1.5**         | 1.3 billion         | Variable       | 4-bit        | Common sense, language understanding, diverse NLP tasks | [Sci Fi Logic](https://scifilogic.com) |
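
As a rough sanity check on why these models fit, here is a back-of-envelope sketch of the weight memory at 4-bit quantization (the headroom left for the KV cache and runtime is my own rough guess, not a measured figure):

```rust
/// Back-of-envelope estimate of LLM weight memory at a given quantization.
fn weight_memory_gb(params_billions: f64, bits_per_weight: f64) -> f64 {
    params_billions * 1e9 * bits_per_weight / 8.0 / 1e9
}

fn main() {
    // A 3B model at 4-bit needs ~1.5 GB for weights, leaving the rest of
    // the 4 GB canister heap for the KV cache and the runtime itself.
    println!("{:.2} GB", weight_memory_gb(3.0, 4.0)); // ~1.50 GB
    println!("{:.2} GB", weight_memory_gb(1.1, 4.0)); // ~0.55 GB (TinyLlama)
}
```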

I hope this is useful to those building AI on the IC.

To those developers building AI agents or solutions right now: what are you using as an LLM?


At Seers AI, we use the latest AI models and HTTPS outcalls to deliver better results. On-chain models are still too expensive and not advanced enough for intelligent agents.
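
In case it is useful to others, here is a minimal sketch of that pattern: a canister method forwarding a prompt to an off-chain LLM API through an HTTPS outcall. This assumes the `ic-cdk` management-canister bindings (whose exact `http_request` signature varies between versions); the endpoint URL, body format, and cycles budget are placeholders, not production values.

```rust
use ic_cdk::api::management_canister::http_request::{
    http_request, CanisterHttpRequestArgument, HttpHeader, HttpMethod,
};

/// Forward a prompt to an off-chain LLM API via an HTTPS outcall.
#[ic_cdk::update]
async fn ask_llm(prompt: String) -> String {
    let request = CanisterHttpRequestArgument {
        url: "https://api.example-llm.com/v1/completions".to_string(), // placeholder
        method: HttpMethod::POST,
        headers: vec![HttpHeader {
            name: "Content-Type".to_string(),
            value: "application/json".to_string(),
        }],
        // A real canister would build this with serde_json.
        body: Some(format!(r#"{{"prompt": {:?}}}"#, prompt).into_bytes()),
        max_response_bytes: Some(2_000_000), // bounds the cycles cost
        transform: None, // see the transform sketch later in the thread
    };
    // The attached cycles must cover the outcall fee; 50B is a rough placeholder.
    match http_request(request, 50_000_000_000).await {
        Ok((response,)) => String::from_utf8_lossy(&response.body).into_owned(),
        Err((code, msg)) => format!("outcall failed: {:?} {}", code, msg),
    }
}
```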


@marcio Makes sense; that way your speed is limited only by the HTTPS outcalls.

How responsive are prompt responses for the user?


Currently, we’re using agents for non-real-time tasks such as summarizing news, creating prediction markets based on those summaries, reading news updates, and resolving markets accordingly. Soon, agents will also be able to trade.

Starting this week, users will have the ability to create agents to perform similar tasks. We’ll begin testing how many agents can run simultaneously and how quickly users can schedule them.

However, even with HTTPS outcalls, the process can become expensive quite quickly. Outcalls are also limited to 30 seconds right now, and they often fail to reach consensus, so for the moment we are using a deduplication and caching server in between. Hopefully, we can get improvements to outcalls soon!
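
For background on the consensus issue: every replica in the subnet performs the outcall independently, and the replies only pass consensus if they are byte-identical, so anything non-deterministic in the response has to be stripped first. That is what the transform query is for. A minimal sketch, assuming the `ic-cdk` types:

```rust
use ic_cdk::api::management_canister::http_request::{HttpResponse, TransformArgs};

/// Each replica runs this query on its own copy of the HTTP response, and
/// consensus is taken over the transformed results. Dropping the headers
/// (which often carry per-request IDs, dates, and tracing tokens) removes
/// the most common source of divergence. Non-deterministic response
/// *bodies* (e.g. sampled LLM output) still won't agree, which is exactly
/// what a dedup/caching server in between fixes.
#[ic_cdk::query]
fn clean_response(args: TransformArgs) -> HttpResponse {
    HttpResponse {
        status: args.response.status,
        headers: vec![], // strip all headers
        body: args.response.body,
    }
}
```

The outcall request then points at this function via its `transform` field.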


Thanks for all those details on how it works. This is the way for sure: a hybrid approach, so that AI is actually powerful and useful inside IC dapps.


Indeed a great summary.

As pointed out, you can now run llama.cpp in a canister.

My experience is that it works well at around 0.5B parameters, like the Qwen 2.5 model.

It is still too slow for instant chat applications, but it is definitely possible to come up with very interesting applications where an instant response is not needed. There are many applications where the LLM inference call is done as a batch job, and it is fine if it takes a while to get an answer.

That can be done today.
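
A minimal sketch of that batch pattern, assuming the `ic-cdk-timers` crate and a hypothetical `run_inference` helper standing in for the actual llama.cpp call:

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::time::Duration;

thread_local! {
    // Prompts waiting for batch inference.
    static QUEUE: RefCell<VecDeque<String>> = RefCell::new(VecDeque::new());
}

/// Users submit prompts and poll for answers later, instead of
/// blocking on a slow synchronous inference call.
#[ic_cdk::update]
fn submit_prompt(prompt: String) {
    QUEUE.with(|q| q.borrow_mut().push_back(prompt));
}

#[ic_cdk::init]
fn init() {
    // Drain one queued prompt per tick; slow is fine for batch work.
    ic_cdk_timers::set_timer_interval(Duration::from_secs(60), || {
        if let Some(prompt) = QUEUE.with(|q| q.borrow_mut().pop_front()) {
            let _answer = run_inference(&prompt); // hypothetical llama.cpp wrapper
            // ...store the answer somewhere the user can query it
        }
    });
}

// Placeholder standing in for the on-chain llama.cpp inference call.
fn run_inference(_prompt: &str) -> String {
    unimplemented!("wire this to the llama.cpp runtime in the canister")
}
```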


Do you know how it compares with HTTPS outcalls in terms of price (cycles)?

I didn’t look into that, but I expect outcalls to be cheaper.
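
For what it's worth, the outcall side can at least be estimated from the published fee formula. A sketch, with the constants taken from my reading of the IC pricing docs (verify against the current documentation before relying on them):

```rust
/// Approximate cycles cost of one HTTPS outcall on a subnet with `n` nodes.
/// Constants are from my reading of the IC pricing docs and may be outdated.
fn outcall_cycles(n: u128, request_bytes: u128, max_response_bytes: u128) -> u128 {
    let base = (3_000_000 + 60_000 * n) * n;
    base + 400 * n * request_bytes + 800 * n * max_response_bytes
}

fn main() {
    // A ~1 KB prompt with a 100 KB response cap on a 13-node subnet:
    let cost = outcall_cycles(13, 1_000, 100_000);
    println!("~{} cycles (~{:.4} XDR)", cost, cost as f64 / 1e12); // ~1.09B cycles
}
```

Whether that beats on-chain inference depends mostly on how large the responses are, since the per-response-byte fee dominates.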
