funnAI: first Proof‑of‑AI‑Work on ICP

@berestovskyy, @dsarlis, @free, @Manu, @ielashi

I am tagging you to make you aware of this effort, and to request your feedback on the architectural decisions we still have to make based on the main-net testing results.

If you have any feedback or thoughts right now, or if there is any data you recommend we collect during the stress-tests, please let us know.

The backend canister architecture of funnAI is summarized by this diagram:

Some important details:

  • The mAIner agent Creator spins up the mAIner agent & Large Language Model (LLM) canisters, installs the code, and then, for the LLM canister, uploads a 644 MB model file (the LLM with all its parameters). The upload of that file is done with inter-canister update calls, using chunks of 1.9 MB.

  • A mAIner agent canister generates responses from the LLM using a sequence of inter-canister update calls. Each update call is able to generate 13 tokens before hitting the instruction limit.

  • The speed of token generation is important but not critical. Most important is that every request gets handled and produces a response within a reasonable amount of time.

  • Based on earlier work with ICGPT & Charles, we found that 4 LLMs per subnet is optimal.

    We had a session with the Scalability & Performance Working Group last year (here & here), which led to that conclusion. We assume this is still the case.
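To make the numbers above concrete, here is a small back-of-the-envelope sketch (plain Python, not canister code) of how many inter-canister update calls the chunked upload and a single response require. The 200-token response length is an assumption for illustration; the exact byte interpretation of "644 MB" and "1.9 MB" is decimal and also an assumption.

```python
import math

# Illustrative parameters from the post (decimal byte sizes are assumptions):
MODEL_SIZE_BYTES = 644_000_000   # 644 MB model file
CHUNK_SIZE_BYTES = 1_900_000     # 1.9 MB per inter-canister update call
TOKENS_PER_CALL = 13             # tokens generated before the instruction limit

def upload_calls_needed(model_size: int, chunk_size: int) -> int:
    """Number of chunked update calls needed to upload the model file."""
    return math.ceil(model_size / chunk_size)

def generation_calls_needed(response_tokens: int) -> int:
    """Number of sequential update calls needed to generate a response."""
    return math.ceil(response_tokens / TOKENS_PER_CALL)

print(upload_calls_needed(MODEL_SIZE_BYTES, CHUNK_SIZE_BYTES))  # 339 chunks
print(generation_calls_needed(200))  # 16 calls for a 200-token response
```

So a single response of a few hundred tokens already means a sequence of a dozen or more update calls, which is why per-subnet concurrency matters so much.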

The architectural decisions we need to make are:

  1. How many mAIners can be created at the same time without bottlenecking the network?

  2. Is it OK to have one mAIner agent Creator, or should we have one per subnet?

  3. How should we configure the mAIner service to support N mAIner agents?

    We plan to go with one mAIner service and 16 LLMs across 4 subnets. We think this is sufficient, but we need to verify the concurrent load it can support.

  4. How many mAIner Agents of type Own (i.e. having their own dedicated LLM canister attached) can run concurrently on one subnet?

    We will initially limit each mAIner of type Own to just one LLM. It is not feasible to limit the number of mAIners of type Own to just 4 per subnet. But what is the maximum number of LLMs that can be running concurrently and still guarantee that a result is produced?
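One way to frame questions 3 and 4 is as a placement/admission problem. The sketch below (plain Python, purely hypothetical, not our actual service code) admits an Own-type mAIner's dedicated LLM onto the least-loaded subnet, subject to a per-subnet cap. The cap of 4 comes from the earlier working-group finding and is exactly the parameter the stress tests need to validate or revise.

```python
from typing import Optional

def place_llm(llm_counts: dict[str, int], cap: int) -> Optional[str]:
    """Return the least-loaded subnet with spare LLM capacity, or None.

    llm_counts maps subnet name -> number of LLM canisters currently running.
    If every subnet is at the cap, return None (the request should be queued).
    """
    subnet = min(llm_counts, key=llm_counts.get)
    if llm_counts[subnet] >= cap:
        return None              # all subnets at capacity: queue the request
    llm_counts[subnet] += 1      # record the placement
    return subnet

# Hypothetical state: 16 LLMs across 4 subnets, one subnet with headroom.
subnets = {"subnet-a": 4, "subnet-b": 3, "subnet-c": 4, "subnet-d": 4}
print(place_llm(subnets, cap=4))  # "subnet-b" (only subnet below the cap)
print(place_llm(subnets, cap=4))  # None: all four subnets are now full
```

The interesting open question is what `cap` can safely be raised to while still guaranteeing that every in-flight generation completes.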

Some of our main concerns:

  • Our biggest concern is running into time-outs or other bottlenecks that prevent the mAIner agent from generating a response. We will queue requests based on this finding from April 2024:

    From here: “I also tried to go to 16 concurrent users but that resulted in timeouts and unsuccessful update calls”

  • Beyond time-outs, what other bottlenecks do you think are possible and we should anticipate?

  • We also want to ensure that our ICP neighbors (other apps running on the subnets) aren’t drastically affected by our workloads. Which measurements would you recommend we take to check for this?
