Technical Working Group: Scalability & Performance

icpp · April 18, 2024, 7:56pm

Thank you for today’s review of my scalability tests for ICGPT.

The most important learning is to limit the number of LLMs per subnet to 4.

This is because a subnet uses 4 threads for update calls, so going above 4 LLMs will not help.
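To illustrate the reasoning, here is a rough toy model in Python. It is not taken from ICGPT or from the actual tests. It assumes that each LLM handles one story at a time, that the 4 update threads cap how many LLMs can make progress at once, and that every story takes the same time (`story_seconds` is a made-up parameter). Real update calls are more complicated, so treat it only as a sketch of why adding LLMs beyond 4 should not reduce the wait.

```python
import math


def effective_parallelism(num_llms: int, update_threads: int = 4) -> int:
    """Number of LLMs that can make progress at the same time.

    Assumption: the subnet runs update calls on `update_threads` threads,
    so LLMs beyond that count only queue up behind the others.
    """
    return min(num_llms, update_threads)


def estimated_max_wait(users: int, num_llms: int, story_seconds: float,
                       update_threads: int = 4) -> float:
    """Toy estimate of the longest wait when `users` ask for a story at once.

    Assumption: each LLM serves one story at a time and every story takes
    `story_seconds` (a hypothetical, constant duration).
    """
    slots = effective_parallelism(num_llms, update_threads)
    waves = math.ceil(users / slots)  # users are served in batches of `slots`
    return waves * story_seconds


if __name__ == "__main__":
    for llms in (1, 2, 4, 8):
        print(llms, "LLMs:", estimated_max_wait(8, llms, story_seconds=10.0))
```

In this model the estimate for 8 LLMs is identical to the one for 4 LLMs, which is the point made above.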

I reran my tests, and I am now indeed getting very consistent & excellent results up to 8 concurrent users.

In this graph I am plotting the max duration of the story generation, i.e. this is the longest a user will have to wait for their story to be completed. The lower the number the better, and the red curve for 4 LLMs is really good!
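For anyone who wants to collect the same metric, the max duration can be measured roughly as in the Python sketch below. This is not the actual test harness: `generate_story` is a hypothetical placeholder for whatever call produces one complete story for one user, and threads stand in for concurrent users.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(generate_story, user_id):
    """Return how long one story generation took, in seconds."""
    start = time.perf_counter()
    generate_story(user_id)  # hypothetical: blocks until the story is complete
    return time.perf_counter() - start


def max_story_duration(generate_story, concurrent_users):
    """Run one story generation per user in parallel.

    Returns (longest duration, list of all durations).
    """
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        durations = list(
            pool.map(lambda uid: timed_call(generate_story, uid),
                     range(concurrent_users))
        )
    return max(durations), durations


if __name__ == "__main__":
    # Stand-in workload: each "story" just sleeps for a short while.
    def fake_story(user_id):
        time.sleep(0.05 * (user_id + 1))

    longest, all_durations = max_story_duration(fake_story, concurrent_users=4)
    print(f"max duration: {longest:.2f}s over {len(all_durations)} users")
```

Using the maximum rather than the average is deliberate: it reports the experience of the slowest user in each run.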

I also tried to go to 16 concurrent users, but that resulted in timeouts and unsuccessful update calls.
