Technical Working Group: Scalability & Performance

Thank you for today’s review of my scalability tests for ICGPT.

The most important learning is to limit the number of LLMs per subnet to 4.

This is because a subnet uses 4 threads for update calls, so going above that will not help.
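
To illustrate why going above 4 does not help, here is a toy back-of-the-envelope model, not the actual test code. It assumes each story keeps one LLM and one update thread busy for a fixed time and that requests are served in waves. The function name and the 10-second story time are made-up placeholders, not measurements.

```python
import math

# From the post: a subnet uses 4 threads for update calls.
UPDATE_THREADS = 4


def estimated_max_wait(concurrent_users: int, num_llms: int, story_seconds: float) -> float:
    """Toy estimate of the longest wait for any user.

    Assumption (hypothetical model): only min(num_llms, UPDATE_THREADS)
    stories can make progress at once, and the rest wait for the next wave.
    """
    parallel = min(num_llms, UPDATE_THREADS)
    waves = math.ceil(concurrent_users / parallel)
    return waves * story_seconds


# 10.0 seconds per story is a placeholder value, not a measured one.
for llms in (1, 2, 4, 8):
    print(llms, "LLMs ->", estimated_max_wait(8, llms, 10.0), "s")
```

In this simplified model, 8 LLMs give the same estimate as 4, because the thread count caps the parallelism.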

I reran my tests, and I am now indeed getting very consistent & excellent results up to 8 concurrent users.

In this graph I am plotting the max duration of the story generation, i.e. this is the longest a user will have to wait for their story to be completed. The lower the number the better, and the red curve for 4 LLMs is really good!
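
For anyone who wants to reproduce the metric, this is a minimal sketch of how a max duration could be measured. It is not my actual test harness: `fake_generate_story` is a hypothetical stand-in that just sleeps, and a real test would replace it with the update call to the canister.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(generate_story, user_id):
    """Return the wall-clock seconds one user's story generation takes."""
    start = time.perf_counter()
    generate_story(user_id)
    return time.perf_counter() - start


def max_story_duration(generate_story, num_users):
    """Run num_users generations concurrently and return the longest duration."""
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        durations = list(pool.map(lambda u: timed(generate_story, u), range(num_users)))
    return max(durations)


def fake_generate_story(user_id):
    # Hypothetical stand-in for the real call: later users take a bit longer.
    time.sleep(0.01 * (user_id + 1))


print(max_story_duration(fake_generate_story, 4))
```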

I also tried going to 16 concurrent users, but that resulted in timeouts and unsuccessful update calls.
