Technical Working Group: Scalability & Performance

Hi @ulan,

I’d like to follow up on this post in the Technical Working Group DeAI. @patnorris mentioned that you had some really exciting news about large improvements to floating-point computations, which could become several times faster.

If you are interested, I can make my test environment available to you. It is in a private repository right now, with bash scripts to:

  • deploy, load & configure the LLMs and the LoadBalancer
  • run the test for different numbers of LLMs and different numbers of concurrent users.

Alternatively, if you can give me early access to the new capabilities, I can run the tests myself and share the results with you.