Hi @ulan,
I’d like to follow up on this post in the Technical Working Group DeAI. @patnorris mentioned that you had some really exciting news about large improvements in floating point computations, which could become several times faster.
If you are interested, I can make my test environment available to you. It is in a private repository right now, with bash scripts to:
- deploy, load & configure the LLMs and the LoadBalancer
- run the tests for different numbers of LLMs and different numbers of concurrent users.
Alternatively, if you can give me early access to the new capabilities, I can run the tests myself and share the results with you.