@branbuilder,
I also want to cross-link this message from @jeshli, which makes a great proposal: optionally charging for query calls in exchange for a big increase in the instructions limit.
I believe this would make it possible for you to run the LLM agents of ELNA on chain.
If it would indeed unblock you, please voice that with DFINITY to put more weight behind this request. The ELNA project is the most visible AI project on the IC, and your support would mean a lot.
EDIT: this proposal would only address the instructions limit. You would probably still be blocked by the memory limits and the lack of GPU support, but perhaps certain smaller LLM agents could already run with just the instructions limit raised.