LLMs running in canisters

Hi, could someone confirm whether this answer is correct? LLMs (which can be summarized as "It doesn't run directly in a canister, but through HTTPS outcalls that reach an off-chain service.") I'd like to start experimenting with building different types of machine learning models and topologies, and I'd like to know whether it's possible to run compiled models directly in a canister, given the limitations on the number of instructions per call, etc. Thanks.
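
For reference, the pattern that answer describes would look roughly like the sketch below: the canister itself doesn't run the model, it delegates inference to an off-chain service via an HTTPS outcall. This is only a minimal sketch assuming ic-cdk's management-canister `http_request` API; the endpoint URL, request body, and cycles amount are hypothetical, and exact signatures vary between ic-cdk versions.

```rust
// Sketch: a canister forwarding a prompt to a hypothetical off-chain
// LLM endpoint via an HTTPS outcall (ic-cdk management canister API).
use ic_cdk::api::management_canister::http_request::{
    http_request, CanisterHttpRequestArgument, HttpHeader, HttpMethod,
};

#[ic_cdk::update]
async fn ask_llm(prompt: String) -> String {
    let request = CanisterHttpRequestArgument {
        // Hypothetical off-chain inference endpoint.
        url: "https://example-llm-service.com/v1/generate".to_string(),
        method: HttpMethod::POST,
        headers: vec![HttpHeader {
            name: "Content-Type".to_string(),
            value: "application/json".to_string(),
        }],
        body: Some(format!("{{\"prompt\":\"{}\"}}", prompt).into_bytes()),
        max_response_bytes: Some(2_000_000),
        // In practice a transform function is usually supplied so all
        // replicas agree on the response during consensus.
        transform: None,
    };
    // Outcalls must be paid for in cycles; the amount here is a placeholder.
    match http_request(request, 50_000_000_000).await {
        Ok((response,)) => String::from_utf8(response.body).unwrap_or_default(),
        Err((code, msg)) => format!("outcall failed: {:?} {}", code, msg),
    }
}
```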