Our current architecture plans to use a smaller conversational model in the canister to handle simple interactions, drawing on custom data stored on-chain (vector embeddings).
For more complex inferences, we integrate with larger external models running off-chain. The canister mainly coordinates querying the API for these models.
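To make the hybrid split concrete, here is a minimal sketch of how such routing could work, written in plain Python for readability. Everything in it is an assumption for illustration: the names `route_query` and `STORE`, the 3-dimensional toy embeddings, and the 0.8 similarity threshold are hypothetical, and this is not ELNA's actual canister code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def route_query(query_embedding, store, threshold=0.8):
    """Decide where a query should be handled (hypothetical policy).

    If the closest stored embedding is similar enough, the small on-chain
    model answers using that entry; otherwise the query is forwarded to a
    larger off-chain model.
    """
    best_key, best_score = None, -1.0
    for key, embedding in store.items():
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None and best_score >= threshold:
        return ("on_chain", best_key)
    return ("off_chain", None)

# Toy stand-in for embeddings kept in canister storage (made-up values).
STORE = {
    "faq_pricing": [1.0, 0.0, 0.0],
    "faq_support": [0.0, 1.0, 0.0],
}

simple_route = route_query([0.9, 0.1, 0.0], STORE)   # close to a stored entry
complex_route = route_query([0.5, 0.5, 0.7], STORE)  # nothing similar on-chain
```

A query that closely matches stored data stays on-chain, while one with no good match falls through to the off-chain path. A real system would need a tuned threshold and an actual embedding model.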
You’re correct that there are still significant limitations around running very large language models fully on-chain due to canister memory constraints. We have plans for methods like knowledge distillation and model quantization to help address this.
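As a rough illustration of why quantization helps with memory limits, the sketch below applies simple symmetric int8 quantization to a list of weights and estimates storage size. It is a generic textbook-style scheme, not necessarily the method ELNA will use, and it does not cover knowledge distillation. The function names and the 15-million-parameter figure are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]

def model_bytes(n_params, bits_per_weight):
    """Storage needed for the weights alone, ignoring any metadata."""
    return n_params * bits_per_weight // 8

weights = [1.27, -0.64, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Hypothetical 15M-parameter model: float32 versus int8 weight storage.
fp32_size = model_bytes(15_000_000, 32)
int8_size = model_bytes(15_000_000, 8)
```

Going from 32-bit floats to 8-bit integers cuts weight storage to a quarter, at the cost of a rounding error of at most half the scale per weight.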
Our roadmap includes continuing to push the boundaries of what’s possible for on-chain inference as canister computing power expands. But our hybrid approach enables customization and personalization today across many use cases by keeping niche training data on-chain while leveraging larger off-chain models where needed.
We are a small team, @Antony, and we are working hard to get ELNA live ASAP.
We are currently getting a huge response from the community, and we will definitely get back to you about early access once it's ready.
Hey @icpp, thank you for reaching out and sharing your awesome project! It’s great to see the pioneering work you’ve done running open-source conversational AI models in canisters using C/C++.
We really appreciate you open sourcing llama2.c - that will be an invaluable resource. Your technical demonstration is an inspiration as we work to bring key components on-chain.
Our current prototype uses Python for workflow orchestration and Motoko for canister logic, but we are very interested in evaluating C/C++ for core inference operations, as you have done. The ability to optimize and compile models natively could be a big help. (We are struggling with WebAssembly limitations.)
We would love to take you up on the offer to provide guidance if we explore going down the C/C++ path in our canisters. Don’t hesitate to reach out if you ever want to brainstorm ideas as we push towards fully on-chain conversational AI together! Thanks again for your pioneering contributions.
Let me know if you need any other details on our roadmap or technical approach. Excited to collaborate with community members like yourself to make decentralized AI a reality!
We would like to have a call with you to discuss and brainstorm further.