This thread discusses llama.cpp on the Internet Computer.
A project funded by the DFINITY Grant: ICGPT V2
The first functioning version is now MIT-licensed open source:
Current status:
It’s been a journey, but a pre-release of ICGPT with a llama.cpp backend is now live on the IC.
ICGPT V2 - final milestone reached
(I also posted this on X)
The grant work is now completed and here is a video summarizing what I created. I want to thank @dfinity for the opportunity & the support.
I also want to thank the #ICP community for the enthusiasm as I shared progress along the way over the past months, and the testing some of you did with the early releases.
Some of you even donated cycles that will keep the Qwen2.5 canister up & running for several months. You are the best.
You can try it out at: https://icgpt.icpp.world
I am very happy with the outcome of this project, and there are big plans to build on top of this foundation. More on this later. But first, some time to celebrate this milestone!
Congrats, super cool! I hope this gets the use and attention it deserves.
btw, I'm playing with a Solana project https://github.com/ai16z/eliza/ which can connect to remote or local LLMs.
Does this have API endpoints so that we could add it as one of the remote models? I can only pigeon-code, so it's hard for me to evaluate how it works.
@superduper,
Thanks for your feedback and for pointing out the eliza project. I will check it out.
About the API:
There are two endpoints, new_chat & run_update. These are Candid-based canister endpoints; the links take you to the Candid service definition, and the README describes how to call them using dfx. The API mirrors the llama.cpp command-line interface, which makes it really easy to test things locally and then use the same arguments when calling the canister.
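To make that concrete, here is a minimal sketch of what such dfx calls could look like. The canister name, the record shape, and the argument values are illustrative assumptions; the Candid service definition and the README are authoritative. The args simply mirror llama.cpp CLI flags:

```bash
# Start a new chat; resets the prompt cache for this conversation.
# (Canister name and record shape are illustrative; see the Candid service definition.)
dfx canister call llama_cpp new_chat '(record { args = vec {"--prompt-cache"; "prompt.cache"} })'

# Generate tokens, passing the same flags you would give llama.cpp on the command line.
dfx canister call llama_cpp run_update '(record { args = vec {
  "--prompt-cache"; "prompt.cache";
  "-p"; "What is the Internet Computer?";
  "-n"; "512"
} })'
```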
We're looking at creating another API that follows the OpenAI completions standard.
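For reference, a request in that style would look roughly like this. The endpoint URL is a placeholder assumption, since this API does not exist yet; only the request shape follows the standard:

```bash
# Hypothetical: what an OpenAI-completions-style call to the canister could look like.
# The URL and model name are placeholders; only the request body follows the standard.
curl https://<canister-endpoint>/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-0.5b-instruct",
    "prompt": "What is the Internet Computer?",
    "max_tokens": 256
  }'
```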
Congratulations on reaching the final milestone! It must have been quite a journey.
We just finished the first milestone of our LLM Marketplace and are getting into an interesting phase. In this phase, I was planning to explore TinyLlama (1.1B parameters) and train it for a niche task.
But looking at your post, I'm having second thoughts, primarily because your model is both smaller than TinyLlama and quantized, yet it still hits the instruction limit. I will experiment and share my learnings.
I'm trying to brainstorm ideas for a tiny task it could be trained on that stays within the instruction limit.
Would appreciate your thoughts around it.
Cheers!
Hi @roger ,
The 1.1B TinyLlama will not fit.
I recommend you select a 0.5B parameter model, like the Qwen 2.5 model I am using, and try to fine-tune that one.
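If it helps, this is roughly how you could grab a quantized 0.5B GGUF to experiment with locally. The repo and file names are assumptions, so check the Hugging Face model card for the quantizations actually published:

```bash
# Download a quantized Qwen2.5 0.5B instruct model in GGUF format.
# Repo and file names are illustrative; verify them on the Hugging Face model card.
huggingface-cli download Qwen/Qwen2.5-0.5B-Instruct-GGUF \
  qwen2.5-0.5b-instruct-q8_0.gguf --local-dir models
```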
Yes, I have been contemplating using Qwen and also exploring a few other lightweight models like SmolLM and DistilGPT2.
While researching these, I'm also trying to close in on a fun use case.
Thank you for the suggestion.