Hi everyone, I’m excited to share DeVinci, the browser-based AI chatbot app served from the Internet Computer. You can chat with the AI model loaded into your browser so your chats remain fully on your device. If you choose to log in, you can also store your chats on the IC and reload them later.
Thank you for giving it a try and sharing this!
I’m currently facing the same issue when I try to integrate bigger AI models into DeVinci and run them on my device (btw, DeVinci currently uses RedPajama-INCITE-Chat-3B-v1-q4f32_0, so a model with roughly 3 billion parameters). My best guess at this point is that the browser/device cannot allocate enough memory for the model (which actually requires quite a lot).
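To put a rough number on "quite a lot", here is a back-of-the-envelope sketch. It assumes that the `q4` in the model name means roughly 4 bits per weight, and it only counts the weights themselves: quantization scales, the KV cache, and activations come on top, so treat the result as a lower bound rather than a measurement. The function names are made up for illustration.

```typescript
// Rough lower bound on the memory needed just to hold quantized weights.
// Assumption: every parameter is stored with `bitsPerWeight` bits.
function estimateWeightBytes(paramCount: number, bitsPerWeight: number): number {
  return (paramCount * bitsPerWeight) / 8;
}

// Convert bytes to GiB (2^30 bytes).
function toGiB(bytes: number): number {
  return bytes / 2 ** 30;
}

// Parameter counts are the rounded figures mentioned in this thread.
const models = [
  { name: "RedPajama 3B, 4-bit", params: 3e9, bits: 4 },
  { name: "7B model, 4-bit", params: 7e9, bits: 4 },
];

for (const m of models) {
  const gib = toGiB(estimateWeightBytes(m.params, m.bits));
  console.log(`${m.name}: at least ~${gib.toFixed(1)} GiB for weights`);
}
```

Under these assumptions the 3B model needs well over 1 GiB for its weights alone, and a 7B model more than twice that, which is why a browser tab competing with other processes for memory can get stuck while loading.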
Do you happen to have many other programs/processes or other tabs in the browser open? They could block needed memory.
Do you know how much RAM the device you’re running this on has? And does it have a GPU?
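If you're unsure, the browser itself can give a rough hint. The sketch below assumes the in-browser model relies on WebGPU, which nobody has confirmed in this thread, and uses two real browser properties: `navigator.gpu` (the WebGPU entry point) and `navigator.deviceMemory` (Chromium only, rounded and capped at 8). The `NavigatorLike` type and `describeDevice` are hypothetical helpers so the logic can run outside a browser too.

```typescript
// Minimal view of the navigator fields this sketch reads.
// Both are optional because not every browser exposes them.
interface NavigatorLike {
  gpu?: unknown;          // WebGPU entry point, if the browser exposes it
  deviceMemory?: number;  // approximate RAM in GiB (Chromium only, capped at 8)
}

interface DeviceHint {
  webGpuExposed: boolean;
  approxMemoryGiB: number | null;
}

// Summarize what the browser reports. Note: an exposed WebGPU API does not
// guarantee a usable adapter; requestAdapter() can still resolve to null.
function describeDevice(nav: NavigatorLike): DeviceHint {
  return {
    webGpuExposed: nav.gpu !== undefined && nav.gpu !== null,
    approxMemoryGiB: nav.deviceMemory ?? null,
  };
}

// In a browser console or page script you could call:
//   describeDevice(navigator as unknown as NavigatorLike)
```

Because `deviceMemory` is capped, a 32 GB machine still reports 8, so this only helps to spot devices that are clearly too small.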
Yes, it is a Ryzen 5 5600 (6c/12t) with 32 GB RAM and an Intel Arc 750 with 8 GB of RAM. I tried the latest Chrome and Canary as well as Brave and Nightly. It's the same on an Intel i7-10700 with only an iGPU and 16 GB RAM: still stuck at the same step. I will try on Linux after I get home and dig a little deeper.
Sorry, my bad. I'm not sure yet whether the cause is WSL2, Docker, or other Windows 11 bloat, but on a fresh Win11 install with only updates and nothing else, it works like a charm. It doesn't even eat as much RAM as I expected. It runs very smoothly on Chrome and Edge.
Great, happy to hear! And thank you for giving it several tries!
Your machine is actually more powerful than mine, so potentially even the bigger models would run. I'll try to integrate the Llama2 model soon and let users choose between the models, so you and others with a similar device can actually try the state-of-the-art models.
I am actually also looking at running Llama 2 locally before tinkering with it and trying to run it in a canister. That has been done already (see "Llama2.c LLM running in a canister!"), but I will take another approach.
Just pushed changes which make the Llama2 model available to power the chat. Under Settings, you can now choose which model you’d like to use (the default is RedPajama 3 billion parameters).
This is the Llama2 model with 7 billion parameters, so it's quite a bit bigger than the default and thus requires a pretty powerful device to run. Anyone who thinks their device is up to the challenge is invited to give it a try.
Yes, getting models to run in a canister is amazing. I hope we can make some exponential steps to soon be able to run models like Llama2 7b and beyond in a canister. I’m not sure which improvements we’d need on the protocol/network level to achieve this.
Hi everyone, if you like you can give the new Mistral 7B model a try on DeVinci now: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/
To use Mistral 7B, you need to log in and select it under User Settings. Then, on the DeVinci AI Assistant tab, click on Initialize to download it (this will take a moment on first download). Once it's downloaded, you can chat with it as usual. Please let me know if you have any feedback. Cheers!
Happy to announce that DeVinci now also works on Android.
If you want to give it a try, you’ll need the latest Chrome Canary browser on your Android device. You can then download the RedPajama 3B LLM (default, so you just need to click on Initialize) and then chat with it fully on your device as usual: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io
If you’ve got limited mobile data, please wait until you’ve got a Wi-Fi connection, as the LLM is over 1 GB in size.