Browser-based AI Chatbot Served From The IC

Hi everyone, I’m excited to share DeVinci, the browser-based AI chatbot app served from the Internet Computer. You chat with an AI model loaded into your browser, so your chats remain fully on your device. If you choose to log in, you can also store your chats on the IC and reload them later.
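
For the technically curious, here’s a rough sketch of what storing chats on the IC looks like from a frontend using @dfinity/agent. The storeChat/getChats interface below is a simplified stand-in for illustration, not DeVinci’s actual API:

```typescript
import { Actor, HttpAgent } from "@dfinity/agent";

// Simplified stand-in for the canister's Candid interface -- the real
// DeVinci API is richer; storeChat/getChats are illustrative only.
const idlFactory = ({ IDL }: any) =>
  IDL.Service({
    storeChat: IDL.Func([IDL.Text], [], []),
    getChats: IDL.Func([], [IDL.Vec(IDL.Text)], ["query"]),
  });

interface ChatStore {
  storeChat: (chat: string) => Promise<void>;
  getChats: () => Promise<string[]>;
}

// In the app, the identity comes from an Internet Identity login via
// @dfinity/auth-client; the anonymous agent here skips that step.
const agent = new HttpAgent({ host: "https://icp0.io" });

const chatStore = Actor.createActor<ChatStore>(idlFactory, {
  agent,
  canisterId: "x6occ-biaaa-aaaai-acqzq-cai", // the app's canister ID as a stand-in
});

// Persist a chat as JSON, and read the stored chats back later.
await chatStore.storeChat(JSON.stringify([{ role: "user", content: "Hi!" }]));
const chats = await chatStore.getChats();
```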

It’s based on Web LLM (https://github.com/mlc-ai/web-llm), which enables LLMs to run entirely inside the browser with no server support. Currently, only Chrome and Edge on desktop are supported (though with increasing WebGPU support, this should change in the future).
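
To give a rough idea of what that looks like in code, here’s a minimal sketch built on the web-llm package. It uses the CreateMLCEngine entry point from more recent web-llm releases, so treat the exact names (including the model ID) as indicative; the API has evolved across versions:

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// WebGPU is the hard requirement -- this is why only Chrome and Edge
// on desktop work at the moment.
if (!("gpu" in navigator)) {
  throw new Error("WebGPU is not available in this browser");
}

// First run downloads the model weights into the browser cache, then
// compiles and runs them on the GPU via WebGPU -- no server involved.
const engine = await CreateMLCEngine("RedPajama-INCITE-Chat-3B-v1-q4f32_0");

// The whole exchange happens on-device; nothing leaves the browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello! What can you do?" }],
});
console.log(reply.choices[0].message.content);
```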

If you like, you can find the code here (https://github.com/patnorris/DecentralizedAIonIC) and give the chat app a try here: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/ :slight_smile: Do you have any feedback? I’d love to hear it! Thank you.


Great, do you have a product that I can try?


Sure thing, you can give it a try here: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/ Please let me know if you have any feedback, thanks :slight_smile:

Great stuff btw, for me it gets stuck at `Fetching param cache[49/51]: 1566MB fetched. 94% completed` even after a few refreshes.

```
index-37d6f91f.js:934 Uncaught (in promise) DOMException: The operation failed for an operation-specific reason
```


Thank you for giving it a try and sharing this!
I’m actually facing the same issue at the moment when I try to integrate bigger AI models into DeVinci and run them on my device (btw, DeVinci currently uses RedPajama-INCITE-Chat-3B-v1-q4f32_0, i.e. a model with ca. 3 billion parameters). My best guess at this point is that the browser/device cannot allocate enough memory for the model, which actually requires quite a lot: at 4-bit quantization, 3 billion parameters come to roughly 1.5 GB of weights alone, which matches the ~1566 MB the param cache tries to fetch.

Do you happen to have many other programs/processes or other browser tabs open? They could be taking up memory the model needs.
Do you know how much RAM the device you’re running this on has? And does it have a GPU?
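
If you’d like to check the GPU side quickly, you can run something like this small diagnostic in the browser console (navigator.deviceMemory is Chrome-only and deliberately capped at 8 GB, so it’s only a rough hint):

```typescript
// Quick diagnostic for the browser console (the casts are only needed
// in a TypeScript project without @webgpu/types installed).
const adapter = await (navigator as any).gpu?.requestAdapter();
if (!adapter) {
  console.log("No WebGPU adapter -- the model cannot run here at all.");
} else {
  // The largest single GPU buffer the page may allocate; the model's
  // weight buffers have to fit within limits like these.
  console.log("maxBufferSize:", adapter.limits.maxBufferSize);
  console.log("maxStorageBufferBindingSize:", adapter.limits.maxStorageBufferBindingSize);
}

// Chrome-only rough RAM hint, deliberately capped at 8 (GB) for privacy.
console.log("deviceMemory:", (navigator as any).deviceMemory);
```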

Yes, it’s a Ryzen 5 5600 (6c/12t) with 32 GB RAM and an Intel Arc A750 with 8 GB of VRAM; I tried the latest Chrome and Canary as well as Brave and Nightly. It’s also the same on an Intel i7-10700 with only an iGPU and 16 GB RAM: still stuck at the same step. I’ll try on Linux after I get home and dig a little deeper.

It most certainly is an “out of memory” issue: DOMException - Web APIs | MDN (https://developer.mozilla.org/en-US/docs/Web/API/DOMException)
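
In MDN’s error table, that exact message (“The operation failed for an operation-specific reason”) is the description of an OperationError, which would fit an allocation failure during the param cache fetch. A small sketch of how the app could surface the error name instead of an uncaught promise rejection (with reload() as a stand-in for whatever call starts the download):

```typescript
import type { MLCEngine } from "@mlc-ai/web-llm";

// Sketch: surface the DOMException's name instead of letting it become
// an uncaught rejection; reload() stands in for the download trigger.
async function loadModel(engine: MLCEngine, modelId: string) {
  try {
    await engine.reload(modelId);
  } catch (e) {
    if (e instanceof DOMException) {
      // Per MDN, "The operation failed for an operation-specific reason"
      // is the description of an OperationError.
      console.error(`${e.name}: ${e.message}`);
    }
    throw e;
  }
}
```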

Sorry, my bad; I’m not sure yet whether it was WSL2 and/or Docker or other Windows 11 bloat. But on a fresh Windows 11 install with only updates and nothing else, it works like a charm. It doesn’t even eat as much RAM as I expected. It works very smoothly on Chrome and Edge.


Great, happy to hear :slight_smile: And thank you for giving it several tries!

Your machine is actually more powerful than mine, so potentially even the bigger models would run. I’ll look into integrating the Llama 2 model soon and letting users choose between the models, so you and others with a similar device can actually try the state-of-the-art models :slight_smile:


I am actually also looking at running Llama 2 locally before tinkering with it and trying to run it in a canister, since that has been done already (see Llama2.c LLM running in a canister!). But I will take another approach.


Just pushed changes that make the Llama 2 model available to power the chat. Under Settings, you can now choose which model you’d like to use (the default is RedPajama with 3 billion parameters).

This is the Llama 2 model with 7 billion parameters, so it’s quite a bit bigger than the default and thus also requires a pretty powerful device to run. Anyone who thinks their device is up to the challenge is invited to give it a try :slight_smile:
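
For the curious: under the hood, switching models essentially means reloading the engine with a different model ID. A rough sketch of what the Settings toggle maps to (the IDs follow web-llm’s prebuilt naming and may differ between releases):

```typescript
// Example model IDs following web-llm's prebuilt naming; the exact
// strings may differ between web-llm releases.
const MODELS = {
  default: "RedPajama-INCITE-Chat-3B-v1-q4f32_0", // ~3B params, ~1.5 GB of weights
  llama2: "Llama-2-7b-chat-hf-q4f32_1",           // ~7B params, much heavier
} as const;

async function switchModel(
  engine: { reload: (id: string) => Promise<void> },
  choice: keyof typeof MODELS
) {
  // reload() drops the current weights and fetches/compiles the new model;
  // on a weaker device, the 7B model can fail right here with out-of-memory.
  await engine.reload(MODELS[choice]);
}
```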

Yes, getting models to run in a canister is amazing. I hope we can take some exponential steps to soon be able to run models like Llama 2 7B and beyond in a canister. I’m not sure yet which improvements we’d need at the protocol/network level to achieve this.


Hi everyone, if you like you can give the new Mistral 7B model a try on DeVinci now: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/
To use Mistral 7B, you need to log in and select it under User Settings. Then, on the DeVinci AI Assistant tab, click Initialize to download it (this will take a moment on the first download); once it’s downloaded, you can chat with it as usual. Please let me know if you have any feedback :slight_smile: Cheers
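
As a side note on the Initialize step: web-llm reports download progress through an init progress callback, which is where status lines like “Fetching param cache[49/51]: …” come from. A sketch (the Mistral model ID here is indicative and may not match the exact prebuilt name):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// The callback receives updates like
// "Fetching param cache[49/51]: 1566MB fetched. 94% completed",
// which is what makes Initialize take a moment on the first download.
const engine = await CreateMLCEngine("Mistral-7B-Instruct-v0.2-q4f16_1", {
  initProgressCallback: (report) => {
    console.log(`${Math.round(report.progress * 100)}%`, report.text);
  },
});
```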


Hi everyone, I just added a bunch of new models:
- Llama-2-13b
- WizardCoder-15B
- WizardMath-7B
- OpenHermes 2.5
- NeuralHermes 2.5

You can give them a try by logging in and then selecting them under User Settings: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/

Please note that the bigger the model, the more RAM is needed :slight_smile:

Enjoy and please let me know if you have any feedback!


Happy to announce that DeVinci now also works on Android :slight_smile:

If you want to give it a try, you’ll need the latest Chrome Canary browser on your Android device. You can then download the RedPajama 3B LLM (it’s the default, so you just need to click Initialize) and chat with it fully on your device as usual: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io

If you’ve got limited mobile data, please wait until you’ve got a Wi-Fi connection, as the LLM is over 1 GB in size :slight_smile:
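
For the curious: an app can also guard against this automatically. Chrome (including on Android) exposes the Network Information API, so a sketch like the following could warn before a large download; support varies by browser, hence the defensive checks:

```typescript
// Network Information API: exposed by Chrome (including on Android);
// other browsers may not have it, hence the defensive access.
const conn = (navigator as any).connection;

// saveData is set when the user enabled data saver; on Android,
// type === "cellular" indicates the device is on mobile data.
const onMobileData = Boolean(conn && (conn.saveData || conn.type === "cellular"));

if (onMobileData && !window.confirm("The model is over 1 GB. Download anyway?")) {
  throw new Error("Download postponed until a Wi-Fi connection is available");
}
```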


Hi all, you can now also use
- Gemma 2B
- TinyLlama 1.1B

in addition to the previous models (e.g. Mistral 7B, RedPajama 3B, several Llama 2 versions).

Happy to hear any feedback you might have! Thanks and best :slight_smile:


And Phi-2 is now available if you’re on Android (currently needs Chrome Canary) :+1:


You can now try out Llama 3 on DeVinci :rocket:

Under Settings, choose Llama 3 8B or 70B on a laptop, or 8B on an Android phone. Best to be on Wi-Fi so you don’t use up your mobile data :slight_smile:

Enjoy and let me know how it performs for you :+1:
