Browser-based AI Chatbot Served From The IC

Hi everyone, I’m excited to share DeVinci, the browser-based AI chatbot app served from the Internet Computer. You chat with an AI model loaded into your browser, so your chats remain fully on your device. If you choose to log in, you can also store your chats on the IC and reload them later.
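
For the technically curious, here’s a rough sketch of what storing chats on the IC looks like from a frontend using @dfinity/agent. The storeChat/getChats interface below is a simplified stand-in for illustration, not DeVinci’s actual API:

```typescript
import { Actor, HttpAgent } from "@dfinity/agent";

// Simplified stand-in for the canister's Candid interface -- the real
// DeVinci API is richer; storeChat/getChats are illustrative only.
const idlFactory = ({ IDL }: any) =>
  IDL.Service({
    storeChat: IDL.Func([IDL.Text], [], []),
    getChats: IDL.Func([], [IDL.Vec(IDL.Text)], ["query"]),
  });

interface ChatStore {
  storeChat: (chat: string) => Promise<void>;
  getChats: () => Promise<string[]>;
}

// In the app, the identity comes from an Internet Identity login via
// @dfinity/auth-client; the anonymous agent here skips that step.
const agent = new HttpAgent({ host: "https://icp0.io" });

const chatStore = Actor.createActor<ChatStore>(idlFactory, {
  agent,
  canisterId: "x6occ-biaaa-aaaai-acqzq-cai", // the app's canister ID as a stand-in
});

// Persist a chat as JSON, and read the stored chats back later.
await chatStore.storeChat(JSON.stringify([{ role: "user", content: "Hi!" }]));
const chats = await chatStore.getChats();
```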

It’s based on Web LLM (https://github.com/mlc-ai/web-llm), which enables LLMs to run entirely inside the browser with no server support. Currently, only Chrome and Edge on desktop are supported (though with increasing WebGPU support, this should change in the future).
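
To give a rough idea of what that looks like in code, here’s a minimal sketch built on the web-llm package. It uses the CreateMLCEngine entry point from more recent web-llm releases, so treat the exact names (including the model ID) as indicative; the API has evolved across versions:

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// WebGPU is the hard requirement -- this is why only Chrome and Edge
// on desktop work at the moment.
if (!("gpu" in navigator)) {
  throw new Error("WebGPU is not available in this browser");
}

// First run downloads the model weights into the browser cache, then
// compiles and runs them on the GPU via WebGPU -- no server involved.
const engine = await CreateMLCEngine("RedPajama-INCITE-Chat-3B-v1-q4f32_0");

// The whole exchange happens on-device; nothing leaves the browser.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello! What can you do?" }],
});
console.log(reply.choices[0].message.content);
```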

If you like, you can find the code here (https://github.com/patnorris/DecentralizedAIonIC) and give the chat app a try here: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/ :slight_smile: Do you have any feedback? I’d love to hear it! Thank you.


Great, do you have a product that I can try?


Sure thing, you can give it a try here: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/ Please let me know if you have any feedback, thanks :slight_smile:

Great stuff btw, for me it gets stuck at `Fetching param cache[49/51]: 1566MB fetched. 94% completed` even after a few refreshes.

```
index-37d6f91f.js:934 Uncaught (in promise) DOMException: The operation failed for an operation-specific reason
```


Thank you for giving it a try and sharing this!
I’m actually facing the same issue at the moment when I try to integrate bigger AI models into DeVinci and run them on my device (btw, DeVinci currently uses RedPajama-INCITE-Chat-3B-v1-q4f32_0, i.e. a model with ca. 3 billion parameters). My best guess at this point is that the browser/device cannot allocate enough memory for the model, which actually requires quite a lot: at 4-bit quantization, 3 billion parameters come to roughly 1.5 GB of weights alone, which matches the ~1566 MB the param cache tries to fetch.

Do you happen to have many other programs/processes or other browser tabs open? They could be taking up memory the model needs.
Do you know how much RAM the device you’re running this on has? And does it have a GPU?
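
If you’d like to check the GPU side quickly, you can run something like this small diagnostic in the browser console (navigator.deviceMemory is Chrome-only and deliberately capped at 8 GB, so it’s only a rough hint):

```typescript
// Quick diagnostic for the browser console (the casts are only needed
// in a TypeScript project without @webgpu/types installed).
const adapter = await (navigator as any).gpu?.requestAdapter();
if (!adapter) {
  console.log("No WebGPU adapter -- the model cannot run here at all.");
} else {
  // The largest single GPU buffer the page may allocate; the model's
  // weight buffers have to fit within limits like these.
  console.log("maxBufferSize:", adapter.limits.maxBufferSize);
  console.log("maxStorageBufferBindingSize:", adapter.limits.maxStorageBufferBindingSize);
}

// Chrome-only rough RAM hint, deliberately capped at 8 (GB) for privacy.
console.log("deviceMemory:", (navigator as any).deviceMemory);
```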

Yes, it’s a Ryzen 5 5600 (6c/12t) with 32 GB RAM and an Intel Arc A750 with 8 GB of VRAM; I tried the latest Chrome and Canary as well as Brave and Nightly. It’s also the same on an Intel i7-10700 with only an iGPU and 16 GB RAM: still stuck at the same step. I’ll try on Linux after I get home and dig a little deeper.

It most certainly is an “out of memory” issue: DOMException - Web APIs | MDN (https://developer.mozilla.org/en-US/docs/Web/API/DOMException)
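
In MDN’s error table, that exact message (“The operation failed for an operation-specific reason”) is the description of an OperationError, which would fit an allocation failure during the param cache fetch. A small sketch of how the app could surface the error name instead of an uncaught promise rejection (with reload() as a stand-in for whatever call starts the download):

```typescript
import type { MLCEngine } from "@mlc-ai/web-llm";

// Sketch: surface the DOMException's name instead of letting it become
// an uncaught rejection; reload() stands in for the download trigger.
async function loadModel(engine: MLCEngine, modelId: string) {
  try {
    await engine.reload(modelId);
  } catch (e) {
    if (e instanceof DOMException) {
      // Per MDN, "The operation failed for an operation-specific reason"
      // is the description of an OperationError.
      console.error(`${e.name}: ${e.message}`);
    }
    throw e;
  }
}
```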

Sorry, my bad; I’m not sure yet whether it was WSL2 and/or Docker or other Windows 11 bloat. But on a fresh Windows 11 install with only updates and nothing else, it works like a charm. It doesn’t even eat as much RAM as I expected. It works very smoothly on Chrome and Edge.


Great, happy to hear :slight_smile: And thank you for giving it several tries!

Your machine is actually more powerful than mine, so potentially even the bigger models would run. I’ll look into integrating the Llama 2 model soon and letting users choose between the models, so you and others with a similar device can actually try the state-of-the-art models :slight_smile:


I am actually also looking at running Llama 2 locally before tinkering with it and trying to run it in a canister, since that has been done already (see Llama2.c LLM running in a canister!). But I will take another approach.


Just pushed changes that make the Llama 2 model available to power the chat. Under Settings, you can now choose which model you’d like to use (the default is RedPajama with 3 billion parameters).

This is the Llama 2 model with 7 billion parameters, so it’s quite a bit bigger than the default and thus also requires a pretty powerful device to run. Anyone who thinks their device is up to the challenge is invited to give it a try :slight_smile:
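
For the curious: under the hood, switching models essentially means reloading the engine with a different model ID. A rough sketch of what the Settings toggle maps to (the IDs follow web-llm’s prebuilt naming and may differ between releases):

```typescript
// Example model IDs following web-llm's prebuilt naming; the exact
// strings may differ between web-llm releases.
const MODELS = {
  default: "RedPajama-INCITE-Chat-3B-v1-q4f32_0", // ~3B params, ~1.5 GB of weights
  llama2: "Llama-2-7b-chat-hf-q4f32_1",           // ~7B params, much heavier
} as const;

async function switchModel(
  engine: { reload: (id: string) => Promise<void> },
  choice: keyof typeof MODELS
) {
  // reload() drops the current weights and fetches/compiles the new model;
  // on a weaker device, the 7B model can fail right here with out-of-memory.
  await engine.reload(MODELS[choice]);
}
```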

Yes, getting models to run in a canister is amazing. I hope we can take some exponential steps to soon be able to run models like Llama 2 7B and beyond in a canister. I’m not sure yet which improvements we’d need at the protocol/network level to achieve this.


Hi everyone, if you like you can give the new Mistral 7B model a try on DeVinci now: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/
To use Mistral 7B, you need to log in and select it under User Settings. Then, on the DeVinci AI Assistant tab, click Initialize to download it (this will take a moment on the first download); once it’s downloaded, you can chat with it as usual. Please let me know if you have any feedback :slight_smile: Cheers
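
As a side note on the Initialize step: web-llm reports download progress through an init progress callback, which is where status lines like “Fetching param cache[49/51]: …” come from. A sketch (the Mistral model ID here is indicative and may not match the exact prebuilt name):

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// The callback receives updates like
// "Fetching param cache[49/51]: 1566MB fetched. 94% completed",
// which is what makes Initialize take a moment on the first download.
const engine = await CreateMLCEngine("Mistral-7B-Instruct-v0.2-q4f16_1", {
  initProgressCallback: (report) => {
    console.log(`${Math.round(report.progress * 100)}%`, report.text);
  },
});
```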


Hi everyone, I just added a bunch of new models:
- Llama-2-13b
- WizardCoder-15B
- WizardMath-7B
- OpenHermes 2.5
- NeuralHermes 2.5

You can give them a try by logging in and then selecting them under User Settings: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io/

Please note that the bigger the model, the more RAM is needed :slight_smile:

Enjoy and please let me know if you have any feedback!


Happy to announce that DeVinci now also works on Android :slight_smile:

If you want to give it a try, you’ll need the latest Chrome Canary browser on your Android device. You can then download the RedPajama 3B LLM (it’s the default, so you just need to click Initialize) and chat with it fully on your device as usual: https://x6occ-biaaa-aaaai-acqzq-cai.icp0.io

If you’ve got limited mobile data, please wait until you’ve got a Wi-Fi connection, as the LLM is over 1 GB in size :slight_smile:
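
For the curious: an app can also guard against this automatically. Chrome (including on Android) exposes the Network Information API, so a sketch like the following could warn before a large download; support varies by browser, hence the defensive checks:

```typescript
// Network Information API: exposed by Chrome (including on Android);
// other browsers may not have it, hence the defensive access.
const conn = (navigator as any).connection;

// saveData is set when the user enabled data saver; on Android,
// type === "cellular" indicates the device is on mobile data.
const onMobileData = Boolean(conn && (conn.saveData || conn.type === "cellular"));

if (onMobileData && !window.confirm("The model is over 1 GB. Download anyway?")) {
  throw new Error("Download postponed until a Wi-Fi connection is available");
}
```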


Hi all, you can now also use
- Gemma 2B
- TinyLlama 1.1B

in addition to the previous models (e.g. Mistral 7B, RedPajama 3B, several Llama 2 versions).

Happy to hear any feedback you might have! Thanks and best :slight_smile:


And Phi-2 is now available if you’re on Android (currently needs Chrome Canary) :+1:


You can now try out Llama 3 on DeVinci :rocket:

Under Settings, choose Llama 3 8B or 70B on a laptop, or 8B on an Android phone. Best to be on Wi-Fi so you don’t use up your mobile data :slight_smile:

Enjoy and let me know how it performs for you :+1:
