How does blob storage fit into the ON-chain narrative?

I was excited to hear what blob storage brings to ICP, seeing that I had a hard time using Caffeine to make a video-sharing app. Uploading videos was slow, and most of the time they wouldn’t even play after a successful upload. Never mind how much this was going to cost once the free alpha is over.

The only thing I’m iffy on is that the blob storage will be off-chain. I know DFINITY will be working on decentralizing it and allowing all canisters to access this blob storage, but I haven’t seen anything about it eventually being moved ON-chain.

Will DFINITY eventually allow large amounts of data to be stored and run 100% ON-chain in a cost-efficient way? If not, does that undermine the ON-chain narrative?

I should also say that I’m very appreciative of and impressed with all the different apps that can already be run on the blockchain thanks to DFINITY. I know they are still talking about bringing other things on-chain, like training AIs, Wasm64, etc. I’d rather have blob storage than no cost-efficient way to store video and other types of large data. Not having it at all would be a huge limitation for Caffeine AI users.

1 Like

In contrast to how some things wrt ICP development have been done in the past, we’re here pushing to:

  • solve real issues that people have - NOW, rather than in 6-12 months
  • iterate with users to improve the product in a way that helps them most, rather than serving them an over-engineered solution as is (in a “take it or leave it” fashion)
  • focus on quality rather than features; start small with good usability, rather than serving a ton of features and fantastic looks
  • be sustainable and scalable, to support the scaling and growth of your apps

These are my short-term objectives. I’m not the final decision maker, but I will fight for the above.

In the first phase, DFINITY will provide and manage the storage. Prices should be reasonable and sustainable long term. So, quite a bit centralized around DFINITY. This will give us the velocity we need to iterate quickly and give users the capacity they need.

It’s too early to talk about the next phase, but one idea is: allow anyone to provide and manage the storage. Storage providers get a slice of the cake.

After that, we build a gateway (or whatever the name ends up being) that will allow us to serve the currently unused storage capacity on ICP nodes (every node has ~30 TB of high-end NVMe storage), possibly with a layer of erasure coding on top to provide the same reliability as current ICP subnets (1/3 of the nodes may fail and we can still keep running).
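
To make the 1/3-failure arithmetic concrete, here is a back-of-envelope sketch. The subnet sizes and the “k data + m parity shards, one shard per node” layout are illustrative assumptions, not a finalized design:

```python
# Rough sketch: how many erasure-coded shards a subnet of n nodes would need to
# tolerate the same < n/3 node failures that ICP consensus tolerates, and what
# the storage overhead would be compared to keeping a full copy on every node.

def erasure_layout(n_nodes: int):
    f = (n_nodes - 1) // 3   # failures tolerated, matching the 1/3 bound
    k = n_nodes - f          # data shards needed to reconstruct a blob
    m = f                    # parity shards (one shard stored per node)
    overhead = n_nodes / k   # bytes stored per original byte
    return f, k, m, overhead

for n in (13, 28, 34):       # illustrative subnet sizes
    f, k, m, overhead = erasure_layout(n)
    print(f"{n} nodes: tolerates {f} failures, "
          f"{k} data + {m} parity shards, {overhead:.2f}x storage overhead")
```

Compare that ~1.4x overhead to the 13x of fully replicating a blob across a 13-node subnet, which is roughly why erasure-coded blob storage can be so much cheaper than regular canister storage.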

We (or at least I) have big plans. But let’s go step by step. And based on the user needs. If on-chain is what people absolutely require, then we’ll prioritize that activity.

Step 1 (right now) is to solve the burning issue: scalable storage for millions of apps made by Caffeine. Everything else is up for discussion.

11 Likes

Thanks for this explanation @sat. This sounds like the smart way to go about it to me. Move fast, iterate, and deliver value while improving based on real feedback. The end goal sounds extremely exciting and I can’t wait.

NICE!!!

Can we have a simple website builder next to Caffeine? Right now Caffeine is not suitable for making websites. A Wix- or Hostinger-like builder is needed; only then do you have a chance to get 1+ million apps in a short timeframe.

What makes you say that?

Have you tried asking for a landing page with a paragraph of placeholder text, and an admin page where a privileged user can sign in and update that text (provide your principal)?

Then iterate from there. Basically build your own website builder from where you can sign in and update stuff.
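
For example, a first prompt could look something like this (the wording and the placeholder principal are just illustrative, not an official template):

```text
Build a landing page with a hero section and one paragraph of placeholder
text. Add an admin page where only the principal <my-principal> can sign
in and edit that paragraph; everyone else just sees the public page.
```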

1 Like

Sounds like common use case tutorials with sample prompts would be very useful for some users.

1 Like

@lorimer beat me to it.

This thing is super powerful. The biggest limitation is what you can think to do.

3 Likes

Appreciate the response and I agree with this approach and a lot of your other ideas.

Dominic has primed the community to be blockchain maximalists, and that’s part of why I’m raising this question. But hey, if we can preserve much of Dominic’s vision, like being decentralized, having data ownership, and tamper-proof smart contracts, then I don’t care if it’s on Web2 or Web3 infrastructure.

We spoke about blob storage a couple of years ago at the ICP lab, and it was floated that the extra node storage could have a kind of pseudo-IPFS gateway. If you need something on a different subnet (or elsewhere on IPFS), you could pin it to your subnet, pull it over, and have access from your canister. You get the hash-based assurance of IPFS, provided you self-check the hash or provide a way for your users to. I always liked that idea.
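
To be concrete about the self-check, something this small already does it (a real IPFS CID is a multihash rather than raw SHA-256 hex, so a production check would decode the CID first; this just shows the shape):

```python
import hashlib

def verify_pinned_blob(data: bytes, expected_sha256_hex: str) -> bool:
    """Recompute the hash of a blob pulled over from another subnet (or an
    IPFS-style gateway) and compare it to the hash you already trust."""
    return hashlib.sha256(data).hexdigest() == expected_sha256_hex

# The expected hash would come from your canister / the original pin request.
blob = b"example payload pulled over from another subnet"
expected = hashlib.sha256(blob).hexdigest()   # stand-in for the trusted hash
assert verify_pinned_blob(blob, expected)
```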

Payment for storage is probably a bit quirky, but solvable. We could really use this for the wasm repo we are building.

The unspoken advantage here is that IPFS gossip and infra are probably a solved problem if the node machines have the capacity.

I must agree with @Henn91 on that front about how cool and useful Caffeine is from a technical perspective.

But if I were a website builder building websites for small businesses (usually marketing-related), I couldn’t build them to my liking, because the tools are (still) missing.

The designs are based on what Caffeine thinks is right, and it’s pretty much impossible to envision a design and have Caffeine recreate it.

Some things that would improve it would be:

  • Supporting images
  • Supporting design files like Figma to recreate from
  • Editing content without having to recode the whole dapp

But maybe a website builder that solves all this can be created via Caffeine; I haven’t tried that yet.

3 Likes

This is precisely the running plan. All blobs are hashed, and a Merkle tree is built over the contents of each blob. The root of the Merkle tree is preserved on chain, in the canister itself, which ensures that nothing can modify the blob without it getting noticed. Pretty much the same functionality as IPFS provides, so blobs stored off chain are still cryptographically protected (by SHA-256).

On top of that, we want to: a) add a gateway in front of the actual storage that behaves very similarly to a regular HTTP server (including range accesses, regular GETs and PUTs, etc.) so that minimal effort is needed to use it, and b) have the gateway store data on regular S3-compatible storage (such as https://www.min.io/) so that anyone can become a storage provider with minimal effort, including DFINITY.
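
To make that concrete, here is a minimal sketch of the idea: split the blob into chunks, hash each chunk, hash pairs upward to a single root, and keep only that 32-byte root in the canister. The chunk size and the odd-node pairing rule are illustrative assumptions, not the exact format that will ship:

```python
import hashlib

CHUNK_SIZE = 2 * 1024 * 1024  # illustrative chunk size, not a fixed spec


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blob: bytes, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Hash each chunk, then hash pairs level by level until one root remains.
    That 32-byte root is what the canister would keep on chain."""
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)] or [b""]
    level = [sha256(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# The canister stores only the root; the gateway holds the blob itself.
# Any change to the off-chain bytes changes the recomputed root and is noticed.
blob = b"\x00" * (5 * 1024 * 1024)        # a 5 MiB dummy blob
onchain_root = merkle_root(blob)
assert merkle_root(blob) == onchain_root  # verification: recompute and compare
```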

The reason we didn’t go with actual IPFS is that we ran some tests, and also got some reports from the broader community, that IPFS can be quite unreliable at times, with long and hard-to-control tail latencies.

Payments will be taken care of by us. I agree it’s tricky, but we’ll try to make it user friendly.

Right after we launch with Caffeine, we’ll allow all other (non-Caffeine) canisters to use the same storage, in a similar way. After all, Caffeine only facilitates building regular canisters.
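
To give a feel for what “in a similar way” could look like from the client side, here is a hypothetical range read against the gateway. The host name and path are made up; only the plain-HTTP-with-Range behaviour described above is the actual plan:

```python
import urllib.request

# Hypothetical gateway endpoint and blob ID -- placeholders, not a real URL.
url = "https://blob-gateway.example.com/blobs/<blob-id>"

# Standard HTTP range request for the first 1 MiB of the blob.
req = urllib.request.Request(url, headers={"Range": "bytes=0-1048575"})
with urllib.request.urlopen(req) as resp:
    assert resp.status in (200, 206)  # 206 Partial Content when ranges are honoured
    first_mib = resp.read()
```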

4 Likes

Thank you so much for clarifying. Since we have it off chain, could there be a possibility on the roadmap for users to choose their own off-chain storage service provider? I’m asking because a lot of our startups have data-localization requirements, so the devs often have to build custom off-chain storage solutions; some have agent-based solutions that verify and log data integrity. If Caffeine could do that, it would be super cool.

anyone can become a storage provider with minimal effort, including DFINITY.

Will providers have to be approved by the NNS? In your first comment you mentioned that the blob storage will eventually also leverage unused storage from IC nodes and possibly use erasure coding; would this also apply to off-chain providers?

Payments will be taken care of by us

Are you planning on decentralizing the payment part too? Or will it always be handled by DFINITY?

2 Likes