You might have seen my previous posts discussion issues about safe upgrades of canisters and zero downtime upgrades, and clearly, our platform isn’t particularly great for developers when it comes to persisting your data across safely and efficiently canister code upgrades. But it should be great!
If we squint a bit, we’ll see that the problems we are facing here aren’t that unique: We have a system with a persistent storage (the stable memory) that is unstructured, flat and kinda slow. We need to store all relevant data in this storage. And we want to be able to make sense of this data even after changing the code of our canister, or possibly even changing the implementation language. This describes basically any computer out there with a disk attached to it! There, this is solved using file systems and/or databases, so maybe we can simply use the same?
TL;DR
The sqlite
in-process database solves most if not all of the problems related to persistent storage, and can be backed by the IC’s stable memory.
The vision
Something like this should work, and the developer shoudn’t need to worry about much else:
(TBD, avoiding 403 error)
Behind the scenes this would use the canister’s stable memory like a raw block device.
Why is this great?
sqlite
solves the problem of mapping structured data onto a flat disk, in a time-tested way.sqlite
is well known to many developers, as it is the most commonly used data base out there. Even mobile phone apps tend to use sqlite for their data storage!- The problem of schema migration, i.e. what to do if your new canister version stores its data differently, is well-known and has plenty of existing solutions to pick from.
sqlite
is used to work with a slow, block-based backend, so backing it with stable memory is sensible, and will likely make good use of the canister main memory as a cache.sqlite
is designed to support large files (>4GB), so you can use large stable memory easily.sqlite
guarantees backwards compatibility, so no worry about losing your data because your data serialization library has changes.- The
sqlite
file format can be analyzed by dedicated tools, so if the IC (or your canister) provides read access to stable memory, this can be used for debugging, backup and archival. sqlite
is available in many programming languages. By storing your data this way, you can in the future switch from one Canister implementation language to another without losing your data.- By syncing at the end of each message, the data in stable memory is complete after each messages. This means you can upgrade your canister without running code in the
canister_preupgrade
hook, greatly reducing the risk of non-upgradeable canisters. - The atomicity guarantees of the Internet Computer may allow you to use
sqlite
without locking and possibly without a journal, simplifying things. (But beware cross-message transactions are possible). - Plenty of tools out there that build on top of sqlite, e.g. ORMs, that you could then use on the IC.
How to get there?
@hassen.saidi has already demonstrated that we can compile the sqlite
library to Wasm and use it on the Internet Computer; there backed by Main memory only (memvfs
), and thus not persisted across upgrades.
Luckily, sqlite
is designed to be cross-platform and allows us to provide our own storage backend code, using the VFS feature. It even has example VFS code, which quite nicely describes our use case:
This file implements a demonstration VFS named “fs” that shows how SQLite can be used on an embedded device that lacks a filesystem. Content is written directly to the underlying media. A VFS derived from this demonstration code could be used by a gadget with a limited amount of flash memory to make SQLite behave as the filesystem for the flash memory on the device.
So we’d have to change this file to use the ic0.stable_*
system API calls in mediaRead()
and mediaWrite()
, wrap it in a Rust crate, maybe add some initialization code, and we are ready to roll!
So, who wants this?
I find the prospect of this pretty appealing, but I currently don’t have a project of my own that needs this, so I shied away from just doing this and instead came here to describe the idea. Maybe someone is now enthused and builds this?
Or maybe ICDevs considers this a worthwhile bounty (ping @skilesare)?
Or maybe the SDK team in the foundation will take this on?