You might have seen my previous posts discussion issues about safe upgrades of canisters and zero downtime upgrades, and clearly, our platform isn’t particularly great for developers when it comes to persisting your data across safely and efficiently canister code upgrades. But it should be great!
If we squint a bit, we’ll see that the problems we are facing here aren’t that unique: We have a system with a persistent storage (the stable memory) that is unstructured, flat and kinda slow. We need to store all relevant data in this storage. And we want to be able to make sense of this data even after changing the code of our canister, or possibly even changing the implementation language. This describes basically any computer out there with a disk attached to it! There, this is solved using file systems and/or databases, so maybe we can simply use the same?
sqlite in-process database solves most if not all of the problems related to persistent storage, and can be backed by the IC’s stable memory.
Something like this should work, and the developer shoudn’t need to worry about much else:
(TBD, avoiding 403 error)
Behind the scenes this would use the canister’s stable memory like a raw block device.
sqlitesolves the problem of mapping structured data onto a flat disk, in a time-tested way.
sqliteis well known to many developers, as it is the most commonly used data base out there. Even mobile phone apps tend to use sqlite for their data storage!
- The problem of schema migration, i.e. what to do if your new canister version stores its data differently, is well-known and has plenty of existing solutions to pick from.
sqliteis used to work with a slow, block-based backend, so backing it with stable memory is sensible, and will likely make good use of the canister main memory as a cache.
sqliteis designed to support large files (>4GB), so you can use large stable memory easily.
sqliteguarantees backwards compatibility, so no worry about losing your data because your data serialization library has changes.
sqlitefile format can be analyzed by dedicated tools, so if the IC (or your canister) provides read access to stable memory, this can be used for debugging, backup and archival.
sqliteis available in many programming languages. By storing your data this way, you can in the future switch from one Canister implementation language to another without losing your data.
- By syncing at the end of each message, the data in stable memory is complete after each messages. This means you can upgrade your canister without running code in the
canister_preupgradehook, greatly reducing the risk of non-upgradeable canisters.
- The atomicity guarantees of the Internet Computer may allow you to use
sqlitewithout locking and possibly without a journal, simplifying things. (But beware cross-message transactions are possible).
- Plenty of tools out there that build on top of sqlite, e.g. ORMs, that you could then use on the IC.
@hassen.saidi has already demonstrated that we can compile the
sqlite library to Wasm and use it on the Internet Computer; there backed by Main memory only (
memvfs), and thus not persisted across upgrades.
This file implements a demonstration VFS named “fs” that shows how SQLite can be used on an embedded device that lacks a filesystem. Content is written directly to the underlying media. A VFS derived from this demonstration code could be used by a gadget with a limited amount of flash memory to make SQLite behave as the filesystem for the flash memory on the device.
So we’d have to change this file to use the
ic0.stable_* system API calls in
mediaWrite(), wrap it in a Rust crate, maybe add some initialization code, and we are ready to roll!
I find the prospect of this pretty appealing, but I currently don’t have a project of my own that needs this, so I shied away from just doing this and instead came here to describe the idea. Maybe someone is now enthused and builds this?
Or maybe the SDK team in the foundation will take this on?