ICSQLite:Sqlite backed by stable memory

We have a system with a persistent storage (the stable memory) that is unstructured, flat and kinda slow. We need to store all relevant data in this storage. And we want to be able to make sense of this data even after changing the code of our canister, or possibly even changing the implementation language.

Problem that developers have on the IC?

1、IC only provides unstructured storage, which is not practical for applications
2、The user wants to index the data, but it is difficult to directly store it on the IC
3、When the canister is upgraded, the user should not be responsible for data loss caused by operation errors
4、After changing the language implemented by the canister, the new language may not understand the original data.
5、Difficult to implement scenarios where multiple applications use the same data.

How to solve?

Luckily, sqlite is designed to be cross-platform and allows us to provide our own storage backend code, using the VFS feature 8.
1、sqlite solves the problem of mapping structured data onto a flat disk, in a time-tested way.
2、sqlite is well known to many developers, as it is the most commonly used data base out there. Even mobile phone apps tend to use sqlite for their data storage!
3、The problem of schema migration, i.e. what to do if your new canister version stores its data differently, is well-known and has plenty of existing solutions to pick from.
4、sqlite is used to work with a slow, block-based backend, so backing it with stable memory is sensible, and will likely make good use of the canister main memory as a cache.
5、sqlite is designed to support large files (>4GB), so you can use large stable memory easily.
6、sqlite guarantees backwards compatibility, so no worry about losing your data because your data serialization library has changes.
7、The sqlite file format can be analyzed by dedicated tools, so if the IC (or your canister) provides read access to stable memory, this can be used for debugging, backup and archival.
8、sqlite is available in many programming languages. By storing your data this way, you can in the future switch from one Canister implementation language to another without losing your data.
9、There are a lot of tools built on sqlite on the market, such as ORM, and you can use them on IC.

More details on ICSQLite

ICSQLite is a SQLite based database on Internet Computer, which provides SDK for developers.
By introducing SQLite into the IC developer ecosystem, the storage and retrieval of structured data are greatly simplified.
Developers can embed it in the Dapp canister, or provide it as a separate database for multiple canisters to call remotely.

ICSQLite mainly includes three modules: rusqlite, sqlite-vfs and ic-sqlite
1、rusqlite(GitHub - froghub-io/rusqlite at wasm32-unknown-unknown)
Added support for wasm32-unknown-unknown feature based on rusqlite library,for packaging in IC environment

2、sqlite-vfs(GitHub - froghub-io/sqlite-vfs: sqlite extend vfs)
Implement VFS (Virtual File System) based on the memory KV data structure

3、ic-sqlite(GitHub - froghub-io/ic-sqlite: Using sqlite on an internet computer)
Provide use case projects of ic-sqlite SDK, including creating tables, inserting data, querying data, deleting data, etc.

Outlook
1、It supports horizontal expansion and is not limited to use in a single canister, so that its data storage capacity can exceed the limit of a single canister
2、SDK development in various languages, such as Mokoto

Thank you for using ICSQLite. You can use canister to deploy ICSQLite independently to provide structured storage for your canister; or refer to the “ic-sqlite” project to embed your own canister to provide storage. As long as you are familiar enough with SQL, it’s easy to get started.

I know very well that it is not perfect at present. I hope you can provide him with trial feedback, bug reports and issues.
I also hope that everyone can discuss the future of ICSQLite.

24 Likes

Hey! Interesting work. I like SQL for analytics.
Can you provide performance test results, pushing this to the limits? Good if you want someone to consider building with it Like - how well will it handle a million records and more complex queries and joins? My pessimistic guess is, when you add a hundred thousand records, some queries will stop working (if it even gets that far)

2 Likes

The only problem I see is that if you do extra processing on the query answer. If that is done in one call, you might exceed the total number of cycles per call. usually, that kind of error indicates that you are doing something extra after getting the data that you should not be doing.

I took a very similar approach but haven’t talked about it much since I had trouble getting it to build with Nix.

2 Likes

Very good suggestion. The performance related test is about to be carried out, and further results will be replied again

1 Like

Yes, this is the main problem encountered. ICSQLite has been optimized. The number of SQL retrieval cycles per time is nearly 40% less than before. We will continue to optimize ICSQLite. The restrictions on the use of ICSQLite have been added to the project README document. I hope you can get feedback on your subsequent use

3 Likes

Cross-posting some info I found on linking C code when targeting Wasm.

tl;dr

The C ABI is different and not guaranteed to work.

paul have you seen this:

https://sqlite.org/wasm/doc/tip/about.md

I have. That talks about emscripten so doesn’t have the ABI issue but also doesn’t help for canisters on the IC (I think)

following this:
https://sqlite.org/wasm/doc/trunk/building.md

used the sqlite3.js and sqlite3.wasm from build

https://vwq6i-2qaaa-aaaal-abiiq-cai.raw.ic0.app/

is this anything significant?

next step
https://sqlite.org/wasm/doc/trunk/persistence.md

this looks promising
https://wicg.github.io/file-system-access/
github

1 Like

I think the difference in the scenario I presented is that it involves Wasm produced by Rust as well as Wasm produced by C, and them calling each other.

ICSQLite Performance Optimization and Benchmarking Report

  • Directly use IC stable storage as data storage layer
  • Fewer instruction sets for single messages
  • It supports the writing of tens of millions of data in a single table and can simply add, delete, modify, and query. However, the more data, the complex queries are still limited by the total number of call cycles. You need to optimize the sql statements according to the use scenarios to avoid full-text scanning.

Performance benchmarks for SQL commands

SQL
commands
performance counter
1w single table data
performance counter
10w single table data
performance counter
50w single table data
performance counter
100w single table data
create table 1194347 1433766 2565609 4066020
create index
(empty table)
884588 1122419 2241730 3601724
count 209847 2995943 15183853 30392494
insert 350256 349635 351731 355381
select
(where primary key)
265363 265960 265345 268112
select
(where index field)
312389 314594 314666 319276
select
(where like field)
178263088 1784671532 limit for single message execution limit for single message execution
update
(where primary key)
385492 389192 391599 394111
update
(where index filed)
239384 237908 237993 240998
delete
(where primary key)
429190 259541 419615 423064

Limitations

Limited by the total number of cycles of a call, if the number of rows retrieved by a single SQL query exceeds a certain amount, the call will crash.

SQL statement suggestions

  • Strictly follow the rules of database optimization
  • Index building must be an empty table
  • Where query must be filtered for primary key or index field
  • Less use NOT,!=,<>,!<,!> NOT EXISTS, NOT IN, NOT LIKE, OR, they will ignore the index and cause a full table scan
11 Likes

FrogHub, a team focused on blockchain and web3, is committed to the development of fully decentralized web3 application services, with the goal of helping developers quickly switch from web2 to web3. At present, it has provided users with mature decentralized storage solutions, decentralized CDNS, decentralized cloud computing, gateway and other easy-to-use products.

At present, the team has more than 20 technical and operational personnel who are deeply engaged in web3.0. They have their own exploration in multiple blockchain ecosystems and have accumulated rich resources and experience.

  1. Buckets is developed by FrogHub based on IPFS to store users’ and enterprises’ data securely and reliably on the Filecoin network using a simple and concise interface or the industry-standard AWS S3 API.
  2. Key Management Service (KMS) allows you to easily create and control keys for encryption operations, providing encryption and re-storage services for data stored on IPFS.
  3. IPFS Gateway service , focused on providing technical and non-technical creators with a fast, easy and reliable way to share content without limit. It does all the heavy lifting for the developer, who can easily bundle the content stored in FrogHub Buckets to IPFS without having to manage his own IPFS node. For NFT creators, it makes it easy to deliver content at scale without any technical experience.
  4. NFT Art , a codeless NFT tool trusted by NFT creators. Creators can use this tool to create their own NFT from materials without contract development experience. And publish it to the decentralized network of ETH and Filecoin.
  5. Filecoin Cloud C2 is designed to provide remote service for zero-knowledge proof algorithms in Filecoin’s computing package, serving some regions and providing 3PiB/ day computing power.

Internet Computing (ICP) will be an important part of our decentralized cloud computing. We plan to develop a full set of cloud functions that are easy to use. ICSQLite is part of that plan.

What is the ICSQLite?

  1. ICSQLite is a cloud SQLite database on Internet Computer and provides SDK for developers.
  2. By bringing SQLite into the IC development ecosystem, which greatly simplifies the storage and retrieval of structured data.
  3. Developers can use it embedded in the Dapp canister, or use it as a cloud database to call it remotely.

Why choose SQLite?

  1. The characteristics of SQLite.
  2. VFS (virtual file system) support.
  3. Rusqlite supports WASM.

ICSQLite Youtube Video is to explain the code implementation of ICSQLite and an example of how to use ICSQLite in the IC environment.

7 Likes

Very interesting, thanks, adding here the github project for ref

2 Likes
#[pre_upgrade]
pub fn pre_upgrade() {
    let state = ic_sqlite::get_storage();
    storage::stable_save((state,)).unwrap();
}

#[post_upgrade]
pub fn post_upgrade() {
    let (s,): (HashMap<String, String>,) = storage::stable_restore().unwrap();
    ic_sqlite::set_storage(s);
}

I was using this code earlier for upgrade, how do I store in the new version or it is stored automatically?

1 Like

previously, my code used to work but looks like you’ve updated the code a lot. I’m getting error like these while querying

Call failed:
Canister: rrkah-fqaaa-aaaaa-aaaaq-cai
Method: stablecoin_list (query)
"Status": "rejected"
"Code": "CanisterError"
"Message": "IC0503: Canister rrkah-fqaaa-aaaaa-aaaaq-cai trapped explicitly: Panicked at 'called `Result::unwrap()` on an `Err` value: SqliteFailure(Error { code: NotADatabase, extended_code: 26 }, Some(\"file is not a database\"))', /Users/pramitgaha/.cargo/registry/src/github.com-1ecc6299db9ec823/ic-sqlite-0.1.0/src/lib.rs:19:9

The error while updating

Error: Failed update call.
Caused by: Failed update call.
  The Replica returned an error: code 5, message: "Canister rrkah-fqaaa-aaaaa-aaaaq-cai trapped explicitly: Panicked at 'Stable vars are not initialized yet', /Users/pramitgaha/.cargo/registry/src/github.com-1ecc6299db9ec823/ic-stable-memory-0.2.6/./src/utils/vars.rs:40:13"

There is a difference in data structure storage between the official version and the test version, so the new version cannot use the old version data. If it has not been released to the chain, it is recommended to discard the old version directly. If you want to restore the use, the following two solutions are recommended:

  1. Provide the old version of github source to import project packaging for solution
[dependencies]
ic-sqlite = { version = "0.1.0", git = "https://github.com/froghub-io/ic-sqlite.git", branch = "hashmap_memory" }
  1. The post_upgrade method re-stores the old data in a new version of the storage method (note: this method has not yet been verified)

You can see which method to choose to solve the current problem, and I will assist you here.

2 Likes

seems there are some conflicts: can not work with diesel…

Building canisters…
Checking for vulnerabilities in rust canisters.
Fetching advisory database from https://github.com/RustSec/advisory-db.git

  Loaded 544 security advisories (from /root/.cargo/advisory-db)
Updating crates.io index
Scanning Cargo.lock for vulnerabilities (149 crate dependencies)

Audit found no vulnerabilities.
Executing: cargo build --target wasm32-unknown-unknown --release -p books_ic_monorepo_backend --locked
Updating git repository https://github.com/froghub-io/ic-sqlite.git
Updating crates.io index
error: failed to select a version for libsqlite3-sys-ic.
… required by package rusqlite-ic v0.28.1
… which satisfies dependency rusqlite = "^0.28" of package ic-sqlite v0.1.0 (https://github.com/froghub-io/ic-sqlite.git?branch=hashmap_memory#d3904576)
… which satisfies git dependency ic-sqlite of package books_ic_monorepo_backend v0.1.0 (/root/projects/books_ic_monorepo/backend/books_ic_monorepo_backend)
versions that meet the requirements ^0.25.0 are: 0.25.0

the package libsqlite3-sys-ic links to the native library sqlite3, but it conflicts with a previous package which links to sqlite3 as well:
package libsqlite3-sys v0.25.1 (/root/projects/books_ic_monorepo/backend/rusqlite/libsqlite3-sys)
… which satisfies path dependency libsqlite3-sys (locked to 0.25.1) of package diesel v2.1.0 (/root/projects/books_ic_monorepo/backend/diesel/diesel)
… which satisfies path dependency diesel (locked to 2.1.0) of package books_ic_monorepo_backend v0.1.0 (/root/projects/books_ic_monorepo/backend/books_ic_monorepo_backend)
Only one package in the dependency graph may specify the same links value. This helps ensure that only one copy of a native library is linked in the final binary. Try to adjust your dependencies so that only one package uses the links =‘libsqlite3-sys-ic’ value. For more information, see Dependency Resolution - The Cargo Book.

failed to select a version for libsqlite3-sys-ic which could resolve this conflict