Canister backup

I’m not sure it’s worth it. It’s still a poor man’s backup solution: I’m currently reading and writing the stable memory page by page (because query payloads are limited in size). I’ve set the page size to about 1 MB, and I also need to base64-encode the strings. Right now every query takes about half a second per page, and restoring the backup will obviously take even longer. So for canisters with a heap of hundreds of megabytes, let alone gigabytes, this won’t really work.
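For reference, a minimal sketch of what such a page query could look like with the Rust CDK and the base64 crate; the function and constant names are illustrative, and the stable-memory API names vary between ic-cdk versions:

use ic_cdk::api::stable;
use ic_cdk_macros::query;

// Hypothetical page size of roughly 1 MB, chosen to stay under the message size limit.
const PAGE_SIZE: u64 = 1024 * 1024;

#[query]
fn fetch_page(page_index: u64) -> String {
    // Stable memory size is reported in 64 KiB WASM pages.
    let total_bytes = stable::stable64_size() * 65536;
    let start = page_index * PAGE_SIZE;
    let len = PAGE_SIZE.min(total_bytes.saturating_sub(start));
    let mut buf = vec![0u8; len as usize];
    stable::stable64_read(start, &mut buf);
    // Base64-encode so the reply survives being printed and copied around.
    base64::encode(&buf)
}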

1 Like

I’m currently reading and writing the stable memory page by page […] I’ve set the page size to about 1Mb

I remember Rick from dscvr saying something along the same lines. I believe they figured out that you start with a larger page size; if the call succeeds you continue, and if not you retry the query with a smaller page size.
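A rough sketch of that idea, with a made-up helper that halves the page size on failure (the fetch closure stands in for whatever actually performs the call):

fn fetch_with_shrinking_pages<E>(
    offset: u64,
    mut page_size: u64,
    fetch: impl Fn(u64, u64) -> Result<Vec<u8>, E>,
) -> Result<Vec<u8>, E> {
    loop {
        match fetch(offset, page_size) {
            Ok(bytes) => return Ok(bytes),
            // Reply too large (or otherwise failed): halve the page and retry,
            // down to some minimum.
            Err(_) if page_size > 64 * 1024 => page_size /= 2,
            Err(e) => return Err(e),
        }
    }
}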

As for converting to base64, is that really necessary? If you go the Rust route, you should be able to query a vec of bytes directly, right?

2 Likes

Yes, that’s what I started with. The problem is that the blob returned by dfx is still encoded (to be printable, I guess), and the encoded form is then larger than what I can pass as a command-line argument to dfx when I use it to restore the state (e.g. on my local replica).

It’s probably worth replacing the shell script with a proper backup client using the Rust agent; then no binary data needs to be pretty-printed.
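For illustration, such a client could look roughly like this, assuming the canister exposes a fetch_page query that returns a blob; the canister id, method name, and builder details are placeholders, and the exact builder API varies between ic-agent versions:

use candid::{Decode, Encode, Principal};
use ic_agent::Agent;

async fn download_backup(canister_id: Principal, pages: u64) -> anyhow::Result<Vec<u8>> {
    let agent = Agent::builder().with_url("https://ic0.app").build()?;
    // For a local replica you would also call agent.fetch_root_key().await?.
    let mut image = Vec::new();
    for page in 0..pages {
        let reply = agent
            .query(&canister_id, "fetch_page")
            .with_arg(Encode!(&page)?)
            .call()
            .await?;
        // The reply arrives as raw bytes, so nothing needs to be pretty-printed.
        image.extend(Decode!(&reply, Vec<u8>)?);
    }
    Ok(image)
}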

It seems that the interface (dump_to_stable, fetch_page, upload_page, load_from_stable) is actually generic enough so that this tool could be used by anyone.

As for safeguards: I’d probably have dump_to_stable bump and return a counter that’s then passed to fetch_page, to prevent you from accidentally downloading half the pages from an older backup and the other half from a newer one. This could happen if you query too soon after the dump, or if some other admin dumps or upgrades while you download. Extra bonus points if that token happens to be a hash of the whole stable memory (which dump_to_stable can calculate while writing); then you can verify the downloaded image. Similarly for uploading.
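A sketch of the counter idea, with illustrative names (the hash variant would additionally return and check a digest computed while writing):

use ic_cdk_macros::{query, update};
use std::cell::Cell;

thread_local! {
    // Bumped on every dump; fetches that present a stale version are rejected.
    static BACKUP_VERSION: Cell<u64> = Cell::new(0);
}

#[update]
fn dump_to_stable() -> u64 {
    // ... serialize the heap into stable memory here,
    // ideally hashing the bytes while writing ...
    BACKUP_VERSION.with(|v| {
        v.set(v.get() + 1);
        v.get()
    })
}

#[query]
fn fetch_page(version: u64, page_index: u64) -> Vec<u8> {
    // Refuse to mix pages from different dumps.
    if version != BACKUP_VERSION.with(|v| v.get()) {
        ic_cdk::trap("stale backup version; call dump_to_stable again");
    }
    // ... read and return the requested page from stable memory ...
    let _ = page_index;
    Vec::new()
}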

2 Likes

Great inputs, thanks a lot @nomeata! If I get a bit more time, I’ll write a small agent-rs based tool to avoid the encoding issues and definitely add the integrity check. Then I’ll open-source it.

2 Likes

It may be worth adding the ability for icx to stream the data from stdin instead of requiring it to be passed as a parameter, for Bash script usability.

Maybe I’m missing something obvious, but why do you serialize the canister state to stable memory first before reading from and writing to it? Why not just operate on the canister state (in wasm linear memory) directly?

Not sure how you imagine this? Literally reading the heap page by page? But the heap might be changing under your feet while you’re doing it, right?

Hey there!
Why don’t you just use another canister for the backup?
Your data is already backed up by at least 7 nodes in the subnet. If you think something bad could happen to it, just flush all the data to another canister.

First, it is just more secure (7 nodes on the IC vs. 1 EC2 instance on AWS, or even your personal PC).
Second, it can be done in a permissionless manner. For example, you could enable your users to back up (or even publish in the first place) their articles if they care about their persistence. You could deploy a personal backup canister for each of them (on demand and not for free), where they could store it.

Moreover, one day you’ll definitely reach the point when your single-canister setup is no longer enough to store all the data your app has. This way you could front-run that situation.
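For illustration, flushing state to a second canister could look roughly like this, assuming the backup canister exposes a store_chunk update method (all names are made up):

use candid::Principal;

async fn flush_to_backup(backup: Principal, data: &[u8]) -> Result<(), String> {
    // Stay well under the ~2 MB inter-canister message limit.
    const CHUNK: usize = 1 << 20;
    for (i, chunk) in data.chunks(CHUNK).enumerate() {
        let _: () = ic_cdk::call(backup, "store_chunk", (i as u64, chunk.to_vec()))
            .await
            .map_err(|(code, msg)| format!("store_chunk failed: {:?} {}", code, msg))?;
    }
    Ok(())
}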

1 Like

At some point there was talk of forking canisters. I don’t remember how that ended up, but it would be insanely useful for backups and for scaling. It is much easier to copy a canister and then delete the first half of the data on one copy and the second half on the other than to do 4 GB / 2 MB = 2000 inter-canister calls to move the data from one to another.

2 Likes

It actually doesn’t matter whether the platform clones the whole canister to another subnet or you do it with inter-canister calls. For the network it is the exact same amount of load.

But inter-canister-call-based cloning is available right now, and it lets you spread that load out over time.

It isn’t the same in cycle costs, though. A one-time fork call would not require pushing all the bits through inter-canister calls, which each have a per-byte and an ingress charge. 2000 × those fees adds up to a good number of cycles.

There is no public price for that, so I won’t argue.
But since it is the same load, I would speculate that it should cost the nodes the same amount of money.

Thanks @jzxchiang, that’s exactly what I would like to do!

But I don’t know where to begin to implement it in a canister…
Could you please provide some kind of short code example in Rust of how to access and read the wasm linear memory? I would be forever grateful :bowing_man:

I might buy you a beer if you come to the European Blockchain Convention in June :call_me_hand:

To be more specific, I have a struct living inside a RefCell that holds all the data I want to back up off-chain, so ideally I would stream it through queries chunk by chunk.

So my guesses at tackling this issue are:

  1. Find a way to access the wasm linear memory and find the boundaries of my struct in order to chunk and retrieve it. This is my preferred solution, as it would imply no extra costly memory allocation in an already resource-constrained environment.
  2. Query it page by page from stable memory, as pointed out earlier, but that would mean data duplication :confused:
  3. Serialize my struct and then retrieve it chunk by chunk (see the sketch after this list). Again, it would imply data duplication, which I’m trying hard to avoid as it could outgrow the memory limit of the canister at some not-so-distant point in the future.
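A minimal sketch of option 3, assuming the state lives in a thread-local RefCell and serializes with serde (ciborium is used here just as an example format; every name is a placeholder):

use ic_cdk_macros::query;
use serde::Serialize;
use std::cell::RefCell;

#[derive(Serialize, Default)]
struct State {
    articles: Vec<String>,
}

thread_local! {
    static STATE: RefCell<State> = RefCell::new(State::default());
}

// Roughly 1 MB per query reply.
const CHUNK: usize = 1 << 20;

#[query]
fn backup_chunk(index: u64) -> Vec<u8> {
    // The serialized copy only lives for the duration of the call,
    // at the cost of re-encoding the whole state for every chunk.
    let mut bytes = Vec::new();
    STATE.with(|s| {
        ciborium::ser::into_writer(&*s.borrow(), &mut bytes).expect("serialization failed")
    });
    bytes
        .chunks(CHUNK)
        .nth(index as usize)
        .map(|c| c.to_vec())
        .unwrap_or_default()
}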

I would be most grateful if someone could give me some insight on this matter :pray:

Hey… If your data is public, you could stream it using the streaming callback and then just curl your state. If it needs to be secret, you could add a secret token to the query string. In fact, I’m going to do this now that I’ve thought about it.
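For illustration, the token check could look roughly like this, following the canister HTTP interface; the real streaming-callback variant would additionally set a streaming strategy on the response, and the token here is obviously a placeholder:

use candid::CandidType;
use ic_cdk_macros::query;
use serde::Deserialize;

#[derive(CandidType, Deserialize)]
struct HttpRequest {
    method: String,
    url: String,
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

#[derive(CandidType)]
struct HttpResponse {
    status_code: u16,
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

#[query]
fn http_request(req: HttpRequest) -> HttpResponse {
    // Placeholder secret; in practice this would be configurable by the controller.
    if !req.url.contains("token=my-secret") {
        return HttpResponse {
            status_code: 403,
            headers: vec![],
            body: b"forbidden".to_vec(),
        };
    }
    HttpResponse {
        status_code: 200,
        headers: vec![(
            "content-type".to_string(),
            "application/octet-stream".to_string(),
        )],
        // ... the serialized state (or its first chunk) goes here ...
        body: Vec::new(),
    }
}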

FWIW, I’ve extended my fork of quill to support arbitrary ingress messages with raw I/O. That means you can use it to read and write stable memory pages as bytes, bypassing the pretty-printing by dfx, and hence read/load up to 2 MB at once.

Here is an example of how to read data:

./qu --pem-file <..> raw <canister_id> <method> --args "<candid_encoded>" --query | ./qu send --yes --raw - > ./data.bin

To load the file back:

./qu --pem-file <..> raw <canister_id> <method> --args-file ./data.bin | IC_URL=http://127.0.0.1:8000 ./qu send --yes -

(Note that this assumes loading into your local replica; that’s why IC_URL is set to localhost, and that’s how I suggest testing the backup, obviously.)

This can now easily be wrapped in a bash script to read the memory page by page and load it back page by page. The next step would be to get rid of Candid serialization altogether, but I’m still working on a patch for the Rust CDK for that.
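A sketch of what skipping Candid could look like using the raw argument/reply calls available in current versions of ic-cdk; the method name and the 8-byte offset convention are hypothetical:

use ic_cdk::api::call::{arg_data_raw, reply_raw};
use ic_cdk::api::stable;

// Exporting the symbol by hand instead of using #[query] keeps the CDK from
// wrapping the argument and the reply in Candid.
#[export_name = "canister_query fetch_page_raw"]
fn fetch_page_raw() {
    // The raw argument bytes carry a little-endian u64 offset into stable memory.
    let arg = arg_data_raw();
    let offset = u64::from_le_bytes(arg[..8].try_into().unwrap());

    let mut buf = vec![0u8; 1 << 20];
    stable::stable64_read(offset, &mut buf);
    // Reply with the raw page bytes, no Candid on either side.
    reply_raw(&buf);
}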

3 Likes

Hey, I shared one of our techniques here at Distrikt.app in another post, if anybody is still interested.

2 Likes

Any news on native canister backups?

1 Like

Hi! There is a community discussion thread on backup and restore: Canister backup and restore [Community Consideration]. We would like to hear your thoughts there.