DSocial is currently one huge canister. I’ve just checked and it’s around 1 GB in storage. I have no backup, so if anything goes wrong (say I make a mistake) I lose everything. We have been growing quite fast in the past few weeks, so I want to make a backup of the data.
I was thinking of building a backup/restore function on the canister that only the owner can run. My concern is that the last time I tried to export a large list from a function, I got a "message too large" error.
Has anyone successfully done this? How does DSCVR or Distrikt do this?
We have two approaches here at Distrikt, and the first one is kinda universal if you already use the linear Stable Memory (SM) during your upgrades. I assume you do, otherwise you’re doomed.
We are using Rust here, so I have no idea if you have the same access to the stable memory.
We take a snapshot of the state of the canister by serializing a data structure that sits behind a RefCell, which we will call state_store in our example here. To do so we call the pre_upgrade hook.
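To make the snapshot step concrete, here is a minimal sketch that runs outside a canister. Everything in it is an assumption for illustration: the Vec<String> state type, the STABLE_MEMORY buffer standing in for real stable memory, the toy length-prefixed serializer standing in for MessagePack, and the set_state/get_state helpers. Only the idea (state behind a RefCell, dumped by pre_upgrade and reloaded by post_upgrade) comes from the description above.

```rust
use std::cell::RefCell;

thread_local! {
    // Canister state behind a RefCell. Vec<String> is a placeholder type.
    static STATE_STORE: RefCell<Vec<String>> = RefCell::new(Vec::new());
    // Stand-in for stable memory (a real canister would write to SM instead).
    static STABLE_MEMORY: RefCell<Vec<u8>> = RefCell::new(Vec::new());
}

// Toy serializer standing in for MessagePack:
// each item is a u32 little-endian length followed by its UTF-8 bytes.
fn serialize(items: &[String]) -> Vec<u8> {
    let mut out = Vec::new();
    for item in items {
        out.extend_from_slice(&(item.len() as u32).to_le_bytes());
        out.extend_from_slice(item.as_bytes());
    }
    out
}

// Inverse of `serialize`; returns None if the bytes are truncated or invalid.
fn deserialize(mut bytes: &[u8]) -> Option<Vec<String>> {
    let mut items = Vec::new();
    while !bytes.is_empty() {
        let mut len_buf = [0u8; 4];
        len_buf.copy_from_slice(bytes.get(..4)?);
        let end = 4 + u32::from_le_bytes(len_buf) as usize;
        items.push(String::from_utf8(bytes.get(4..end)?.to_vec()).ok()?);
        bytes = &bytes[end..];
    }
    Some(items)
}

pub fn set_state(items: Vec<String>) {
    STATE_STORE.with(|s| *s.borrow_mut() = items);
}

pub fn get_state() -> Vec<String> {
    STATE_STORE.with(|s| s.borrow().clone())
}

// Dump the in-heap state into the (simulated) stable memory.
pub fn pre_upgrade() {
    let bytes = STATE_STORE.with(|s| serialize(&s.borrow()));
    STABLE_MEMORY.with(|m| *m.borrow_mut() = bytes);
}

// Reload the in-heap state from the (simulated) stable memory.
pub fn post_upgrade() {
    let items = STABLE_MEMORY
        .with(|m| deserialize(&m.borrow()))
        .expect("snapshot in stable memory is corrupt");
    set_state(items);
}
```

In a real canister the two functions would be registered as the upgrade hooks and would write to and read from stable memory rather than a thread-local buffer.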
We then stream the data off chain with query calls, reading directly from the linear stable memory using ic_cdk::api::stable::stable64_read.
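The chunked download could look roughly like the sketch below. It is not our production code: MockStableMemory, read_chunk and download_backup are hypothetical names, and the in-memory Vec only simulates stable memory so the example runs anywhere. In a canister, read_chunk would be a query method that fills its buffer with stable64_read, and the chunk size would have to stay under the response size limit.

```rust
// Stand-in for the canister's linear stable memory. In a real canister the
// bytes would be read with ic_cdk::api::stable::stable64_read instead.
pub struct MockStableMemory {
    bytes: Vec<u8>,
}

impl MockStableMemory {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }

    /// Hypothetical query-style endpoint: returns up to `max_len` bytes
    /// starting at `offset`, or an empty vector once `offset` is past the end.
    pub fn read_chunk(&self, offset: u64, max_len: u64) -> Vec<u8> {
        let total = self.bytes.len() as u64;
        if offset >= total {
            return Vec::new();
        }
        let end = total.min(offset.saturating_add(max_len));
        self.bytes[offset as usize..end as usize].to_vec()
    }
}

/// Off-chain side: keep requesting chunks until an empty one comes back.
pub fn download_backup(mem: &MockStableMemory, chunk_size: u64) -> Vec<u8> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut backup = Vec::new();
    let mut offset = 0u64;
    loop {
        let chunk = mem.read_chunk(offset, chunk_size);
        if chunk.is_empty() {
            break;
        }
        offset += chunk.len() as u64;
        backup.extend_from_slice(&chunk);
    }
    backup
}
```

Because each call only depends on an offset, a failed request can simply be retried without restarting the whole download.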
For restore, you just do the reverse operation using ic_cdk::api::stable::stable64_write and assign the deserialized data to your state_store global variable.
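The upload direction can be sketched the same way. Again this is an illustration under assumptions: MockWritableMemory, write_chunk and upload_backup are made-up names, and the Vec simulates stable memory. The one detail worth showing is that stable memory is sized in 64 KiB pages, so the write side has to grow it before writing past the current end.

```rust
// Stable memory is allocated in WebAssembly pages of 64 KiB.
const PAGE_SIZE: u64 = 64 * 1024;

// Stand-in for writable stable memory. In a real canister the writes would go
// through ic_cdk::api::stable::stable64_write after growing the memory.
pub struct MockWritableMemory {
    bytes: Vec<u8>,
}

impl MockWritableMemory {
    pub fn new() -> Self {
        Self { bytes: Vec::new() }
    }

    pub fn size_in_pages(&self) -> u64 {
        self.bytes.len() as u64 / PAGE_SIZE
    }

    /// Hypothetical owner-only update endpoint: writes `chunk` at `offset`,
    /// growing the memory by whole pages when the write would not fit.
    pub fn write_chunk(&mut self, offset: u64, chunk: &[u8]) {
        let end = offset + chunk.len() as u64;
        let needed_pages = (end + PAGE_SIZE - 1) / PAGE_SIZE;
        if needed_pages > self.size_in_pages() {
            self.bytes.resize((needed_pages * PAGE_SIZE) as usize, 0);
        }
        self.bytes[offset as usize..end as usize].copy_from_slice(chunk);
    }

    pub fn read(&self, offset: u64, len: u64) -> &[u8] {
        &self.bytes[offset as usize..(offset + len) as usize]
    }
}

/// Off-chain side: push the backup back in fixed-size chunks.
pub fn upload_backup(mem: &mut MockWritableMemory, backup: &[u8], chunk_size: usize) {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut offset = 0u64;
    for chunk in backup.chunks(chunk_size) {
        mem.write_chunk(offset, chunk);
        offset += chunk.len() as u64;
    }
}
```

Note that after the upload the memory is padded with zeros up to the next page boundary, so the reader needs some other way to know where the real data ends.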
We use MessagePack as the serializer instead of the one provided by the IC, because we need to know the exact size of the serialized data in stable memory. That information is currently not available from the IC alone.
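One common way to keep track of the exact size is to store it in front of the payload. The sketch below assumes an 8-byte little-endian length header; that layout, and the frame/unframe names, are illustrative choices rather than a description of our actual format. The payload bytes stand in for the MessagePack output.

```rust
// Assumed layout: [8-byte little-endian payload length][payload][page padding]
const HEADER_LEN: usize = 8;

/// Prefix the serialized payload with its exact length.
pub fn frame(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Recover the payload from page-padded memory.
/// Returns None if the header is missing or claims more bytes than exist.
pub fn unframe(memory: &[u8]) -> Option<&[u8]> {
    let mut header = [0u8; HEADER_LEN];
    header.copy_from_slice(memory.get(..HEADER_LEN)?);
    let len = u64::from_le_bytes(header) as usize;
    let end = HEADER_LEN.checked_add(len)?;
    memory.get(HEADER_LEN..end)
}
```

With a header like this, the off-chain client can read the first 8 bytes, learn how much data to stream, and ignore the zero padding at the end of the last page.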
At some point, if there is too much data to process in a single update call when you call the pre_upgrade hook (to save the state to SM) or the post_upgrade hook (to load the state from SM) directly, it will fail because of the IC's cycles limit. A workaround is to rely on the higher limit that applies during the upgrade process, in order to take advantage of the 40x bonus.
Basically, you force a canister upgrade (using dummy code) to dump the state to SM, and then while the canister is up you stream the backup off chain. Same for restore: you disable the pre_upgrade hook, stream the backup back to SM, and finally upgrade the canister. Disabling the pre_upgrade hook keeps the data on the SM safe, and the post_upgrade hook loads the canister state back from your backup.
Please keep in mind that this method has been tested up to 2 GB of heap memory out of the 4 GB available on a canister. Using it on larger data will most likely result in an un-upgradable canister because of the de/serialisation process, and then you will not be able to backup/restore.
I hope this is comprehensive enough and that it will help the IC community.
We built one for Catalyze: a full backup and restore application. Email me at ray@catalyze.one and we can help out. We need to get this out into the community. We will ask Dfinity about it and the best way to share it.
@jplevyak was so kind to write a best practice article on canister backup & restore from his experience building Factland. It’s not yet linked in the documentation, but I want to share it here:
It would just be a whole lot easier if we could download our canister state. Happy to pay out the wazzu for cycles…it would just save so much time and heartache.