Static Checking of Canister Upgrade Compatibility (formerly "Canister Safe Upgrades")

Thanks for the one-pager. While not the “sexiest” feature, this is by far one of the most important for any production-ready system.

Just to double check: after this change the new version of dfx will call the new version of moc to check that both a canister’s external Candid interface and internal stable variable interface are “safe” (i.e. the new interfaces are subtypes of the old interfaces).

If it’s safe, the upgrade will automatically proceed. If it’s unsafe, then dfx will print the warning emitted by moc, and the developer can choose whether to proceed anyway?

And either way, if the upgrade proceeds but errors out due to a lack of cycles, then the entire upgrade is atomically aborted and no data is lost, including BOTH stable and non-stable variables?

Exactly. If the canister is out of cycles, it’s atomically aborted.

Update:

NNS Motion proposal created: https://dashboard.internetcomputer.org/proposal/31168

If you have an update on the timeline of implementation (right now, it’s TBD on the post), that would be great. I know it’s difficult to estimate these things, but I think this proposal would make a huge difference for developers.

Also, will any of the suggestions @akhilesh.singhania made here be part of this? For example, I’d really like a way to easily download / upload canister state, in case I accidentally screw things up with an upgrade.

Good question. The ETA is at the end of December 2021.

This date is listed in the markdown file in the proposal: nns-proposals/20211123T2300Z.md (in the ic-association/nns-proposals repository on GitHub).

But it was not updated in the summary at the top of the thread because Discourse is not letting me edit the original comment.

Thanks. These bugs with Discourse edits are really annoying, haha. For example, I can’t edit a post if I put code in it.

they drive me crazy, tbh

Hey @jzxchiang. No, I do not believe this work will address the issues I brought up in the post. My understanding is that this work addresses some of the issues with safely upgrading canisters written in Motoko, but not all of them. So most of the issues raised in my post above are not addressed here.

This proposal really just addresses checking the external and internal type safety of an upgrade, not the dynamic points of failure identified here.

Performing the check will not guarantee that an upgrade will succeed, just that if it succeeds, the existing clients won’t experience Candid serialization errors because of an incompatible change of the Candid interface, and that Motoko code won’t inadvertently lose data by dropping a stable variable or changing its type in incompatible ways.

The Candid interface check is applicable to Rust, Motoko and any other canister that uses Candid.
The stable variable check is Motoko-specific.
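
To make the stable variable half of this concrete, here is a rough sketch of the kind of comparison involved. It is a toy model in Python, not what moc actually does: the function name `check_stable_signature`, the variable names and the type strings are all made up for illustration, and it treats any type change as a problem, whereas the real check accepts type changes it considers compatible.

```python
def check_stable_signature(old, new):
    """Compare two {stable variable name: type string} maps.

    Toy model only (an assumption, not moc's algorithm): any dropped
    variable or any change of type string is reported as a problem.
    """
    problems = []
    for name, old_type in old.items():
        if name not in new:
            problems.append(f"stable variable '{name}' was dropped")
        elif new[name] != old_type:
            problems.append(
                f"stable variable '{name}' changed type: {old_type} -> {new[name]}"
            )
    # Variables that only exist in `new` are fine: they start from their initializer.
    return problems


# Hypothetical signatures of the old and the new canister version.
old_sig = {"jobs": "[Job]", "nextId": "Nat"}
new_sig = {"jobs": "[JobV2]", "createdAt": "Int"}

for problem in check_stable_signature(old_sig, new_sig):
    print(problem)
```

The point is only that the comparison is static: it looks at the declared signatures of the two versions and never at the data itself, which is why it cannot catch the dynamic failures mentioned above.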

Perhaps a better title for this post and the NNS proposal would be “Static Checking of Canister Upgrade Compatibility”

It passed!

https://dashboard.internetcomputer.org/proposal/31168

Please note I changed the title of the forum post:

Static Checking of Canister Upgrade Compatibility (formerly “Canister Safe Upgrades”)

Should I be worried?

Maybe, given corona, climate change and people believing conspiracy theories. But not due to that code… 🙂

WARNING!
Candid interface compatibility check failed for canister 'jobs'.
You are making a BREAKING change. Other canisters or frontend clients relying on your canister may stop working.
Method create: func (Job) -> (bool) is not a subtype of func (Job/1) -> (bool)
Do you want to proceed? yes/No

How can I fix this issue?

It is a warning, so you’re free to ignore it. That said, I’d have a thorough look at all the changes (if any) to the Job type (and the types it contains). Also make sure that you have the newest dfx installed.

If you can give us the definition of type Job (the former and current versions), we can take a look at it, too.

I expect what happened here is that the new version of type Job is not a Candid subtype of the previous version Job/1. If Jobs are records, perhaps you added a field of non-optional type. If they are variants (enums), perhaps you removed a case from the variant.

In either case, an old client using the previous interface could wind up sending incompatible data to the new version of the canister, leading to failure.

If there are no existing clients of the canister, or you don’t mind breaking things, it’s fine to ignore the warning.
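
For anyone who wants to see why those two changes trip the check, here is a small toy model in Python. It is only a sketch of a simplified subset of the Candid subtyping rules, not the checker that moc or dfx runs; the `salary` field and every class and function name below are made up for illustration, since the real definition of Job was not posted. The key point is that method arguments are checked in the reverse direction: whatever an old client may send (Job/1) must still be readable as the new argument type (Job).

```python
from dataclasses import dataclass


# Toy representation of a few Candid types (hypothetical, illustration only).
@dataclass(frozen=True)
class Prim:
    name: str            # e.g. "nat", "text", "bool"


@dataclass(frozen=True)
class Opt:
    inner: object


@dataclass(frozen=True)
class Record:
    fields: tuple        # tuple of (field name, type) pairs


@dataclass(frozen=True)
class Variant:
    cases: tuple         # tuple of (case name, payload type) pairs


def is_subtype(sub, sup):
    """Simplified rule: can every value of `sub` be read as a `sup`?"""
    if isinstance(sub, Prim) and isinstance(sup, Prim):
        return sub.name == sup.name
    if isinstance(sub, Opt) and isinstance(sup, Opt):
        return is_subtype(sub.inner, sup.inner)
    if isinstance(sub, Record) and isinstance(sup, Record):
        have = dict(sub.fields)
        for name, wanted in sup.fields:
            if name in have:
                if not is_subtype(have[name], wanted):
                    return False
            elif not isinstance(wanted, Opt):
                # A missing field is only acceptable if the reader treats it as optional.
                return False
        return True   # extra fields in `sub` are simply ignored by the reader
    if isinstance(sub, Variant) and isinstance(sup, Variant):
        allowed = dict(sup.cases)
        return all(
            name in allowed and is_subtype(payload, allowed[name])
            for name, payload in sub.cases
        )
    return False


def method_upgrade_ok(old_arg, old_res, new_arg, new_res):
    """Arguments are contravariant, results covariant."""
    return is_subtype(old_arg, new_arg) and is_subtype(new_res, old_res)


NAT, TEXT, BOOL = Prim("nat"), Prim("text"), Prim("bool")
job_v1 = Record((("id", NAT), ("title", TEXT)))
job_required = Record((("id", NAT), ("title", TEXT), ("salary", NAT)))
job_optional = Record((("id", NAT), ("title", TEXT), ("salary", Opt(NAT))))

print(method_upgrade_ok(job_v1, BOOL, job_required, BOOL))  # False: breaking
print(method_upgrade_ok(job_v1, BOOL, job_optional, BOOL))  # True: compatible
```

In this model, making the new field optional is what turns the breaking change into a compatible one, because an old client that omits the field still sends something the new canister can decode.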

Hi,

Thanks for the reply. I have shared a video link below for your reference. I hope it helps you understand the issue.

https://www.vidline.com/share/V0766HJWDL/b8058b3570cff1583a356639880b5078

Please check and share your thoughts. Thanks again for the support.

It also appears that you changed the data type of your stable var records. First question: does this canister carry production data? If so, you should think about a migration strategy that is able to upgrade the existing stable data. If not, you had better come up with a sound versioning strategy that will spare you from losing data in the future, e.g. a variant type that carries the version and a distinct type for each: { #v1 : [Job_v1]; #v2 : [Job] } or similar.
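
To illustrate the versioned-variant idea, here is a minimal sketch of the migration logic. It is written in Python rather than Motoko purely as a toy model: the `JobV1` and `Job` fields, the `salary` default and the `migrate` helper are all assumptions for illustration, not taken from the actual canister. The stored data carries a version tag, and the upgrade code converts whatever version it finds into the current shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobV1:                        # hypothetical old shape, plays the role of Job_v1
    id: int
    title: str


@dataclass(frozen=True)
class Job:                          # hypothetical current shape
    id: int
    title: str
    salary: Optional[int] = None    # new field, unknown for migrated jobs


def migrate(stored):
    """Convert tagged stable data, ("v1", [...]) or ("v2", [...]), to a list of Job."""
    tag, jobs = stored
    if tag == "v1":
        # Old records get a default for the field they never had.
        return [Job(id=j.id, title=j.title, salary=None) for j in jobs]
    if tag == "v2":
        return list(jobs)
    raise ValueError(f"unknown stable data version: {tag}")


old_state = ("v1", [JobV1(1, "Welder"), JobV1(2, "Baker")])
print(migrate(old_state))
```

Because each version keeps its own type, a later `v3` only adds one more tag and one more conversion branch, and data written by any earlier version can still be read.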

I’m on vacation and don’t have my laptop handy, but these links might help explain what is going on.