🧀 How many slices of Swiss cheese would the community like in their ckERC20 sandwich? Also, ICP Giveaway

If you arrived at this thread hungry in hope of a sandwich, I’m sorry to disappoint. If you came for the ICP giveaway, you’ll need to read to the end (that’s the point :wink:). If you have an opinion about Swiss cheese, and/or are interested in minimising the chances of a ckERC20 Ledger Suite Orchestrator attack in the future, then please read and share your thoughts or questions.

Overview

In this post I’ll introduce my understanding of the ckERC20 Ledger Suite Orchestrator, and why on Earth I’m talking about Swiss cheese… :yum:

  1. I’ll explain why I believe this relatively new canister sets an important precedent in terms of IC NNS infrastructure and security.

  2. I’ll describe why I believe the NNS needs to be capable of doing a better job of protecting this canister (and any similar canisters that follow the same precedent).

    • The potential attack that I’ll describe would not be easy to pull off, and it would rely on a combination of technical know-how, luck and human fallibility (and a lack of Swiss cheese). I’ll explain why I personally believe susceptibility to this attack is significantly increased with this new canister (given the current features provided by the NNS).
  3. I’ll propose a simple means of removing the potential for this attack.

  4. I’ll ask the community to share their thoughts on whether this suggestion sounds worthwhile, or if I’m just being pedantic and need to lay off the Swiss cheese. See here for prior discussion that led to this post.

  5. I’ll endeavour to use the word cheese at every opportunity (please let me know if I missed any good puns). :wink:

I’ll do my best to make this post easy to understand and accessible to all. Please correct me on any technical details if needed.

ckERC20 Ledger Suite Orchestrator

The Ledger Suite Orchestrator is a relatively new NNS canister in the fiduciary subnet that controls the ledger suite canisters of all ckERC20 tokens, including ckUSDC, and in due course should control a large number of other ckERC20 tokens introduced by the community. Basically, it’s an important canister with a lot to be gained by an attacker if they could seize control.

The Ledger Suite Orchestrator has been expertly built by DFINITY to enable new ckERC20 tokens simply by members of the community passing the appropriate configuration to this canister (via an NNS proposal). Using this configuration the Ledger Suite Orchestrator then spawns a ledger suite for that new token and maintains control of those ledger suite canisters (facilitating ledger suite updates in the future via a configuration update proposal to the Ledger Suite Orchestrator). Importantly the WASM that the Ledger Suite Orchestrator is running does not need modifying when adding new tokens (it’s just a configuration update).

System Canister Management

To my understanding, this new system canister sets a precedent in that it’s the first time that the community has been encouraged to submit System Canister Management (SCM) proposals (usually prepared and submitted by DFINITY), in order for the community to introduce new ckERC20 tokens.

SCM proposals are designed to modify the WASM that a system canister is running (the change can be arbitrarily broad in scope, unlike a simple configuration update). The potential damage that can be caused by a bad actor who successfully manages to get an SCM proposal past the community is enormous.

Recall that a new ckERC20 token is just a configuration change to the Ledger Suite Orchestrator - there’s no need to modify the WASM. The proposal that enabled ckUSDC did not modify the WASM (see the canister upgrade history here and notice the unchanging hash for proposal 129750). On the other hand, note that the ckUSDC proposal itself states in the title “Upgrade Nns Canister: vxkom-oyaaa-aaaar-qafda-cai to wasm with hash: 658c5786cf89ce77e58b3c38e01259c9655e20d83caff346cb5e5719c348cb5e”. This is not at all what the proposal is essentially about - the WASM isn’t changed. This is the first sign that an SCM proposal (in its current form) isn’t the best fit for minor configuration updates to system canisters.

The Boy Who Cried Wolf

Aesop’s Fables tell us a little bit about the dangers of repetitively claiming something that’s false (it’s all well and good until the one time it’s true, when there’s danger but no-one expects it). This can also be reframed as inattentional blindness caused by familiarity.

Imagine a scenario in the near future where there are 10s or possibly 100s of ckERC20 tokens controlled by the Ledger Suite Orchestrator. Imagine that a genuine update needs to be applied to all ledger suites (i.e. ledger, index and archive canisters) for each ERC20 token. As far as I understand, now a raft of SCM proposals would be submitted for the Ledger Suite Orchestrator canister, simply to update the Ledger Suite Orchestrator configuration which repoints each ledger suite to the latest version they should be running.

  • 1st post update: @christian has clarified that a single proposal updates all ledger suites (thanks Christian). In that case my concerns lie with the addition of new ledger suites (which was my original concern).
  • 2nd post update: What if the ledger suite config schema is updated to include more fields, requiring more information from the original submitters? In this scenario, I could imagine numerous ledger suite update proposals needing to be submitted by various community members :thinking: In any case…

None of these proposals should modify the WASM that the Ledger Suite Orchestrator itself is running, yet the proposals are required to include the ‘new’ WASM that the Ledger Suite Orchestrator should be running (even though it shouldn’t have changed). This is because the NNS does not provide a means of passing along the configuration without also specifying a new WASM to run.

Note that there are ways to make WASM hashes look very similar (but not identical) to a target hash, despite the WASM itself containing any number of changes (which can be nefarious). Build verification isn’t required for this proposal, because there’s nothing new to build. Instead it’s sufficient to check that the proposed hash is the same as the hash for the WASM that the Ledger Suite Orchestrator is currently running (unless the Ledger Suite Orchestrator itself genuinely needs updating - e.g. 130342). As far as I understand, all it would take would be a well-timed Trojan Horse, and some conditioned complacency and/or fatigue from voters (eyeballing the hashes, relying on uniformly randomly distributed hashes looking very different). This could lead to a very convincing-looking SCM proposal (to make an expected minor configuration update to a single ledger suite) instead nefariously modifying the Orchestrator and seizing control over all ckERC20 ledger suites.
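To make the gap between eyeballing and exact comparison concrete, here is a minimal Python sketch. It is an illustration only, not part of any NNS or DFINITY tooling, and all function names are hypothetical. The running hash is the one quoted in the ckUSDC proposal title above; the lookalike is synthetic (one character changed in the middle) and is not the hash of any real WASM.

```python
# Hash quoted in the ckUSDC proposal title earlier in this post.
RUNNING_HASH = "658c5786cf89ce77e58b3c38e01259c9655e20d83caff346cb5e5719c348cb5e"

HEX_DIGITS = "0123456789abcdef"


def normalise(sha256_hex: str) -> str:
    """Strip whitespace and lowercase so formatting differences don't matter."""
    return sha256_hex.strip().lower()


def wasm_unchanged(proposed_hash: str, running_hash: str) -> bool:
    """Return True only if the two SHA-256 hex digests are identical."""
    proposed, running = normalise(proposed_hash), normalise(running_hash)
    for digest in (proposed, running):
        if len(digest) != 64 or any(c not in HEX_DIGITS for c in digest):
            raise ValueError(f"not a SHA-256 hex digest: {digest!r}")
    return proposed == running


def looks_similar(a: str, b: str, edge: int = 8) -> bool:
    """Mimic a hurried visual check: compare only the first and last few characters."""
    a, b = normalise(a), normalise(b)
    return a[:edge] == b[:edge] and a[-edge:] == b[-edge:]


# Synthetic lookalike: one character changed in the middle (NOT a real WASM hash).
middle = len(RUNNING_HASH) // 2
swapped = "0" if RUNNING_HASH[middle] != "0" else "1"
LOOKALIKE_HASH = RUNNING_HASH[:middle] + swapped + RUNNING_HASH[middle + 1:]

if __name__ == "__main__":
    print(looks_similar(LOOKALIKE_HASH, RUNNING_HASH))   # the eyeball check passes
    print(wasm_unchanged(LOOKALIKE_HASH, RUNNING_HASH))  # the exact check does not
```

A hash whose edges match passes the hurried check and fails the exact one, which is the gap a fatigued reviewer could fall into.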

Note that this attack isn’t specific to updating existing ckERC20 ledger suites. It’s also potentially exploitable in the process of adding new ledger suites (it’s hoped that there’ll be many).

This attack isn’t something I’m concerned about happening imminently, and it’s also one that may never happen. Nonetheless, it’s something I’m concerned about if the number of ckERC20 ledger suites significantly grows (which is the plan). My opinion is that some steps should be taken so that the NNS more effectively protects against this possibility in the future, and thereby make it much easier for voters to spot when something doesn’t look right (i.e. a proposal to modify the WASM of a system canister that simply has no business being modified for the proposed change).

Where does the Swiss Cheese come in?

No, it’s not because IC is a Swiss-based foundation… :wink: The Swiss Cheese model of incident causation is a well-known model for understanding how significant incidents can occur in complex systems.

Putting it simply, layers of Swiss cheese represent layers of protection, but no single layer is perfect; each has vulnerabilities. It’s important to layer these mitigating protection mechanisms, so the vulnerabilities of some processes or mechanisms are compensated by the strengths of others.
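A rough numerical illustration of the layering idea, sketched in Python. The miss rates are made-up numbers chosen only for the arithmetic, and the function name is hypothetical. The sketch assumes each layer fails independently, which is a strong assumption: shared familiarity or fatigue could make several layers fail together.

```python
from math import prod
from typing import Sequence


def chance_all_layers_miss(miss_rates: Sequence[float]) -> float:
    """Probability that every layer misses a problem, ASSUMING independent layers."""
    if not miss_rates:
        raise ValueError("at least one layer is required")
    if any(not 0.0 <= rate <= 1.0 for rate in miss_rates):
        raise ValueError("each miss rate must be a probability between 0 and 1")
    return prod(miss_rates)


if __name__ == "__main__":
    # Illustrative numbers only: every layer misses 1 problem in 10.
    one_layer = chance_all_layers_miss([0.1])
    three_layers = chance_all_layers_miss([0.1, 0.1, 0.1])
    print(f"one layer: {one_layer:.3f}, three layers: {three_layers:.3f}")
```

Under these assumptions each extra layer multiplies the chance of a miss by that layer's own miss rate, so removing a layer never makes the stack safer.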

The opposite approach would be to intentionally leave out or remove layers of cheese (layers of protection) in a hope that it will encourage the other layers to function more effectively. This isn’t something that I would recommend, but something similar has been suggested - and this is what has prompted me to write up this post to get more opinions in the mix.

What am I suggesting?

I’m actually not suggesting anything that hasn’t been done before to help secure system canisters. Consider how the Bitcoin Canister receives configuration updates - it’s a dedicated proposal to execute a specific function on the canister, passing along the configuration payload from the proposal (no WASM update misleadingly indicated or required - this specific attack vector simply isn’t there).

To avoid an endless explosion of NNS functions, I would suggest a single NNS function that’s agnostic to the specific canister and update function to call (this would be specified in the proposal). I’m not necessarily suggesting a need for a new NNS proposal ‘topic’, just an NNS proposal ‘type’ (so it need not burden the community with updating neuron following).
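The difference between the two proposal shapes can be sketched as plain data. This is a hypothetical model in Python, not the NNS's actual types: the class names, field names, method name and payload below are all placeholders invented for illustration. Only the canister ID and the hash come from the proposal title quoted earlier.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class UpgradeCanisterProposal:
    """Models today's SCM shape: always carries a WASM hash, even if unchanged."""
    canister_id: str
    wasm_sha256: str
    upgrade_arg: bytes


@dataclass(frozen=True)
class CallCanisterProposal:
    """Hypothetical generic shape: names a canister and method, carries no WASM."""
    canister_id: str
    method_name: str
    payload: bytes


Proposal = Union[UpgradeCanisterProposal, CallCanisterProposal]


def review_checklist(proposal: Proposal) -> List[str]:
    """List what a reviewer would need to verify for each proposal shape."""
    checks = [f"target canister is {proposal.canister_id}"]
    if isinstance(proposal, UpgradeCanisterProposal):
        checks.append("WASM hash matches the running WASM or a verified build")
        checks.append("upgrade argument matches the described configuration")
    else:
        checks.append(f"method '{proposal.method_name}' is the expected one")
        checks.append("payload matches the described configuration")
    return checks


if __name__ == "__main__":
    orchestrator = "vxkom-oyaaa-aaaar-qafda-cai"
    config_only = CallCanisterProposal(
        canister_id=orchestrator,
        method_name="example_update_config",  # placeholder, not a real method
        payload=b"<encoded configuration>",   # placeholder payload
    )
    for check in review_checklist(config_only):
        print("-", check)
```

Because the generic shape has no WASM field, a configuration-only proposal gives a reviewer no hash to compare and gives an attacker no hash to disguise.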

Another approach would be to build tooling for easier verification, but this would likely be off-chain and involves more uncertainty than simply removing the attack vector.

What does the community think? Is this a threat worth investing a small amount of development time to protect against? Would it also simplify the process of enabling new ckERC20 tokens (not requiring the proposer to build and submit the Ledger Suite Orchestrator WASM even though they’re not changing it)? Would it also make the proposals less misleading? Would it help protect system canisters from future changes that only require configuration updates, and nothing more? Would it make reviewers’ and voters’ lives easier (and safer)?

Thanks for reading, I have the IC’s best interests at heart and that’s my only reason for putting time into this post :pray:


Reading this through once more, I didn’t use the word cheese anywhere near as much as I’d promised. My apologies.

Please note that I posted this as a follow-up to a discussion on another thread. I planted an Easter Egg :hatching_chick: in that thread - 5 ICP and a notional trophy :trophy: to anyone who can find it (first come, first served; claim by identifying the Easter Egg and pointing it out in a reply to this post).

:cheese::eyes::alarm_clock:

10 Likes

There was a slightly odd link in your other post ; ) if maybe that’s where folk should be looking…

1 Like

I’m just going to skip to the end right now and give you a like simply for the fun, creative introduction. Now back to reading the rest of the story.

3 Likes

I agree

This seems like a good way of handling this type of proposal. It could also trigger different tools that have been / will be built for the purpose of automating some of the verifications.

That’s ok. There was plenty of cheese in the introduction. :grin:

It would be best if the proposal explicitly states what is and isn’t changing. Hence, if the WASM isn’t supposed to change for a particular System Canister Management proposal, then it seems the proposal shouldn’t be presented as though it does and the WASM hash shouldn’t be a detail that needs confirmation. Hence, I agree with your concern.

1 Like

Is this Swiss cheese link the Easter Egg?

You pointed it out first so you probably already won :sweat_smile:

This isn’t a direct reply and may be seen as an off topic distraction… BUT I just want to tell the author :pray::love_you_gesture::ok_hand::heart::see_no_evil::sunglasses::handshake::fist::index_pointing_at_the_viewer::heart_hands:

Edit: we do need to hold serious conversations about how many slices of Swiss cheese this community can take though. That is :100:

2 Likes

I can confirm that is not the easter egg, but well done for trying. Once found, there’ll be no question about whether or not it’s the intended easter egg.

Here’s a clue to everyone

Don’t simply look for the cheese… Be the cheese…

Other clues are embedded in the points that I’m making in the above post.

Good luck! If it’s not found in 48 hours I’ll reveal it and keep the ICP :stuck_out_tongue_winking_eye:

Of course not. The whole purpose of the orchestrator is to orchestrate all canisters it controls. It takes one proposal to update all of them.

1 Like

Thanks for clarifying Christian, Iā€™ve updated the post accordingly :pray:

Could you imagine a scenario where a new version of the ledger suite is made available, but introduces a feature and/or is not backward compatible with the old configuration payload used when creating the existing ledger suites? In this scenario, presumably the original proposers would be asked to upgrade their ledger suites individually, and in the process supply the new configuration payloads?

I wonder how many of these replica verifiers really look at the code and hunt for bugs/trojans

DFINITY should release standard proposal formats to avoid unnecessary layers of complexity and details in the proposal

One thing Dfinity does do (speaking as a Dfinity engineer) is to release small changes frequently rather than letting changes accumulate into big releases. Small changes are much easier to audit than big ones.

We have internal reviews, including security reviews where it seems appropriate, but external audit is also appreciated. I would argue that external audit is even essential to keep the system open and decentralized, so we definitely appreciate people like CodeGov reviewers. How deeply they look for trojans I don’t know.

Automated tooling might help. Something like DeepCode (now owned by Snyk). My experience is that automated tooling is not perfect, giving both false positives and false negatives, but robots don’t get tired. Having a robot flag a handful of security concerns for CodeGov or other external auditors to look at for every release might not be a terrible idea. And when the false positives get boring you can complain to Snyk about how they need to improve their tooling. :sweat_smile:

But maybe some CodeGov engineers run auditing tools already anyway. No need to specify a solution if there is already one in place.

2 Likes

A hope in the distant future (here some of us get all dreamy-eyed) is formal verification of the codebase. Here one gets essentially a computer-generated proof that the code behaves in a certain way. For Rust code there are tools such as Prusti. However, Prusti is still a work in progress and formal verification is an immense amount of work. Think multiplying development time by a factor of 10. If and when we have a really solid income stream and many years of development time, maybe we can get the codebase formally verified. But then one could automatically check whether the code proposed in a PR satisfies certain properties specified by the community.

For now, thorough testing and code review (including by external parties) is our most effective defence.

4 Likes

Could you imagine a scenario where a new version of the ledger suite is made available, but introduces a feature and/or is not backward compatible with the old configuration payload used when creating the existing ledger suites?

Not really. In this case we would upgrade all suites first (with a backwards compatible functionality) and only then the orchestrator.

1 Like

Thanks for taking the time to write this up so nicely @Lorimer!

So iiuc, your main point is “why use generic canister upgrades instead of tailored proposals”. I think there are pros and cons for both options.

If we use a special type of proposal to add ckERC20 tokens, then

  • pro: a clear pro is that it would be easier to submit and verify these proposals.
  • con: On the flip side, there are likely more cases that could be simplified with custom proposals, so if we take this path, we may end up with many specific proposal types, which means voters need to be aware of many proposal types, and all these different proposals look different.
  • con: having many proposal types also means a lot of canister-specific logic “leaks” into the NNS.

So I think it’s a question of which of the pros and cons you care more about. In my personal opinion, I care more about keeping canister-specific logic outside of the NNS and keeping canister-related proposals consistent than making this specific type simpler to submit/verify. Curious to hear how others see it.

3 Likes

The thanks go to you for taking the time to read it :blush:

So iiuc, your main point is “why use generic canister upgrades instead of tailored proposals”

That’s a potential approach but I don’t like it either, and it’s not what I would suggest unless it were the only way (I don’t think it is). What I’m suggesting is…

a single NNS function that’s agnostic to the specific canister and update function to call (this would be specified in the proposal)

Could you see this working? I think this addresses the cons you’ve listed, leaving only the pros (which include a much smaller attack surface).

1 Like

Doesn’t this assume that DFINITY would be in a position to correctly update the config for every single ledger suite appropriately? What if the config schema is updated to include more information about the token that only the original proposer is suited to providing?

1 Like

24 hours left to find the Easter Egg :cheese:…

I’m doubling the giveaway to 10 ICP. Also, you can enlist friends to help you search for it by tagging them in a comment on that thread. If someone who was tagged finds the Easter Egg first, the 10 ICP will be split between them and the person who tagged them. Happy hunting! You’ll kick yourself once it’s revealed…

1 Like

Each of your points is the cheese.

So the ckERC20 Ledger Suite Orchestrator is the 1st layer, System Canister Management is the 2nd, and The Boy Who Cried Wolf, or more specifically the WASM hashes, is the 3rd layer. Each of these is what you are suggesting is represented in the Swiss Cheese Model for the IC.

This is my best guess. Is this correct?

1 Like