Long Term R&D: Formal Verification (proposal)

diegop · December 7, 2021, 3:43am

1. Summary

This is a motion proposal for the long-term R&D of the DFINITY Foundation, as part of the follow-up to this post: Motion Proposals on Long Term R&D Plans (Please read this post for context).

This project’s objective

This project is about machine-checked verification of the IC’s properties. Examples include:

a. Machine-checked verification of NNS on abstract level
b. Model based testing of NNS canisters via reference implementations
c. Machine-checked verification the Wasm code of NNS canisters adhere to formally specified properties
d. Machine-check interface aspects of the IC abstractly (replay protection, authentication)
e. Create tools to aid building formally verified canisters on the IC for critical applications (e.g., DeFI, auctions, SNS, etc)
f. Verify execution, message routing, state manager, consensus as close to code level as possible, Develop abstract model of the full replica
g. Abstractly model sharding and interaction between subnets
h. Formal and informal linking of models to the actual IC implementation, in particular critical parts of the NNS
i. Maintain link between models and evolving implementation

2. Discussion lead

Jens Groth

3. How this R&D proposal is different from previous types

Previous motion proposals have revolved around specific features and tended to have clear, finite goals that are delivered and completed. They tended to be measured in days, weeks, or months.

These motion proposals are different and are defining the long-term plan that the foundation will use, e.g., for hiring and organizational build-out. They have the following traits and patterns:

Their scope is years, not weeks or months as in previous NNS motions
They have a broad direction but are active areas of R&D so they do not have an obvious line of execution.
They involve deep research in cryptography, networking, distributed systems, language, virtual machines, operating systems.
They are meant to match the strengths of where the DFINITY foundation’s expertise is best suited.
Work on these proposals will not start immediately.
There will be many follow-up discussions and proposals on each topic when work is underway and smaller milestones and tasks get defined.

An example may be the R&D for “Scalability” where there will be a team investigating and improving the scalability of the IC at various stages. Different bottlenecks will surface and different goals will be met.

3. How this R&D proposal is similar to what we have seen

We want to double down on the behaviors we think have worked well. These include:

Publicly identifying owners of subject areas to engage and discuss their thinking with the community
Providing periodic updates to the community as things evolve, milestones reached, proposals are needed, etc…
Presenting more and more R&D thinking early and openly.

This has worked well for the last 6 months so we want to repeat this pattern.

4. Next Steps

Developer forum intro posted
1-pager from the discussion lead posted
NNS Motion proposal submitted

5. What we are asking the community

Ask questions
Read 1-pager
Give feedback
Vote on the motion proposal

Frankly, we do not expect many nitty-gritty details because these are meant to address projects that go on for long time horizons.

The DFINITY foundation’s only goal is to improve the adoption of the IC so we want to sanity-check the projects we see necessary for growing the IC by having you (the ICP community) tell us what you all think of these active R&D threads we have.

6. What this means for the existing Roadmap or Projects

In terms of the current roadmap and proposals executed, those are still being worked on and have priority.

An intellectually honest way to look at this long-term R&D project is to see them as the upstream or “primordial soup” from which more baked projects emerge from. With this lens, these proposals are akin to asking, “what kind of specialties or strengths do we want to make sure DFINITY foundation has built up?”

Most (if not all) projects that the DFINITY foundation has executed or is executing are borne from long-running R&D threads. Even when community feedback tells the foundation, “we need X” or “Y does not work”, it is typically the team with the most relevant R&D area that picks up the short-term feature or project.

diegop · December 7, 2021, 4:46am

Please note:

Some folks gave asked if they should vote to “reject” any of the Long Term R&D projects as a way to signal prioritization. The answer is simple: “No, please, ACCEPT”

These long-term R&D projects are the DFINITY’s foundation’s thesis at R&D threads it should have across years (3 years is the number we sometimes use internally). We are asking the community to ACCEPT (pending 1-pager and more community feedback of course). Prioritization can come at a separate step.

oggy · December 8, 2021, 10:38am

Objective

Use formal verification techniques to increase the confidence that the Internet Computer is secure and behaves as expected.

Discussion Leads

@lara , @oggy

Summary

The Internet Computer (IC) provides a secure environment for the canister smart contracts executing on it. But like any software stack, the IC can contain bugs, even with best practices such as testing and code reviews. Bugs can directly compromise the canisters’ security, making the canisters unavailable, or losing or corrupting the canisters’ data, e.g., changing the balances in the ledger canister. The open, permissionless nature of the IC means that such bugs can be triggered not only spontaneously, but also exploited by malicious actors. While disaster recovery procedures help fix such problems, they take time and may be insufficient in many cases, for example, if an attacker manages to steal ICP utility tokens or create them out of thin air. Formal verification dramatically reduces the risk of bugs in a system, including security vulnerabilities.

Background

Formal verification is a general technique to improve the quality of software systems. It does this by specifying the desired system properties, and proving that a model of the system satisfies these properties. The specification and the satisfaction proof are done using rigorous mathematical techniques that are mechanized in software (so-called formal methods). The mathematical rigor ensures that the specification is unambiguous, and that the proof has no gaps.

Note that the risk of a bug or a vulnerability is not completely eliminated even with formal verification. For example, one could forget to include some desired properties in the specification. Also, the model could diverge from the actual system: for example, a model using (unbounded) natural numbers may miss arithmetic overflow problems that appear in a real system that uses 32-bit integers. However, by carefully defining the system model and the desired properties, the risk of undetected bugs is reduced considerably.

Within DFINITY, we have already successfully applied formal methods in small projects, finding bugs in critical parts of the stack that were missed in manual reviews and tests.

Open Research Questions

Ideally, we would like to formally verify the entire IC stack. But formal verification is a slow and complex process, requiring expert input. Thus, we propose to start our verification effort by targeting some of the most critical components and mechanisms of the IC, such as:

The canisters comprising the Network Nervous System (NNS) and the upcoming Service Nervous Systems (SNS)
The IC’s subnet-based sharding mechanism implementation of the IC’s interface specification
The split of replicas into layers (peer-to-peer, consensus, message routing, and execution), including the interface contracts of the layers
The IC’s consensus algorithm and its implementation
The message routing layer, including the state manager

As part of our research, we will weigh the probability and impact of bugs against the verification effort for each target, and prioritize and limit our efforts accordingly. This ratio will also depend on the verification technique applied, where different techniques might be more suitable for the different targets. The different techniques offer different trade-offs (in particular, the proximity of the system model to the source code versus verification effort), and may include:

Static analysis, techniques for automatically verifying a fixed set of properties of source code
Model checking, an automated technique to verify an abstract, simplified model of a target
Automated test generation, a technique to derive tests for an implementation. This could be guided by an abstract model of the target (model-based testing), or by partial symbolic execution of the target code (concolic testing)
Automated deductive verification, techniques for proving properties of the target’s code
Interactive theorem proving, a general technique for mechanizing proofs of mathematical theorems

We will evaluate the state-of-the art techniques and their applicability to our targets. Applying them may require us to develop novel verification methodologies.

Skills and Expertise

Formal methods expertise in different areas will be needed. The exact areas will depend on the verification tools and techniques we choose to apply, but also the verification targets we choose to pursue (e.g., we may require expertise in code verification, verification of distributed systems, or verification of security protocols). Within DFINITY, the work will largely be done by the members of the Formal Security team.

Community Involvement

The research questions can be answered in collaboration with experts in academia and elsewhere. Any tooling developed for canister analysis could also be useful by other developers on the Internet Computer, to help analyze their own canisters. For example, a possible outcome of a verification project could be an annotation language to specify properties of canister methods, together with command-line tools for analysis, or even integration with an IDE. The development of such tools would present an opportunity for the community to get involved, helping prioritize the tools’ features and providing user feedback.

What we are asking the community

Review comments, ask questions, give feedback
Vote accept or reject on NNS Motion

oggy · December 8, 2021, 10:49am

Hi everyone, I’m Ognjen, and I’m a researcher in the Formal Security team at DFINITY. @lara and myself are happy to take any questions you might have on the project or the general topic of formal verification. We have both done research in this area, and there are a few more people at DFINITY with similar backgrounds and interests.

We have already done a few smaller internal projects using formal methods with good results, and we would love to do more ambitious ones! Hope we can get you onboard with that

oggy · December 10, 2021, 10:47am

@jwiegley has kindly offered to join in as a discussion lead for this topic. He’s one of the people with “similar backgrounds and interests” from my previous post, and definitely also one of those who’d love to be able to do more ambitious projects in the scope of this proposal!

jwiegley · December 11, 2021, 2:49am

Thank you, Oggy. Indeed, Oggy and I first met at a symposium on High Assurance Software Engineering, where we talked about proofs and system modeling. That was now several years ago! and before I joined DFINITY. For the Internet Computer, I’d especially like to apply those methods in many different respects, both internally to the protocol itself and its implementation in Rust, and externally in the form of verified smart contracts and helping users to build trustworthy canisters. Very interested to hear what people would find most valuable, or where the community thinks we should first put our efforts.

jzxchiang · December 12, 2021, 6:47am

Exciting effort. I’m wondering if at some point it’d be possible for canister developers (in either Rust or Motoko) to apply the same techniques you are using to verify parts of the IC to verify their canister. Especially before blackhole-ing something like a DEX canister. Or maybe canister developers could just leverage existing wasm formal verification tools? Not sure, this is pretty new to me.

oggy · December 20, 2021, 3:24pm

Hey @jzxchiang - sorry for the late response, I was on vacation. The answer depends on what we pick to verify. If it’s the NNS canisters, then any tooling we create could indeed be used by anyone else to verify the canisters they create (e.g., a DEX canister). They could also leverage existing, generic Wasm tools instead, but these wouldn’t take into account the IC’s system API, so they would likely be more limited in what you could verify about such a canister.

On the other hand, if we focus on, say, verifying the implementation of the consensus algorithm, then this wouldn’t be directly useful for canister developers to verify their canisters (it would just give them more confidence that the IC is operating correctly). Perhaps there would be some indirect benefits, if we developed (or improve existing) Rust verification tools.

diegop · December 20, 2021, 7:37pm

Proposal is live: Internet Computer Network Status

Topic		Replies	Views
Motion Proposals on Long Term R&D Plans Roadmap	17	8122	June 10, 2022
Long Term R&D: Security Proofs (proposal) Roadmap	4	1213	December 20, 2021
Long Term R&D: Secure OS (proposal) Roadmap	4	1905	December 20, 2021
Long Term R&D: Decentralized CA and DNS (proposal) Roadmap	8	2439	June 3, 2024
Long term R&D: TEE enhanced IC (proposal) Roadmap	24	3725	January 13, 2025