Long term R&D: TEE enhanced IC (proposal)

diegop · December 7, 2021, 1:46am

1. Summary

This is a motion proposal for the long-term R&D of the DFINITY Foundation, as part of the follow-up to this post: Motion Proposals on Long Term R&D Plans (Please read this post for context).

This project focuses on the integration of a hardware-aided Trusted Execution Environment (TEE) into the IC nodes. It will provide confidentiality and offer additional integrity protection for all hosted code and data. In particular, the integration of trusted execution will protect against attacks from rogue data centers operators and intrusions at the level of the host operating system.

2. Discussion lead

Rüdiger Kapitza

3. How this R&D proposal is different from previous types

Previous motion proposals have revolved around specific features and tended to have clear, finite goals that are delivered and completed. They tended to be measured in days, weeks, or months.

These motion proposals are different and are defining the long-term plan that the foundation will use, e.g., for hiring and organizational build-out. They have the following traits and patterns:

Their scope is years, not weeks or months as in previous NNS motions
They have a broad direction but are active areas of R&D so they do not have an obvious line of execution.
They involve deep research in cryptography, networking, distributed systems, language, virtual machines, operating systems.
They are meant to match the strengths of where the DFINITY foundation’s expertise is best suited.
Work on these proposals will not start immediately.
There will be many follow-up discussions and proposals on each topic when work is underway and smaller milestones and tasks get defined.

An example may be the R&D for “Scalability” where there will be a team investigating and improving the scalability of the IC at various stages. Different bottlenecks will surface and different goals will be met.

3. How this R&D proposal is similar to what we have seen

We want to double down on the behaviors we think have worked well. These include:

Publicly identifying owners of subject areas to engage and discuss their thinking with the community
Providing periodic updates to the community as things evolve, milestones reached, proposals are needed, etc…
Presenting more and more R&D thinking early and openly.

This has worked well for the last 6 months so we want to repeat this pattern.

4. Next Steps

Developer forum intro posted
1-pager from the discussion lead posted
NNS Motion proposal submitted

5. What we are asking the community

Ask questions
Read 1-pager
Give feedback
Vote on the motion proposal

Frankly, we do not expect many nitty-gritty details because these are meant to address projects that go on for long time horizons.

The DFINITY foundation’s only goal is to improve the adoption of the IC so we want to sanity-check the projects we see necessary for growing the IC by having you (the ICP community) tell us what you all think of these active R&D threads we have.

6. What this means for the existing Roadmap or Projects

In terms of the current roadmap and proposals executed, those are still being worked on and have priority.

An intellectually honest way to look at this long-term R&D project is to see them as the upstream or “primordial soup” from which more baked projects emerge from. With this lens, these proposals are akin to asking, “what kind of specialties or strengths do we want to make sure DFINITY foundation has built up?”

Most (if not all) projects that the DFINITY foundation has executed or is executing are borne from long-running R&D threads. Even when community feedback tells the foundation, “we need X” or “Y does not work”, it is typically the team with the most relevant R&D area that picks up the short-term feature or project.

diegop · December 7, 2021, 4:45am

Please note:

Some folks gave asked if they should vote to “reject” any of the Long Term R&D projects as a way to signal prioritization. The answer is simple: “No, please, ACCEPT”

These long-term R&D projects are the DFINITY’s foundation’s thesis at R&D threads it should have across years (3 years is the number we sometimes use internally). We are asking the community to ACCEPT (pending 1-pager and more community feedback of course). Prioritization can come at a separate step.

rrkapitz · December 8, 2021, 2:32pm

Hi community, I have the pleasure to foster the discussion regarding the utilisation of trusted execution to further strengthen the security of the Internet Computer. As a systems researcher, I have a deep interest in secure and resilient distributed systems. My core research topics in the past years have been Byzantine fault tolerance and various aspects of trusted execution. I’m looking forward to your views, concerns and questions regarding the proposal.

rrkapitz · December 8, 2021, 5:15pm

Trusted execution enhanced IC (Motion proposal)

The Internet Computer (IC) is a general infrastructure for a wide variety of applications processing all kinds of security-sensitive data ranging from user-centric private information to financial digital assets. While the IC as of today already provides a high level of fault-tolerance and security, all available technical means to improve its security and resilience should be explored. With the wide-spread advent of hardware-based trusted execution, the current protection of the IC against attacks such as rogue data centers providers and intrusions at the host level can be further strengthened and fortified.

Objective

Devise a trusted execution harnessed IC to further increase its security in terms of integrity and privacy against privileged local attackers. Provide additional means for users to validate the integrity of the IC. Together, this will substantially increase the security of the IC and all its hosted assets and data.

Background

In a nutshell, recent hardware-aided trusted execution support ensures that code and data outside the CPU is handled in an encrypted form and only decrypted, while being processed by a secured execution context inside the CPU. In addition, remote attestation makes it possible to validate that the secured execution context has been initialized on trustworthy hardware in a certain state, thereby comprising only the requested code and data.

The IC is deployed and managed via tailored virtual machines that serve its code and data. Accordingly, trusted execution should be applied at the level of whole virtual machine instances as this puts all necessary code and data under its protection and is a natural fit. This scope also limits the integration overhead and paves the way for interoperability between different trusted execution technologies.

Why this is important

Securing the decentralized infrastructure of the IC is a key concern of the DFINITY Foundation. Along these lines the admission process of new node providers will be more and more relaxed with the aim to enable rapid growth and empower the community to participate in all matters of the IC. However, attached with a lightweight process to become a node provider, there is also the risk of abuse and the integration of improperly secured environments. In order to address these concerns, trusted execution will be utilized to additionally protect the virtual machines of the IC against unauthorized access from local privileged attackers. As a consequence the integration of trusted execution can be considered as an enabler for the further growth and decentralisation of the IC.

Proposal

To make the IC and its software ready for trusted execution, current technology has to be evaluated. As a prime candidate, AMD SEV-SNP as an upcoming technology that provides mechanisms to secure virtual machines and flexible remote attestation is considered. Furthermore, technologies such as Intel’s TDX will be explored.

Independent of the utilized technology, the current build and deployment process of the IC virtual machines has to be tailored for trusted execution. In particular, a deterministic build process and virtual machine instantiation enabling remote attestation has to be designed and implemented. While the IC virtual machines are already secured against various attacks, additional security measures have to be implemented to protect against local privileged attackers. Basing on this extended support, the IC will make use of remote attestation to enable the vetted integration of new virtual machine instances. The IC protocol will have to be extended to enable remote attestation performed from multiple locations to account for malicious nodes, thereby matching the failure model of the IC. The protocol extensions will undergo a security review and an approach for the migration has to be devised.

In the beginning, the plan is to focus on a certain trusted execution technology for the initial deployment, next the focus will be widened and at least one further technology will be investigated. The core motivation is not to depend on a single hardware provider and to devise the trusted execution support for the IC to be as vendor-neutral as possible.

While the work outlined so far primarily focuses on the internal security of the IC, it must be possible for the end-users to validate these enhancements in such a way that a user will perform remote attestation implicitly by accessing the IC. As a consequence, hardware-aided end-to-end security between the user and the IC virtual machines can be established, which requires the design, development, and testing of additional software executed on the user-side.

Key milestones

This motion proposal targets a long-term perspective and as such is planned for three years. Parts of it will be detailed and refined as the work progresses. At this stage six milestones are planned:

M1: The build and deployment process of the IC is ready for applying trusted execution and performing remote attestation
M2: Trusted execution has been deployed and secures the VMs of selected subnets featuring the next generation of the IC hardware
M3: Remote attestation has been integrated into the IC protocol
M4: Client-side validation via remote attestation of the IC nodes is enabled
M5: Widespread integration of trusted execution to a majority of the IC nodes
M6: Vendor-neutral support for trusted execution is enabled and demonstrated

Discussion leads

As members of the DFINITY research team Rüdiger Kapitza and Helge Bahmann will drive the proposal and are available for discussion.

Why the DFINITY Foundation should make this a long-running R&D project

While trusted execution is marked-available for a few years now and cloud-vendors have launched initial commercial offerings it is still a cutting-edge technology that rapidly evolves and new hardware extensions as well as updates to existing ones will surface. Integrating such a promising but also disruptive technology requires carefully adapting and evolving central parts of the IC over the next few years. This gets especially visible with the aim to integrate trusted execution despite the strong hardware dependencies as vendor-neutral as possible.

Skills and Expertise necessary to accomplish this

The implementation of this motion proposal will require a diverse set of expertise as applying trusted execution to the IC is a cross-cutting endeavour. At the lowest level it demands a solid understanding of the trusted execution hardware including being able to access and possibly mitigate aspects of side-channel attacks. At the system level the integration of the hardware into the operating system and the virtualization layer has to be performed and existing hardware-vendor support extended for the demands of the IC. Based on the devised system layer the offered functionality and services of the trusted execution hardware have to be integrated into the protocol of the IC.

To conclude the required expertise spans topics related to hardware security, operating systems, systems security and the engineering of distributed protocols.

In line with the above stated expertise and skills this will be a strong multi-team. In particular it is expected that initially the node team that is responsible for the operating system of the IC will lead the proposal. However, as the work progresses other teams such as the consensus and networking as well as the SDK team will play an increasingly important role. Finally, as the results of the proposal will need to be tested and deployed additional teams will be involved.

Open Research questions

Strengthening the security of IC by the crosscutting-integration of trusted execution poses a unique research challenge. While applying trusted execution to distributed applications has recently been a subject of various research efforts, there is to our knowledge no operative decentralised system that is near to the size of the IC which already employs trusted execution. In particular, this notion proposal will answer the following research questions:

How to integrate trusted execution in a scalable and decentralized infrastructure such as the IC? Any centralised control should be avoided.
How to design and implement a vendor-neutral system support for trusted execution? As the IC evolves the transparent use of different trusted execution technology should be enabled.
How to seamlessly integrate trusted execution into the IC protocol so distributed remote attestation that matches the fault model the IC becomes possible?
How to empower the users in validating the security of the IC via remote attestation? This validation step should be transparent while accessing the IC and not only directed to the accessed machine but span, for example, all nodes of a subnet.

Examples where community can integrate into project

Due to the wide scope of required expertise for this motion proposal it is expected that it will be carried out in tight interaction with the community. In particular it is planned to organise workshops as the motion proposal evolves to discuss hardware security, system integration and protocol integration. Especially, in scope of diversifying the hardware support the community could enable a rapid evolution. Furthermore, a critical assessment and discussion regarding the security implications of the utilized hardware is of strong interest.

What we are asking the community

Trusted execution is a novel and rapidly evolving technology that will require intense discussion and exchange to integrate it in a vendor-neutral and future-proof fashion. While hardware-based security is on the rise it also comes attached with concerns and vulnerabilities will surface. All along the implementation of this motion proposal a constant exchange with the community is essential to make it a success. Accordingly, we welcome comments, questions and feedback and hope for a positive vote on the proposal.

cryptoschindler · December 9, 2021, 9:49am

hey @rrkapitz ,

could this technology also help in this regard?

eg make it impossible for externals to know which node runs which canister so they cant be hold responsible?

many thanks

rrkapitz · December 9, 2021, 2:40pm

Hi @cryptoschindler, the proposed use of trusted execution makes the VM an opaque execution context. Of course, there are more things to consider first of all established network connections but also block access patters of encrypted disk volumes, other forms of side-channels related to memory access (main memory and cache) or even energy usage patterns. To address your question trusted execution helps, but only a bit, as the network traffic will exposed the relevant information.

lastmjs · December 11, 2021, 2:23pm

I’m wondering if a combination of TEEs in the boundary nodes and the replica nodes could accomplish this. If the request body, which includes the canister id, is only ever decrypted from within a TEE, then the boundary node could receive requests, decrypt and obtain IP address of replica node, then encrypt and forward messages to replica nodes, all without ever revealing the IP-address-to-canister-id relationship. Plausible?

rrkapitz · December 13, 2021, 10:39am

Hi @lastmjs, using trusted execution in the boundary nodes and the IC nodes helps to ensure the confidentiality of the exchanged data. Using a TEE is also on the agenda for the boundary nodes. However, if an external attacker has good visibility regarding traffic flows, she can determine where a canister is executed. E.g., have a client and ask the canister for service and analyse the resulting interactions. In case one truly likes to hide such things obfuscation at the routing level would be needed. (For example onion routing inspired approaches and systems like TOR come to mind.)

diegop · December 20, 2021, 7:28pm

proposal is live: Internet Computer Network Status

lastmjs · February 23, 2022, 6:06pm

It looks like AMD-SEV is completely broken: https://arxiv.org/pdf/2108.04575.pdf

diegop · February 23, 2022, 6:20pm

oooof… not good. thanks for sharing, jordan. i will pass along

rrkapitz · February 23, 2022, 7:09pm

Hi Jordan, we are aware this paper, it has also been published at a conference [1]. I exchanged with the main author and tried to understand the position of AMD regarding this issue. So the current situation seems to be that the secure processor can be tricked to load a modified code which invalidates attestation on the attacked machine but also the retrieved keys can be utilised to some extend on other machines. As it stands it is an attack at the hardware level that is difficult to patch. AMD has fixed other issues but it seems their position on this one is that it is outside of their attacker model. My assumption is with future hardware releases AMD will aim for a fix. So for now I see their TEE as an additional measure to improve the security but of course the gained protection is limited. Based on the development of the last years there will be more attacks and countermeasures. Since TEEs in their current form are rather new this might be expected. However, there is also in the cloud domain a strong demand for confidential computing and therefore even new TEEs surface. Thus, from the current point of view the situation is not optimal but as a long term effort it still is a perspective.

[1] https://doi.org/10.1145/3460120.3484779

diegop · February 23, 2022, 8:37pm

Thanks a bunch @rrkapitz

lastmjs · February 24, 2022, 12:22am

Are you optimistic that it is possible to create a TEE that doesn’t have relatively easy exploits?

rrkapitz · February 24, 2022, 9:39am

Well this is pure speculation from my side. However, we currently see two directions. One is to take a step back and assume we have a physically secured data center and the personal is trusted. In this case we can take attacks that require direct access to the hardware out of the equation. Essentially the TEE technology needs only to defend against privileged attackers, like a rogue administrator or a privileged SW (OS or hypervisor) outside the secured compartment. This is what conventional cloud provider seem to expect from confidential computing. So far a fair number of issues that break this setting have been addressed by the hardware manufactures and it is business relevant. Thus, the chances are good that it will get more and more complicated to mount successful and undetected attacks in this setting otherwise the whole confidential computing movement needs to be stopped.

Now we come to the second more challenging attacker model, where the attacker has physical access. This stetting is more complicated and hardware manufactures already tried to scope the possible attacks they can/would like to defend against – and there are less business relevant use cases. E.g., client side SGX has been put on pause (or retired?). Is it too complicated to defend or is there not enough interest?

The current attack that you referred to in the paper is tricky because it crosses the border. It is maybe outside that attacker model of confidential computing as the attacker needs physical access to the machine, but the results can be utilised (e.g. by a rogue administrator) to break assumptions of the confidential computing setting. So I believe AMD needs to fix this.

To answer your questions I’m currently optimistic that TEEs will provide reasonable protection under the flag of confidential computing. E.g. see the efforts here [1]. The questions regarding physical attacks is a bit open to me. The aim will be to defend also against these too, I would suspect, as the current issue demonstrates these attacks might interfere with the weaker attacker model of confidential computing. For the IC we are interested in a TEE that also guarantees protection against physical attacks but I see it as an additional measure and that we should not fully rely on it.

[1] https://www.amd.com/en/corporate/product-security/bulletin/amd-sb-1021

lastmjs · February 24, 2022, 1:47pm

Very informative thank you…though a bit depressing as protecting against physical attacks is so important to the security model of the IC.

How useful is MPC without TEEs? I suppose we still get BFT privacy in that case?

jzxchiang · February 25, 2022, 2:28am

What does BFT privacy entail? I thought BFT consensus could provide integrity but not confidentiality guarantees.

lastmjs · February 25, 2022, 2:41am

That’s where MPC comes in. With MPC you can get BFT privacy.

g302ge · March 22, 2022, 7:18pm

Hi guys, I am working for Phala now which is build a privacy computing on-chain with SGX. I am willing to involve this proposal to contribute IC. How can I join your current work and where to start work ?

rrkapitz · April 5, 2022, 8:28am

Hi, sorry for the very delayed reply. To my knowlege we are currently still in the beginning of community involvement regarding the core IC development. This should start at some point. Maybe @diegop can tell more. If you are still interested we can have an exchange. For the moment we focus on securing whole VMs, i.e., by using AMD SEV-SNP. In the past, I was also interested in securing smart contracts vis SGX ( Blockchain and Trusted Computing: Problems, Pitfalls, and a Solution for Hyperledger Fabric

Topic		Replies	Views
Motion Proposals on Long Term R&D Plans Roadmap	17	8122	June 10, 2022
Long Term R&D: Secure OS (proposal) Roadmap	4	1905	December 20, 2021
Long Term R&D: General Integration (Proposal) Roadmap	20	5313	May 16, 2023
Long Term R&D: Boundary Nodes (proposal) Roadmap	41	5267	May 24, 2022
Long Term R&D: PQ security (proposal) Roadmap	24	4033	December 11, 2024