DRAFT Motion Proposal: New Hardware specification and remuneration for IC nodes

DRAFT PROPOSAL

Below is a draft of a proposal we (the DFINITY R&D team) intend to submit to the NNS in a few days or weeks, depending on community feedback and questions. We want to hear what you all think, including any wording changes we need to make.


Summary

TLDR: As part of the work to further decentralize the infrastructure layer, we want to submit a new motion proposal to introduce a new type of hardware spec (and its corresponding remuneration) for nodes on the IC. We would like to get community feedback on this before proposing.

1. State of the world

Nodes are remunerated based on their location and node type. The Node Rewards table shows the rates per location and node type.

The current types are listed here:

https://wiki.internetcomputer.org/wiki/Node_provider_hardware

2. What we are proposing

If you vote ACCEPT, you are agreeing on two things:

  1. The IC should introduce a new node type. The new type has requirements independent of vendors (except for the CPU).

You can see the details for the new proposed type here:

  • Dual Socket AMD EPYC 7313 Milan 16C/32T 3 GHz, 32K/512K/128M
    • optionally 7343, 7373, 73F3
  • 16x 32GB RDIMM, 3200MT/s, Dual Rank
  • 5 x 6.4TB NVMe Mixed Mode
  • Dual Port 10G SFP or BASE-T
  • TPM 2.0
  2. The DFINITY Foundation will determine the expected cost of the new node type based on data from several independent vendors and propose reward rates based on an expected node lifetime of 4 years. DFINITY R&D will research various ways to construct the vendor-generic node type and propose rates for this node type.
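
To make the second point more concrete, here is a rough sketch of how a one-off hardware cost could be spread over the expected 4-year lifetime to get a monthly break-even figure. Only the 4-year lifetime comes from the proposal. The cost figures, the separate hosting term, and the names `NodeCostEstimate` and `monthly_break_even` are hypothetical placeholders for illustration, and the actual reward rates may well be derived differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCostEstimate:
    """Hypothetical cost inputs for one node (not real vendor data)."""
    hardware_cost: float       # one-off purchase price, any currency (assumed)
    monthly_hosting: float     # recurring hosting cost per month (assumed)
    lifetime_years: int = 4    # expected node lifetime stated in the proposal


def monthly_break_even(est: NodeCostEstimate) -> float:
    """Monthly amount that covers hardware amortization plus hosting.

    Straight-line amortization: the hardware cost is divided evenly over
    the lifetime in months, then the monthly hosting cost is added.
    """
    months = est.lifetime_years * 12
    return est.hardware_cost / months + est.monthly_hosting


# Placeholder numbers only: 24,000 hardware cost, 300 per month hosting.
example = NodeCostEstimate(hardware_cost=24_000.0, monthly_hosting=300.0)
print(monthly_break_even(example))  # 24000 / 48 + 300 = 800.0
```

A real rate would also need to account for things this sketch leaves out, such as location differences and any margin for the node provider.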

This is a governance proposal, so if this vote passes, there will be subsequent NNS proposals to introduce the new node type and reward rates to the NNS.

3. Why we are proposing this

This new node type is being introduced for two reasons:

a. The current node specifications are vendor-specific, which is an unnecessary centralisation a year after launch. Vendor-specific specs also make adding future nodes more difficult, as it is harder to buy machines with older hardware specs.

b. The current node types do not support VM memory encryption and attestation, which will be needed for future features.

4. What we are asking the community

  • Read the proposal
  • Ask questions
  • Give feedback

@Luis (who runs the node provider efforts at DFINITY) will take any questions!


Ideally, such proposals should come with options that are driven by cost versus performance; otherwise there is no particular debate or discussion to be had.

The real work of figuring out what the cost of this hardware is (including hosting) vs what the potential reward would be for the different options SHOULD BE CLARIFIED UP FRONT.

Otherwise what, exactly, are we discussing?


Totally agree. That’s why I wanted to explain the additional cost/rewards to performance ratio in the upcoming updates on this thread and in the subsequent motion proposal. The debate internally was: “How baked does this effort need to be for the community to be looped in?”

In this case, we opted for “Let’s let folks know our intent. Post updates as we learn more. Iterate based on feedback. Don’t get too far down the project without feedback.”


I am not exactly sure why there is not more traction on this thread from other potential node-providers. This will impact their bottom lines.

But I agree with you, @luis. I think some amount of background work is necessary for something of this magnitude. Otherwise we will come back in a year scratching our heads as to why only one option (which is really not an option within options) was covered.

For example, why limit ourselves to Milan? Why not Milan-x? Most of our current workloads likely have low L3 cache miss rates and high L3 cache coherency misses. But is this going to change in the future? I believe so, because we should be gravitating towards exploiting the data currently in our possession instead of using it just once. I.e., would a 7573X be an option?


For example, why limit ourselves to Milan? Why not Milan-x?

The Milan 7373 that we allow optionally is an X model. The reason we chose these models is simple: we want the node type 3 spec to be as close as possible to type 1, so that both node types can be used in the same subnet for a while. If this weren’t possible, we would need to onboard many new NPs before the first subnet with sufficient decentralisation could be created. That could mean these nodes would cause costs before they can provide value to the network.

would a 7573X be an option?

The 7573X would cost more than the 7373X that we would like to allow. Using an X-series CPU would currently increase the node cost by about 30-50%. The reason we chose to stay with the 7373 is that it has the same core configuration as all the other 73xx models.
It already takes a lot of work and cost to build the tests and environments needed to be confident that the system can handle this additional diversity in node hardware. That’s why we decided to change only as much as necessary to ensure a bigger choice of vendors (decentralisation) and thereby a safer supply chain.

I am not exactly sure why there is not more traction on this thread from other potential node-providers. This will impact their bottom lines.

A reason could be that we (DFINITY) haven’t yet talked publicly about a network growth strategy in general. We are preparing community talks about all the topics connected to the growth of the network, such as NNS-driven remuneration finding, node operation DAOs and other things related to platform decentralisation. As some of you already know, we are still working on the platform decentralisation roadmap from the end of last year.
