Below is a draft of a proposal we (the DFINITY R&D team) intend to submit to the NNS in a few days or weeks, depending on community feedback and the questions that come up. We want to hear what you all think, including any wording changes we need to make.
Summary
TLDR: As part of the work to further decentralize the infrastructure layer, we want to submit a new motion proposal to introduce a new type of hardware spec (and its corresponding remuneration) for nodes on the IC. We would like to get community feedback on this before proposing.
1. State of the world
Nodes are remunerated based on their location and node type. The Node Rewards table shows the rates per location and node type.
If you vote ACCEPT, you are agreeing on two things:
The IC should introduce a new node type. The new type has requirements that are independent of vendors (except for the CPU).
You can see the details for the new proposed type here:
Dual Socket AMD EPYC 7313 Milan 16C/32T 3 GHz, 32K/512K/128M
optionally 7343, 7373, 73F3
16x 32GB RDIMM, 3200MT/s, Dual Rank
5 x 6.4TB NVMe Mixed Mode
Dual Port 10G SFP or BASE-T
TPM 2.0
The DFINITY Foundation will determine the expected cost of the new node type based on data from several independent vendors and propose reward rates based on an expected node lifetime of 4 years. DFINITY R&D will research various ways to construct the vendor-generic node type and propose rates for this node type.
This is a governance proposal, so if this vote passes, there will be subsequent NNS proposals to introduce the new node type and reward rates to the NNS.
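As a rough illustration of what a vendor-independent requirement list could look like in practice, here is a minimal sketch that checks a machine description against the figures listed above. The `NodeHardware` class, its field names, and the choice to treat memory and storage as minimum totals are assumptions made for this example; they are not part of the proposal or of any IC tooling.

```python
from dataclasses import dataclass

# CPU models named in the proposal (7313, optionally 7343, 7373, 73F3).
ALLOWED_CPU_MODELS = {"7313", "7343", "7373", "73F3"}


@dataclass
class NodeHardware:
    """Hypothetical description of a candidate machine (field names are illustrative)."""
    cpu_model: str
    cpu_sockets: int
    dimm_count: int
    dimm_size_gb: int
    nvme_count: int
    nvme_size_tb: float
    nic_ports_10g: int
    has_tpm2: bool


def spec_violations(hw: NodeHardware) -> list[str]:
    """Return the ways in which `hw` falls short of the listed spec (empty if none)."""
    problems = []
    if hw.cpu_model not in ALLOWED_CPU_MODELS:
        problems.append(f"CPU model {hw.cpu_model} is not in the proposed list")
    if hw.cpu_sockets != 2:
        problems.append("expected a dual-socket system")
    # Assumption: memory and storage are treated as minimum totals.
    if hw.dimm_count * hw.dimm_size_gb < 16 * 32:
        problems.append("less than 16 x 32 GB of memory")
    if hw.nvme_count * hw.nvme_size_tb < 5 * 6.4:
        problems.append("less than 5 x 6.4 TB of NVMe storage")
    if hw.nic_ports_10g < 2:
        problems.append("expected a dual-port 10G network interface")
    if not hw.has_tpm2:
        problems.append("TPM 2.0 is missing")
    return problems
```

A machine that matches the list exactly yields an empty result; anything else yields one message per shortfall, which makes it easy to compare quotes from different vendors against the same checklist.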
3. Why we are proposing this
This new node type is being introduced for two reasons:
a. The current node specifications are vendor-specific, which is an unnecessary centralisation a year after launch. Vendor-specific specs also make adding future nodes more difficult, as it is harder to buy machines with older hardware specs.
b. The current node types do not support VM memory encryption and attestation which will be needed in future features.
Ideally, such proposals should come with options that are cost-versus-performance driven; otherwise there is no particular debate or discussion to be had.
The real work of figuring out what the cost of this hardware is (including hosting) vs what the potential reward would be for the different options SHOULD BE CLARIFIED UP FRONT.
Totally agree. That’s why I wanted to explain the additional cost/rewards to performance ratio in the upcoming updates on this thread and the subsequent motion proposal. The debate internally was: “How baked does this effort need to be for the community to be looped in?”
In this case, we opted for “Let’s let folks know our intent. Post updates as we learn more. Iterate based on feedback. Don’t get too far down the project without feedback.”
I am not exactly sure why there is not more traction on this thread from other potential node-providers. This will impact their bottom lines.
But I agree with you, @luis. I think some amount of background work is necessary for something of this magnitude. Otherwise we will come back in a year scratching our heads as to why only one option (which is really not an option within options) was covered.
For example, why limit ourselves to Milan? Why not Milan-x? Most of our current workloads have likely low L3 cache miss rates and high L3 cache coherency misses. But is this going to change in the future? I believe so because we should be gravitating towards exploiting the data currently in possession instead of using it just once. i.e. would a 7573X be an option?
For example, why limit ourselves to Milan? Why not Milan-x?
The Milan 7373 that we allow optionally is an X model. The reason we chose these models is simple: we want the node type 3 spec to be as close as possible to type 1 so that both node types can be used in the same subnet for a while. If this weren’t possible, we would need to onboard many new NPs before the first subnet with sufficient decentralisation could be created. That could mean these nodes would incur costs before they can provide value to the network.
would a 7573X be an option?
The 7573X would cost more than the 7373X that we would like to allow. Using an X series CPU would currently increase the node cost by about 30-50%. The reason we chose to stay with the 7373 is that it has the same core configuration as all the other 73xx models.
It’s already a lot of work and costs to build the tests and environments to get the confidence that the system is able to handle this additional diversity in the node hardware. That’s why we decided to only change as much as necessary to ensure that there is a bigger choice of vendors (decentralisation) and thereby a safer supply chain.
I am not exactly sure why there is not more traction on this thread from other potential node-providers. This will impact their bottom lines.
One reason could be that we (DFINITY) haven’t yet talked publicly about a network growth strategy in general. We are preparing community talks about all topics connected to the growth of the network, such as NNS-driven remuneration finding, node operation DAOs, and other platform decentralisation topics. As some of you already know, we are still working on the platform decentralisation roadmap from the end of last year.
DFINITY is working towards the new hardware specification. To ensure these machines work at least as well as the current IC nodes before asking the community to vote on the HW specification, DFINITY wants to run a battery of micro and macro benchmarks with synthetic and real-world workloads.
More precisely, in addition to running single-machine benchmarks, we want to evaluate these machines under as realistic circumstances as possible. Therefore, we propose to add two such machines to mainnet subnets. We will start with two low-traffic subnets, namely lhg73 and shefu, where we’ll add one such node respectively.
Once they reside on these subnets, we want to run benchmarking experiments against them to assess the behavior under workloads exercising canister creation, query and update call processing with more and less memory-intensive workloads.
We will compare the metrics of the new hardware and observe if we can spot any anomalies compared to the first generation HW.
When these experiments have been successful, we’ll add the nodes to the following high-traffic subnets with at least one DFINITY node: eq6en, mpubz.
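The comparison step in this plan could be sketched roughly as follows: compute the relative discrepancy of each metric on the new hardware against the first-generation baseline and flag anything beyond a tolerance. The metric names and the 5% default tolerance are placeholders chosen for this example, not values used by the team.

```python
def relative_discrepancy(baseline: float, candidate: float) -> float:
    """Signed difference of candidate vs. baseline, as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline


def flag_anomalies(baseline: dict[str, float],
                   candidate: dict[str, float],
                   tolerance: float = 0.05) -> dict[str, float]:
    """Return metrics whose absolute relative discrepancy exceeds `tolerance`."""
    flagged = {}
    for name, base_value in baseline.items():
        if name not in candidate:
            continue  # metric not collected on the new hardware; nothing to compare
        diff = relative_discrepancy(base_value, candidate[name])
        if abs(diff) > tolerance:
            flagged[name] = diff
    return flagged


# Hypothetical numbers, purely for illustration.
gen1 = {"update_latency_ms": 100.0, "checkpoint_s": 10.0}
gen2 = {"update_latency_ms": 102.0, "checkpoint_s": 11.0}
print(flag_anomalies(gen1, gen2))
```

Keeping the sign of the discrepancy makes it possible to tell a regression from an improvement once you know whether lower or higher is better for a given metric.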
I need to verify, but my understanding is that aspiring NPs can currently use the wiki instructions but the hardware spec may block certain people that cannot get access to the machines. An aspiring node provider can follow the instructions to onboard themselves.
The work to make this much more user-friendly (e.g. by using the NNS Frontend dapp) is likely coming after SNS and hardware spec.
@ritvick @Sormarler I am not surprised I confused you all, since it turns out I was wrong and the team corrected me when I checked in with them. The new process and the wiki instructions are both still under development (the instructions could work, but have some rough areas the team is still working on), so they are not the experience the IC should have.
That being said, the hardware work (which is the main theme of this thread) is, I believe, a necessary condition, so @garym will post an update on hardware tests.
Apologies for the confusion I caused. This is not my area. I should have verified earlier.
We have executed the validation plan described previously on two ASUS machines. These meet the generic Gen 2 hardware specifications as specified earlier in this thread. Some abbreviated specs:
Individual node checkpointing performance discrepancy of <3-6%.
This has no impact on subnet performance, but we’re still keeping an eye on it.
Passed
We have updated the node provider hardware wiki to include this ASUS server configuration. An example ASUS quote and bill of materials (BOM) is available for interested community members.
We plan on continuing validation of new Gen 2 hardware configurations and publishing the results. Many factors influence how we proceed, e.g., community input on hardware configurations/manufacturers, price, availability.
What We Are Asking The Community
Please comment on and prioritize next hardware choices (abbreviated specs):
5x HPE 6.4TB NVMe Gen4 Mainstream Performance Mixed Use SFF BC U.3 Static Multi Vendor SSD
Swiss price: 27’031.83 CHF
Lenovo
2 x ThinkSystem AMD EPYC 7343 16C 190W 3.2GHz Processor
16 x ThinkSystem 32GB TruDDR4 3200MHz (2Rx4 1.2V) RDIMM-A
5 x ThinkSystem U.3 Kioxia CM6-V 6.4TB Mainstream NVMe PCIe4.0 x4 Hot Swap SSD
Swiss price: 30’534.27 CHF
USA price: $28,525.54 USD
Note: Prices are provided as rough examples and don’t include tax. USA prices are provided for comparison; the hardware will be validated in Switzerland. Example quotes and BOMs for these hardware configurations are available on request.
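To put these purchase prices in perspective against the 4-year expected node lifetime mentioned earlier in the thread, here is a minimal sketch of a straight-line amortisation. It covers the hardware purchase price only (no tax, hosting, power, or bandwidth) and is not the reward formula, which has not been specified in this thread; the function name is made up for this example.

```python
LIFETIME_MONTHS = 4 * 12  # expected node lifetime of 4 years, as stated in the proposal


def monthly_hardware_cost(purchase_price: float,
                          lifetime_months: int = LIFETIME_MONTHS) -> float:
    """Straight-line amortisation of the purchase price over the node lifetime."""
    if lifetime_months <= 0:
        raise ValueError("lifetime_months must be positive")
    return purchase_price / lifetime_months


# Swiss list prices quoted above, in CHF, excluding tax.
quotes_chf = {
    "configuration with HPE disks": 27031.83,
    "Lenovo": 30534.27,
}

for name, price in quotes_chf.items():
    print(f"{name}: {monthly_hardware_cost(price):.2f} CHF per month over 4 years")
```

Any reward rate would have to cover at least this hardware share plus the operating costs that this sketch deliberately leaves out.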
Having the same hardware as node providers has an additional benefit: if node providers face problems, the DFINITY engineering team can reproduce and debug them independently in an identical environment. This must be possible without access to node-provider-owned machines.
Are variations like using Kioxia disks in Dell servers acceptable?
Yes. Some vendors may not provide components like the configurations above.
That said, performance characteristics of alternatives should be equivalent.