On Sunday, July 9, a neuron was reported with an age of 52 years, which indicated there was a bug somewhere in the NNS that allowed an erroneous adjustment of Neuron Ages.
The issue is caused by a conflict between how
aging_since_timestamp_seconds field in some cases and the change applied in
NeuronDeltas that applies an additional change to the same field.
NeuronDelta value of
aging_since_timestamp_seconds is generated based on the current
aging_since_timestamp_seconds value of the
Neuron, which in the case of a dissolved or dissolving neuron is
However, when the target
Neuron in the
Merge command has a dissolve delay less than the source neuron, an increased dissolve delay is also applied.
When the target neuron’s dissolve delay is increased to match the source neuron, it is done so through
increase_dissolve_delay. If the neuron is already dissolved (i.e. a state of
t < now) this changes the neuron from
NotDissolving, and applies a change to
aging_since_timestamp_seconds, changing it from
u64::MAX to a value that would allow it to age. The other change, calculated to reduce the value from
u64::MAX to the combined age of the two neurons, is applied after that, which results in zeroing out
.saturating_sub) which results in a neuron with the long age that was observed, since it is now aging since the unix epoch. This happens because the delta of
aging_since_timestamp_seconds is calculated based on the original value of
increase_dissolve_delay does not change the value of
aging_since_timestamp_seconds when both neurons have
dissolve_state set to
t > 0 or when
t > now, however, which is why the ManageNeuron::Merge command normally succeeded as designed. So only in the case of a neuron that successfully dissolved would the erroneous transformation be applied.
While there were extensive existing tests in place covering the complex rules around merging neurons, there were none that covered this exact edge case, and this was missed by the team.
The mitigation is to limit the states of both source and target neurons operated on during the
ManageNeuron::Merge command to only be applicable to the code path where the neurons are in a NotDissolving state, which means that they both have a dissolve_state of
DissolveDelaySeconds(t) where t > 0`.
The team decided this was an acceptable interim state based on the following.
- In that state, both neurons have a normal behavior of
aging_since_timestamp_secondswhere the field means exactly what it says.
- Both neurons can have
increase_dissolve_delaycalled without also changing
aging_since_timestamp_secondsin that function.
- There are no significant use cases for merging neurons that cannot be handled by
merge, or simply disbursing ICP to the other neuron’s account. There may be some cases where this results in an inconvenience, but ultimately merging neurons is largely a convenience feature.
The proposed mitigations are visible here:
Disable merging neurons that are dissolved or dissolving · dfinity/ic@5bb8262 · GitHub (re-enables merging neurons in limited circumstances)
FOLLOW-1134 Reset aging timestamp if it's older than GENESIS - PRE_AGE · dfinity/ic@5d526de · GitHub (partially resets age for neurons)
Our proposed cleanup is to find any neuron that has a longer-than-possible age based on genesis and neuron-pre-aging of the genesis neurons, and to set those neurons to have an age equal to genesis.
We chose this route because 1) it is not easily possible to find the exact age of the neurons affected by this issue before they were affected, and 2) most neurons were not affected, and 3) the affected neurons have relatively small stakes.
While this solution is not ideal, the effort to exactly fix the affected neurons would be large, and would take our focus away from a lot of other important work. We are currently investigating the effort needed to do a more thorough fix in the future.
In the longer-term, a more systematic review of Neuron logic and invariants is underway, and a restructuring of code to make those assumptions more explicit and easier to maintain is being planned.
We plan to add some safeguards and checks on neurons to ensure that every modification results in changes between expected bounds and ensure that each Neuron maintains its implicit invariants explicitly. We will add add additional test cases to prevent regressions, and continue to improve this code.
When these code changes happen, we will evaluate further relaxing the proposed restrictions on merging neurons together.