Efficient ways to clone a canister’s data

From How would Internet Identity handle a denial of service attack? - #10 by faraz.shaikh

This got me looking into CUPs, or catch-up packages, which are very briefly described in section 8.2 of the DFINITY Foundation whitepaper.

From the whitepaper, section 8.2.

A CUP is a special message (not on the blockchain) that has (mostly) everything a
replica needs to begin working in a given epoch, without knowing anything about previous
epochs. It consists of the following data fields:

• The root of a Merkle hash tree for the entire replicated state (as opposed to the
partial, per-round certified state as in Section 6.1).
• The summary block for the epoch.
• The random beacon for the first round of the epoch.
• A signature on the above fields under the (n − f )-out-of-n threshold signing key for
the subnet.

To generate a CUP for a given epoch, a replica must wait until the summary block
for that epoch is finalized and the corresponding per-round state is certified. As already
mentioned, the entire replicated state must be hashed as a Merkle tree — even though a
number of techniques are used to accelerate this process, this is still quite expensive, which
is why it is only done once per epoch. Since a CUP contains only the root of this Merkle
tree, a special state sync subprotocol is used that allows a replica to pull any state that it
needs from its peers — again, a number of techniques are used to accelerate this process,
but it is still quite expensive. Since we are using a high-threshold signature for a CUP, we
can be sure that there is only one valid CUP in any epoch, and moreover, there will be
many peers from which the state may be pulled. Also, since the public key of the threshold
signature scheme remains constant over time, the CUP can be validated without knowing
the current participants of the subnet.
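
To make the quoted description a bit more concrete, here is a rough Python sketch of the idea as I read it: a CUP carries only a Merkle root, and a syncing replica checks whatever state it pulled from peers against that root. This is not the replica's actual implementation. The chunking, the use of SHA-256, the simple binary tree, and the names `CatchUpPackage`, `merkle_root`, and `state_matches_cup` are all assumptions for illustration, and the threshold signature is left as an unverified placeholder.

```python
import hashlib
from dataclasses import dataclass


def h(data: bytes) -> bytes:
    """SHA-256 digest (hash choice is an assumption for this sketch)."""
    return hashlib.sha256(data).digest()


def merkle_root(chunks: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree over state chunks."""
    if not chunks:
        return h(b"")
    # Leaf and inner hashes use different prefixes (domain separation).
    level = [h(b"\x00" + c) for c in chunks]
    while len(level) > 1:
        nxt = [
            h(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd node is promoted unchanged
        level = nxt
    return level[0]


@dataclass(frozen=True)
class CatchUpPackage:
    """Hypothetical container mirroring the four fields listed in the quote."""
    state_root: bytes      # Merkle root of the entire replicated state
    summary_block: bytes   # summary block for the epoch
    random_beacon: bytes   # random beacon for the first round of the epoch
    signature: bytes       # threshold signature (placeholder, not verified here)


def state_matches_cup(cup: CatchUpPackage, chunks: list[bytes]) -> bool:
    """Check state pulled from peers against the root carried in the CUP."""
    return merkle_root(chunks) == cup.state_root


# Example: build a CUP over three made-up state chunks, then verify them.
state = [b"canister-a heap", b"canister-b heap", b"canister-c heap"]
cup = CatchUpPackage(merkle_root(state), b"summary", b"beacon", b"")
print(state_matches_cup(cup, state))
```

The point of the sketch is only that the root alone is enough to validate state fetched from untrusted peers, which is what would make a CUP-based clone verifiable in principle.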

@faraz.shaikh The section describes how “expensive” a CUP sync is, but are there any performance tests showing how long a CUP sync takes to catch up on something like 1 GB of canister state?

The reason I ask is that I’m wondering whether CUP syncs could be used to clone or fork a canister at a given point in time (spinning up a new, distinct canister from a CUP sync).
