Documentation: threshold ECDSA

Hi,

I heard about the GS23 paper, which describes a new threshold ECDSA implementation, and I was super excited to understand more about it. So I tried to go over the GitHub repo (dfinity/ic: Internet Computer blockchain source, the client/replica software run by nodes), but I did not find enough documentation to help me. I only found this (DFINITY - Sign In), but it seems to be password protected, and I am not able to access it with my forum account (maybe it needs another account).
Is there any link explaining the purpose of each part (“the goal of this folder is X, the goal of this file is Y, …”)?
Is there any advice/direction that you could give me so I have better chances to succeed?
My first goal would be to implement the DKG in an isolated environment.

Thank you!


Hi @Cryptouf! You can find our core implementation of the threshold ECDSA (tECDSA) protocol in the ic-crypto-internal-threshold-sig-ecdsa library crate, which lives in rs/crypto/internal/crypto_lib/threshold_sig/tecdsa. At the beginning of its src/lib.rs you will find some high-level documentation explaining the purpose of the various parts.

To see the protocol in action, you can look at the various tests in that crate. For example, in tests/protocol you find a test called should_basic_signing_protocol_work that first performs some pre-computation by running several DKGs, and then threshold-signs a message and finally verifies the combined signature.
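
If it helps, here is a heavily condensed sketch of that flow in Rust; every function and type below is an illustrative placeholder, not the crate's actual API:

```rust
// Condensed sketch of the signing-protocol test flow; all functions and
// types are placeholders standing in for the real crate API.
struct Transcript;
struct SigShare;
struct CombinedSig;

// Pre-computation: IDKG rounds produce the key and presignature transcripts.
fn run_dkgs_for_key() -> Transcript { Transcript }
fn run_dkgs_for_presignature(_key: &Transcript) -> Transcript { Transcript }
// Each receiver derives a signature share from its transcript shares.
fn sign_shares(_k: &Transcript, _p: &Transcript, _msg: &[u8]) -> Vec<SigShare> { vec![SigShare] }
// Anyone can combine enough shares and verify the resulting ECDSA signature.
fn combine_shares(_shares: &[SigShare]) -> CombinedSig { CombinedSig }
fn verify_combined_signature(_sig: &CombinedSig, _msg: &[u8]) -> bool { true }

fn main() {
    let key = run_dkgs_for_key();
    let presig = run_dkgs_for_presignature(&key);
    let shares = sign_shares(&key, &presig, b"message");
    let sig = combine_shares(&shares);
    assert!(verify_combined_signature(&sig, b"message"));
}
```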

We also provide some documentation on tECDSA in ic/canister_threshold_sig.rs at master · dfinity/ic · GitHub. This is where we define the traits for implementing the tECDSA protocol on the Internet Computer’s crypto component, i.e., on a higher layer in the protocol stack where we don’t directly deal with raw keys but with nodes. If you want to see the protocol in action on that layer, I suggest you look at the tests in ic/canister_threshold_sigs.rs at master · dfinity/ic · GitHub. Particularly the test should_verify_sig_shares_and_combined_sig_successfully might be interesting to you, as it exercises the entire protocol including pre-computation, threshold signing, and signature verification.


This is really helpful, I will study it. Thank you for your detailed answer!


Hi @franzstefan,

As I am still working on this, I am facing some issues.

The first one concerns the verification steps. The create_dealings function calls a privately_verify_dealing function, which relies on a decrypt_and_check function; but then in open dealing, the decrypt_and_check function is re-used. Why is it done twice?

The second one is in transcript verification. There is a check to verify whether two transcripts are equal, but they are constructed in the same way with the same arguments, so they should always be equal. I don’t understand what is tested here.

Thank you for your help !

Hi @Cryptouf! Thanks for your interest in the protocol!

The first one concerns the verification steps.

I could not follow the link, but I assume you are referring to this function. It is worth noting that this function is only used in tests; its purpose is to check that everything works as expected, i.e. it creates some dealings and checks that they pass public and private verification. Moreover, if the test requires it, it also allows corrupting some of the dealings (i.e. messing with some of the ciphertexts), and then checks that the corrupted dealings no longer pass private verification.

Why is it done twice?

This question is still valid. However, the above test utility may be a bit confusing because it emulates the dealing creation/verification protocol for all the nodes, so it may not be clear that there is interaction between different nodes. Decryption of a dealing may happen in three different situations:

  • Private verification of a dealing: each receiver privately verifies a dealing by checking the correctness of its ciphertext. This is done by decrypting the ciphertext and checking that the share is consistent with the commitment included in the dealing. Even though a receiver has verified a dealing, that dealing may not be selected for inclusion in the transcript, in which case its share will never be used. For this reason the nodes do not actually store the share at this stage; in fact, the verification function does not even return the share, it just verifies it internally.
  • During the loading of a transcript: this happens after the nodes have reached consensus on which dealings are included in a transcript. In this step the nodes decrypt all the dealings in the transcript to obtain the shares from all the different dealings. These are then combined, and the resulting share is stored on the node.
  • During the opening phase of the complaint protocol: it could happen that a receiver cannot decrypt a dealing, even though it has been included in a transcript. If this happens, the receiver can issue a complaint against that specific dealing and broadcast it to the other receivers. The receivers can verify the correctness of the complaint and then help the complainer reconstruct its own share. In this phase the openers, i.e. the receivers of a complaint, can use the open_dealing function to extract the share for that specific dealing and send it to the complainer. Since the nodes only store the combined secret, they have to recompute this on demand.

Another couple of remarks:

  • Even if the nodes stored the shares for all individual dealings in a transcript, the complainer may be behind the rest of the subnet, so the opener may no longer have the share available; in the worst case they would have to recompute it.
  • Optimistically, the protocol should almost never enter the complaint phase, because malicious dealers can be punished; this is why we don’t heavily optimize for this scenario.

Again, it may not be clear that some operations are done by different nodes. During transcript creation, the blockmaker would construct a transcript and send this out to other nodes in the subnet as part of a block. The validators of the subnet would then verify the block proposal, i.e. verify all messages included in the block, which may include a transcript proposal. Since the creation of the transcript is deterministic (given the same set of dealings), they can verify the transcript by recomputing it and checking that the result matches the proposed transcript.
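
A minimal sketch of that verify-by-recomputation pattern, with placeholder types standing in for the real ones:

```rust
// Verify-by-recomputation: since transcript creation is deterministic in the
// agreed set of dealings, a validator re-runs it and compares the results.
// All types here are simplified placeholders, not the actual crate API.
#[derive(PartialEq)]
struct Transcript;
struct Params;
struct Dealings;

fn create_transcript(_p: &Params, _d: &Dealings) -> Result<Transcript, ()> {
    Ok(Transcript) // placeholder: the real function derives it from the dealings
}

fn validate_transcript_proposal(p: &Params, d: &Dealings, proposed: &Transcript) -> bool {
    match create_transcript(p, d) {
        Ok(recomputed) => recomputed == *proposed,
        Err(_) => false,
    }
}
```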

Hope this helps!


Thank you for your answer, @andrea!
Indeed, what was hard to understand was where exactly the emulation simulating different participants running different parts of this code takes place.
Overall, this is what I understood:

  • Each dealer runs the create_dealing function and subsequently posts the result on chain. In the emulated version, the dealer_index is selected through a loop iteration. In a real-world scenario, how is the dealer index assigned to each participant?
  • For every dealing that has been posted on chain, P_i runs decrypt_and_check on his part. If the result does not match the commitments, we go to the complaints phase (I have not understood all of this part: I don’t see the FixBadShares implementation in the tecdsa folder, I only see generate_complaints). If the result matches, then we go to the ACS part: P_i signs a message saying that the share of P_j is valid, using a threshold BLS scheme (again, I have not gone through all of this part, so I don’t know exactly when the DKG for the BLS scheme happens), and posts that signature on chain.
  • Once there are enough signatures for P_j, he can aggregate them into a certificate (with the threshold BLS scheme) and post the result on chain. Every participant who is able to post a certificate on chain is now a member of the verified dealers.
  • Once we have at least t certificates, we can create a transcript with the list of verified dealers. My question here is: when do we know that we have enough verified dealers? If we have t certificates posted online, then some dealers calculate a transcript, and then someone else posts their certificate, we could end up with two groups of people agreeing on different transcripts. How do we reach consensus? Until which step do we have to wait before the list of verified dealers can no longer be modified?
  • Once everyone has agreed on a transcript, every verified dealer computes its secret share locally.

Then, this process is repeated in a reshare_of_masked version so we can have the final version of the shares. Why do we need to do this? Why can’t we just run the random protocol? In the paper, section A.3 explains that we can sometimes “relax the hiding property”. Why relax it, if it introduces more communication rounds and less security?

In the process that I have described, I have not talked about optimizations yet: participants do not directly post the BLS signature for a dealing on chain. They wait until they have a batch of k dealers, and then they post all the signatures in one transaction. How do we choose k? There are also other optimizations that I have not understood yet, particularly the DispersedLedger AVID optimization. Where is that part implemented in the repo?

Thanks again for your valuable input @andrea


In a real-world scenario, how is the dealer index assigned to each participant?

It’s up to the consensus layer to determine who participates in a given instance of the IDKG protocol. At the moment we only have two cases: the participants of IDKG are either all members of the same subnet, or of 2 distinct subnets (i.e. the dealers from one subnet and the receivers from a different one). As part of the consensus protocol, nodes in a subnet agree on a registry version (which is included in a block) that specifies the members of all subnets at a given time. Whenever a subnet initiates an IDKG protocol, all nodes construct an IDkgTranscriptParams, which is used to agree on the set of dealers and receivers as well as other general info. A receiver’s index corresponds to its position in the sorted set of all receiver node IDs. The dealer’s index is only relevant when we either do a resharing of a previous transcript or compute the product of 2 previous transcripts; it is equal to the position each dealer had (as a receiver) in the previous transcript, i.e. the one being reshared or multiplied.
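
For illustration, deriving a receiver’s index from the sorted node IDs could look roughly like this (NodeId is a placeholder for the real type):

```rust
use std::collections::BTreeSet;

// Placeholder for the real node ID type; what matters is that it is ordered.
type NodeId = [u8; 32];

// A receiver's index is its position in the sorted set of receiver node IDs.
// BTreeSet iterates in sorted order, so the iteration position is the index.
fn receiver_index(receivers: &BTreeSet<NodeId>, node: &NodeId) -> Option<u32> {
    receivers.iter().position(|id| id == node).map(|i| i as u32)
}
```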

For every dealing that has been posted on chain, P_i runs decrypt_and_check on his part

Small clarification: most of the protocol actually happens off-chain to reduce latency. E.g., dealings are not individually put on chain; what goes on chain is the final transcript, which includes the set of dealings that must be used. So the complaint phase can only concern a dealing included in a transcript. Note that the complaint phase also happens entirely off-chain.

I don’t see the FixBadShares implementation

It is part of the same library. The fix-bad-shares protocol requires the following sub-routines (a rough sketch of how they fit together follows the list):

  • Issue a complaint, implemented by the generate_complaint function.
  • Verify a complaint, implemented by the verify_complaint function.
  • Force-open a dealing, implemented by the open_dealing function.
  • Verify the opening of a (force-opened) dealing, implemented by the verify_dealing_opening function.
  • Compute the secret share using the openings for bad shares, implemented by the compute_secret_shares_with_openings function.
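
Here is the promised sketch of how these sub-routines fit together; the types and signatures are simplified stand-ins, so check the crate for the real APIs:

```rust
// Simplified placeholder types and stubs; the real functions live in the
// tecdsa crate and take considerably more context (indices, keys, params).
struct Dealing;
struct Complaint;
struct Opening;
struct Share;

fn generate_complaint(_d: &Dealing) -> Complaint { Complaint }
fn verify_complaint(_c: &Complaint, _d: &Dealing) -> bool { true }
fn open_dealing(_d: &Dealing) -> Opening { Opening }
fn verify_dealing_opening(_d: &Dealing, _o: &Opening) -> bool { true }
fn compute_secret_shares_with_openings(_d: &Dealing, _os: &[Opening]) -> Share { Share }

fn fix_bad_share(bad_dealing: &Dealing) -> Option<Share> {
    // 1. The complainer, unable to decrypt its share, issues a complaint.
    let complaint = generate_complaint(bad_dealing);
    // 2. Each opener verifies the complaint before helping.
    if !verify_complaint(&complaint, bad_dealing) {
        return None;
    }
    // 3. Openers force-open the dealing and send the opening to the complainer.
    let opening = open_dealing(bad_dealing);
    // 4. The complainer verifies the opening it received...
    if !verify_dealing_opening(bad_dealing, &opening) {
        return None;
    }
    // 5. ...and recomputes its secret share using the collected openings.
    Some(compute_secret_shares_with_openings(bad_dealing, &[opening]))
}
```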

If the result matches, then we go to the ACS part

Not exactly. The complaint phase would only happen after the ACS. Before agreeing on the set of dealings to use (i.e. agreeing on a transcript), receivers only sign dealings that pass private verification. If a dealing does not pass this check, the receivers won’t sign it. However, even if a dealing does not privately verify for a certain receiver, it could still verify for all the others, so such a dealing could end up in a transcript. It’s only after the receivers have agreed on a transcript that they may issue complaints about specific dealings.

using a threshold BLS scheme […] and posts that signature on chain.

Actually, the “support” signature the receivers apply to the dealings is just a standard ed25519 signature, not a BLS threshold signature. At some point we used BLS multi-signatures; however, the verification of signature shares is one of the bottlenecks of the protocol, and verifying one BLS signature is about two orders of magnitude more expensive than verifying an ed25519 one. Signatures are not individually posted on chain (again, this is to minimize latency). First they are broadcast to the other nodes; then anybody who has collected enough support shares can “aggregate” them into a certificate. Eventually, a blockmaker will have enough certified dealings to construct a transcript, which they can include in their block proposal to add it to the chain.
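
Schematically, the off-chain aggregation step might look like this (simplified placeholder types, not the repo’s actual code):

```rust
// A "certificate" here is just a bundle of enough ed25519 support signatures
// on a dealing; types are simplified placeholders.
struct Ed25519Sig;
struct Certificate {
    signer_indices: Vec<u32>,
    signatures: Vec<Ed25519Sig>,
}

// Anybody who has collected enough support shares can bundle them up.
fn try_certify(support: Vec<(u32, Ed25519Sig)>, threshold: usize) -> Option<Certificate> {
    if support.len() < threshold {
        return None; // keep collecting broadcast support signatures
    }
    let (signer_indices, signatures) = support.into_iter().unzip();
    Some(Certificate { signer_indices, signatures })
}
```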

when do we know that we have enough verified dealers?

I think this is now mostly answered by the above, but let me summarize it again. Only the final transcript goes on chain; all other artifacts are just broadcast to all the participants. The nodes collect the various dealings and signature shares, and combine these signatures into certificates. Once a blockmaker has enough dealings and certificates, they can propose a new transcript by including it in a block. The other nodes will only accept this if the dealings and certificates are valid, and if it contains enough dealings. The minimum number of dealings to be collected is defined as the collection_threshold, which can be computed, for example, from the IDkgTranscriptParams. After a transcript is included in a finalized block, the nodes will not try to collect any more dealings and will not accept a new transcript (for the same instance of the protocol).
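
The acceptance check the other nodes apply can be pictured like this (again with placeholder types; the real threshold comes from the IDkgTranscriptParams):

```rust
// Sketch of how a node might accept or reject a proposed transcript.
struct CertifiedDealing;
struct TranscriptProposal {
    dealings: Vec<CertifiedDealing>,
}

fn dealing_is_valid(_d: &CertifiedDealing) -> bool { true } // placeholder check

fn accept_proposal(proposal: &TranscriptProposal, collection_threshold: usize) -> bool {
    // Enough certified dealings, and every dealing/certificate must verify.
    proposal.dealings.len() >= collection_threshold
        && proposal.dealings.iter().all(dealing_is_valid)
}
```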

Why do we need to do this? Why can’t we just run the random protocol?

This is explained, a bit informally, in the docs here. The paper describes a dedicated OpenPower sub-protocol, which is used to reveal the public key of a key generated via the Random protocol. This is needed, e.g., because during signing you need to know the public key corresponding to the secret. However, we decided not to implement that protocol and instead to reuse the reshare_of_masked protocol. The resharing protocol reveals the public key, exactly as in the OpenPower protocol, but it also enables the resharing of the secret key. The main reason behind this was to minimize the number of sub-protocols used in IDKG and to simplify their orchestration. Note that the resharing protocol is needed anyway, e.g. in case of a membership change in the subnet.

Why relax it, if it introduces more communication rounds and less security?

We don’t make any security trade-off here. The hiding property is important during the generation of the key, e.g. to avoid introducing biases in the public key (which could actually lead to some attacks). Once the key is generated it cannot be changed, and therefore we can switch the commitment scheme to an unmasking one. The only thing that is then revealed is the corresponding public key, which is exactly what would be revealed by the OpenPower protocol. In fact, the same commitment scheme is used in that sub-protocol.
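
As a toy illustration of masked vs. unmasked commitments, in a tiny modular group purely for readability (the real protocol of course works over an elliptic curve):

```rust
// Toy masked (Pedersen) vs. unmasked commitments in a small prime group.
// A masked commitment g^s * h^r hides the secret s behind the randomizer r;
// an unmasked commitment g^s reveals exactly the "public key" of s.
const P: u64 = 101; // toy prime modulus
const G: u64 = 2;   // toy generator
const H: u64 = 3;   // second generator, with unknown discrete log w.r.t. G

fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

// Masked: hiding, suitable while the key is being generated.
fn masked_commit(s: u64, r: u64) -> u64 {
    pow_mod(G, s, P) * pow_mod(H, r, P) % P
}

// Unmasked: reveals g^s, i.e. the public key, once the key is fixed.
fn unmasked_commit(s: u64) -> u64 {
    pow_mod(G, s, P)
}
```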

How do we choose k? There are also other optimizations that I have not understood yet, particularly the DispersedLedger AVID optimization. Where is that part implemented in the repo?

I think the paper specifies what is, asymptotically, the best choice of k. In practice, we would need to evaluate the concrete latency/throughput trade-offs to set the value of k. Note, however, that these optimizations have not been implemented yet. We are currently still in the process of evaluating the best strategy to improve the concrete efficiency of the protocol.


Actually, the “support” signature the receivers apply to the dealings is just a standard ed25519 signature, not a BLS threshold signature

Have you tried any of the known optimizations for BLS multi-signatures before switching to ed25519? I am talking about the following:

Yes, we took into account batch verification techniques like the first one you mentioned. A more general treatment of these can be found in this paper. Note that the bottleneck of the protocol is the verification of the signature shares, not of the combined multisig. Of course, you could try to skip the verification of the shares and just combine and verify. However, if the combined verification fails, you still need to verify all the shares individually, so in the worst case the complexity is no better.
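
The worst-case argument is easy to see in a small sketch (placeholder verification functions):

```rust
// Optimistic strategy: combine first and verify only the multisig; if that
// fails, fall back to verifying every share individually. Placeholder types.
struct Share;
struct MultiSig;

fn combine(_shares: &[Share]) -> MultiSig { MultiSig }
fn verify_combined(_sig: &MultiSig) -> bool { true }
fn verify_share(_share: &Share) -> bool { true }

fn optimistic_verify(shares: &[Share]) -> Vec<bool> {
    let combined = combine(shares);
    if verify_combined(&combined) {
        // Fast path: one verification accepts all shares.
        vec![true; shares.len()]
    } else {
        // Worst case: no better than verifying each share from the start.
        shares.iter().map(verify_share).collect()
    }
}
```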

Regarding the second link, I think it has the same efficiency as the BLS multisig we are using. In our scheme we use proofs of possession of the keys (which are verified once and for all), so that it is possible to combine the public keys by simply multiplying them. This is the same aggregation used in that paper. The difference is that they use a randomization of the key instead of a proof of possession. However, this randomization of the key looks very similar to a proof of possession, so I think the main difference is in how they prove the security of the scheme.

For the subnet sizes we are dealing with now, it is more efficient to use ed25519, since it is a couple of orders of magnitude more efficient than BLS. Note that we use batch verification even for ed25519. If we get to a point where we deal with much larger subnets, then BLS could become more convenient. However, I suspect it is more likely that we will have implemented batched dealings by that point, so this would be less relevant.
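
For reference, ed25519 batch verification as a technique looks like this, assuming the ed25519-dalek crate with its “batch” feature enabled (an illustration only, not the code used in the IC repo):

```rust
use ed25519_dalek::{Signature, Signer, SigningKey, VerifyingKey};
use rand::rngs::OsRng;

fn main() {
    // Four signers each sign their own message (e.g. a dealing hash).
    let keys: Vec<SigningKey> = (0..4).map(|_| SigningKey::generate(&mut OsRng)).collect();
    let msgs: Vec<&[u8]> =
        vec![b"dealing-0" as &[u8], b"dealing-1", b"dealing-2", b"dealing-3"];
    let sigs: Vec<Signature> = keys.iter().zip(&msgs).map(|(k, m)| k.sign(m)).collect();
    let vks: Vec<VerifyingKey> = keys.iter().map(|k| k.verifying_key()).collect();

    // One batched check instead of four individual verifications.
    assert!(ed25519_dalek::verify_batch(&msgs, &sigs, &vks).is_ok());
}
```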


What’s the point of fixing the bad shares if the transcript is already published? Are these shares used elsewhere afterwards?

Also, I know your threshold scheme is (f + 1)-out-of-n. Is f ≤ t − 1? And n > 2t − 1? I can’t find this information in your paper.

The reason why @mathcrypto is asking whether n ≥ 2t − 1 is that, when we run this test with nodes = 10, threshold = 6, and number_of_dealings_corrupted = 0, we get an InsufficientDealings error. After investigating, we found that this error comes from the multiply function, called in the create_transcript function. Is this normal? Starting from your original code, we only changed the values of threshold and number_of_dealings_corrupted.