Tackling CertifiedData in Motoko

Continuing the conversation in: Recommended usage of CertifiedData - #14 by skilesare

I’ve taken the liberty to set up: Motoko Playground - DFINITY so that we can reason about this.

Super simple. addItem associates a Nat with another Nat. It also adds the key and value to a Merkle tree as provided by @nomeata. It returns a tuple: (the data passed to CertifiedData.set, i.e. the hashTree of the MerkleTree, and the witness to the object in the tree).

When you call getItem(item) I return a tuple (key, the certificate as returned by CertifiedData.getCertificate, the current witness to the item, the root of the MerkleTree).

I’ve messed with MerkleTrees a good bit, but am not really an expert. I’m missing a number of things here that would be nice to solve for:

  1. Why isn’t the root part of the witness? I’m guessing there is a theoretical function in MerkleTree called proveWitness(root, witness) that will roll up the witness from the bottom by hashing the items at each leaf and confirm that they roll up to the top where you should find the MerkleRoot. Seems like this root should be part of the witness. Maybe a #root(Hash, Witness, Witness)?

  2. Why isn’t there a function to do this in MerkleTree? It doesn’t seem like that bad of a function to write if I can get a couple of assumptions cleared up.
    a) How is

        // this
        let keyPrefix = hp(k);
        let keyHash = prefixToHash(keyPrefix);

        // different from this
        let valueHash = h(v);

        // hp takes the sha256 and prefixToHash converts it to a blob
        // h does both in the same function; can this be refactored?

    b) It looks like the leaves and forks get a prefix all the time? Or just sometimes? See mkLeaf, mkFork.

  3. The certificate looks like a long blob. I’m guessing there exists a function verifyCertificate(cert, data) that will confirm the certificate for the data. In this case the data is the Merkle root.
    a) What is the function and what is the output?
    b) Is one of the outputs the current Signing Key for MainNet?
    c) If so, how do I get that?

  4. I also think that I probably shouldn’t be returning the certificate in my getItem and should instead provide just the witness and data. I should get the certificate somewhere else. Where do I get it? I’m guessing it is a system canister call, but I can’t seem to find it on ICRocks anywhere.

  5. I don’t want to mess around with writing a witness prover if a new paradigm is “coming soon” © Dom. :wink:
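On the prefix and proveWitness questions above: in the hash-tree scheme of the Interface Specification, every node kind is hashed together with its own domain separator, so (assuming the MerkleTree library follows that scheme) leaves and forks always get a prefix. Checking a witness then amounts to recomputing the root hash from the witness and comparing it with the certified root, which is why the root does not need to travel inside the witness. Below is a minimal sketch in Python, chosen only because SHA-256 is in its standard library. The tuple encoding of witness nodes and the names `domain_sep`, `reconstruct`, and `prove_witness` are assumptions made for illustration, not the library's API.

```python
import hashlib


def domain_sep(name: str) -> bytes:
    """Domain separator: one length byte followed by the ASCII name."""
    encoded = name.encode("ascii")
    return bytes([len(encoded)]) + encoded


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


# Hypothetical Python encoding of a witness, mirroring a variant type:
#   ("empty",)
#   ("pruned", hash_bytes)
#   ("leaf", value_bytes)
#   ("labeled", label_bytes, subtree)
#   ("fork", left_subtree, right_subtree)
def reconstruct(witness) -> bytes:
    """Recompute the root hash of a (possibly pruned) hash tree."""
    tag = witness[0]
    if tag == "empty":
        return sha256(domain_sep("ic-hashtree-empty"))
    if tag == "pruned":
        # A pruned node already carries the hash of the hidden subtree.
        return witness[1]
    if tag == "leaf":
        return sha256(domain_sep("ic-hashtree-leaf") + witness[1])
    if tag == "labeled":
        return sha256(
            domain_sep("ic-hashtree-labeled") + witness[1] + reconstruct(witness[2])
        )
    if tag == "fork":
        return sha256(
            domain_sep("ic-hashtree-fork")
            + reconstruct(witness[1])
            + reconstruct(witness[2])
        )
    raise ValueError(f"unknown witness node: {tag!r}")


def prove_witness(root: bytes, witness) -> bool:
    """True if the witness rolls up to the expected root hash."""
    return reconstruct(witness) == root
```

A witness for one key keeps the path to that key and replaces every other subtree with a `("pruned", hash)` node, so `prove_witness(root, witness)` holds for the pruned tree exactly when it holds for the full one.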

I don’t have much knowledge of this stuff, but just in case you missed it, @nomeata did an overview here:

That was super helpful and helped answer some questions. One question it leaves me with is what the format of the certificate is. It looks like it should have Time, Principal, and Data in it, but I’m not sure how to parse it out.

It also looks like a client could get certified data from the system state tree for double verification. Are we able to get the things listed here: The Internet Computer Interface Specification :: Internet Computer in Motoko? Perhaps there is another way to go about it. I was hoping the aaaaa-aa canister (The Internet Computer Interface Specification :: Internet Computer) might expose the canister state, but I don’t see that in the did spec.

ReadState would be more useful if we had access to the messageID, but I don’t think we have access to that in motoko. Perhaps it could be added to the msg object for shared functions? I could see wanting to do some retroactive analysis on requests and if they were rejected or accepted.

Finally…according to The Internet Computer Interface Specification :: Internet Computer the certificates are encoded in CBOR. This raises the question of whether there is a CBOR library for Motoko anywhere.

As far as I know there is no CBOR implementation for Motoko yet. Perhaps this would be a decent starting point: GitHub - ygrek/ocaml-cbor: OCaml CBOR generic decoder/encoder, RFC 7049, http://cbor.io/ but I can’t vouch for it. Note that OCaml uses 8-bit chars to represent bytes and strings of 8-bit chars for byte sequences (yuck).
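Only a small subset of CBOR is needed to pull a certificate apart: unsigned integers, byte strings, text strings, arrays, maps, and tags. The sketch below decodes that subset in Python. It is an illustration rather than a complete CBOR implementation, and `decode_item` is a hypothetical name. It assumes definite-length encodings and does not handle negative integers, floats, or simple values.

```python
def decode_item(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at pos; return (value, next_pos)."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1

    # Read the argument that follows the initial byte.
    if info < 24:
        arg = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 bytes, big-endian
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("indefinite lengths are not supported in this sketch")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 2:  # byte string
        return data[pos:pos + arg], pos + arg
    if major == 3:  # text string
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    if major == 5:  # map of arg key/value pairs
        result = {}
        for _ in range(arg):
            key, pos = decode_item(data, pos)
            value, pos = decode_item(data, pos)
            result[key] = value
        return result, pos
    if major == 6:  # tag: ignore the tag number, return the tagged item
        return decode_item(data, pos)
    raise ValueError(f"unsupported CBOR major type {major}")
```

A 32-byte hash shows up in the raw bytes as `0x58 0x20` followed by the hash, where `0x58` means "byte string with a one-byte length" and `0x20` is 32.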

I’m not sure why Motoko would want to verify the certificates. Isn’t the use case to verify the results of unreliable ingress queries (i.e. from a browser)? An inter-canister query call goes through consensus and doesn’t require certification.

Perhaps the best folk to chime in here are @nomeata and @roman-kashitsyn.

I was informed that query calls are subject to a bad actor because query just goes to one canister and thus are not subject to consensus. It is a huge bit of complexity off my plate if that is not the case…but I almost think I’d like it that way because it means that I’m getting the fastest response to my query that I can…I at least want some way to NOT have to go through consensus if I can get the same guarantees through a witness/cert.

Yeah, this is confusing.
Query ingress messages are vulnerable because they talk to a single replica (not canister) and don’t require consensus.

When a canister’s update method makes a call to a query method, consensus is required, or the entire update would be less trustworthy. This is why query calls from update methods don’t buy the same performance gains as ingress calls to query methods.

At least, that’s my understanding.

That would be great to have confirmation on. And it might be worth putting a shim in there long term for folks that know what they are doing because trusted queries during update calls would be really nice and amplify scalability. I get that it would require expert usage. I guess I’ll stop working on a motoko cbor library :joy:

Motoko demo putting a bunch of pieces together and writing a new reconstruct function.

https://m7sm4-2iaaa-aaaab-qabra-cai.raw.ic0.app/?tag=3177226102

I was off the grid for a while and lost track of which questions are still open here, if any.

  1. Confirmation that you don’t need to verify a cert after a query call from one canister to another because it goes through consensus.

  2. Comment on whether it would be possible to have a raw query during inter-canister calls that would avoid consensus and rely on the canister operator to verify a returned certificate. (Maybe this breaks determinism?)

  3. What is the format of a cert, and can we get a Motoko function for converting it into a witness, signature, etc. (a CBOR serializer/deserializer) in case there emerges a reason to verify old certs? We’d also need the Motoko version of certificate and signature verification.

  4. Can motoko call the state functions to get data on calls and/or certificate information from the state tree? Example: I want to record the current root and a cert now and later someone can prove they had a value in the canister at time x by providing a witness.

I’ll review the old thread and see if there is anything else.

Correct. The update/query call distinction doesn’t even exist for inter-canister calls.

Not possible in the current architecture, precisely because it breaks determinism.

One can imagine a “fast path consensus for non-mutating calls” that would provide a middle ground, but that is far future work.

You need to distinguish between the system certificate (which is CBOR as per the Interface Specification and contains only the “root hash” of the application’s data structure) and the application’s certificate, which of course is completely up to the application.

I don’t follow what the use case is here?

Why? Verification only happens on the client side, and Motoko is (so far) purely a canister-side language.

No, it has to use the system API. Data on calls is intentionally not exposed there (it’d be a layering violation), the same way a Unix application using TCP doesn’t see IP packets or sequence numbers.

The system certificate can and should be obtained via the system API, and in Motoko via the CertifiedData module.

So my suggestion for a Motoko service that certifies its query calls is to use a library like my motoko-merkle-tree, and use the built-in Candid support to transfer the witness (a.k.a. pruned tree) to the client, where you can turn it into the “decoded” data structure used by the JS library for verification. This way no CBOR is needed.

For certified HTTP assets, some simple CBOR and base64 encoding needs to be written first, indeed.

Hmmm…the signed data is up to me, but when I call CertifiedData.getCertificate() it definitely returns a byte array that I wouldn’t know how to interpret. I see the hash that I signed in there with a 32 in front of it (which I’ve deduced from the CBOR spec).

Use case:

I’m thinking of a use case where you want to prove data was committed to in a previous time period.

Consider an auction

T=0 - Canisters B, C, D separately and secretly commit to a bid.

T=1 - Canister A asks Canisters B, C, D for their current certificates.

T=2 - Canister A closes the auction.

T=3 - Canisters B, C, D reveal their witnesses showing their committed values.

T=4 - Canister A confirms the witnesses and awards the auction to the highest bidder.

To do this it would be great to be able to read the certificate metadata in getCertificate() and be able to verify the dig in T=1. Maybe even just a getCertificate Candid would be helpful?

Generally, is it potentially useful for a motoko canister to act like a client?

It seems that Rust canisters get to walk this line between a pure canister and accessing underlying system calls. Should the rule be that “any canister that could be written in Rust should also be possible to write in Motoko”? If not, the language becomes much less appealing for a developer to commit to learning.

Re: merkletrees

I’ve put the example together at Motoko Playground - DFINITY that shows what comes back from getCertificate, and yes, uses your library to create the root.

Only the client ought to have to interpret that, the canister (and hence Motoko) just passes it through.

Your example doesn’t involve outside clients, but only canisters, so everything is happening in the happy world of deterministic execution, and I don’t think you need to or should reach for certification here. Instead, you can solve it differently. For example, the canisters could just commit by sending their bits to another canister that notes them together with a timestamp, and forwards them when done?

I would phrase it as Motoko vs. raw Wasm-level System API (rust just happens to provide a relatively low level mapping of the raw System API - but not even rust allows you to do every useful thing possible there).

That is probably worth its own thread, as it’s a hard question. For example Motoko fully embraces Candid, which is a deliberate choice. But is it the right choice?

Again analogies can be helpful. The Unix System API gives user level programs access to TCP, but not raw IP. Is that good or bad? Some programs cannot be written, but on the other hand interoperability is improved by that restriction.

Thanks for the reply. This makes sense. I may try to document the system calls in a table and show how you can/cannot do things in Rust vs. Motoko. I’ve already discussed with @claudio that we really do need a call_raw(blob):blob for basic pass-through wallet functionality.

I feel a bit better about this from a Motoko standpoint. I’d still love to see a functional spec of verifying a signature in case I get crazy and want to write a COBOL client. :grimacing: I think the site has a good tech spec, but a functional, step-by-step process with exact curves and encryption schemes would be instructive.

I think it is something like:

  1. Decode cert using cbor. Components: Signature, tree
    A. Cert is a _________ type signature.
    B. Is a merkle tree with definition (link to merkle tree class)
  2. Find cert in leaf of merkle tree and verify root of tree using .reconstruct.
  3. Retrieve root bls key from system
  4. Verify signature is signed by root bls key and is of data root of tree using (function) of (crypto class)
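Steps 2 and 4 above can be sketched without any crypto library beyond SHA-256. Per the Interface Specification, the canister's certified data sits in the certificate's tree under the path `canister / <canister id> / certified_data`, and the BLS signature covers the domain separator `ic-state-root` followed by the root hash of that tree. The Python sketch below works on trees in the nested-array form that CBOR decoding yields (`[0]` empty, `[1, l, r]` fork, `[2, label, tree]` labeled, `[3, value]` leaf, `[4, hash]` pruned). All function names are hypothetical, the lookup is simplified (it does not distinguish "absent" from "pruned away"), and BLS verification and subnet delegations are left out entirely.

```python
import hashlib

EMPTY, FORK, LABELED, LEAF, PRUNED = 0, 1, 2, 3, 4


def _hash(name: bytes, *parts: bytes) -> bytes:
    """SHA-256 over a length-prefixed domain separator plus the parts."""
    return hashlib.sha256(bytes([len(name)]) + name + b"".join(parts)).digest()


def reconstruct(tree) -> bytes:
    """Recompute the root hash of a hash tree in nested-array form."""
    kind = tree[0]
    if kind == EMPTY:
        return _hash(b"ic-hashtree-empty")
    if kind == FORK:
        return _hash(b"ic-hashtree-fork", reconstruct(tree[1]), reconstruct(tree[2]))
    if kind == LABELED:
        return _hash(b"ic-hashtree-labeled", tree[1], reconstruct(tree[2]))
    if kind == LEAF:
        return _hash(b"ic-hashtree-leaf", tree[1])
    if kind == PRUNED:
        return tree[1]
    raise ValueError(f"unknown node kind {kind!r}")


def find_label(tree, label: bytes):
    """Return the subtree under label, or None if it is not visible."""
    if tree[0] == LABELED:
        return tree[2] if tree[1] == label else None
    if tree[0] == FORK:
        found = find_label(tree[1], label)
        return found if found is not None else find_label(tree[2], label)
    return None


def lookup(tree, path):
    """Follow a list of labels; return the leaf value or None."""
    for label in path:
        tree = find_label(tree, label)
        if tree is None:
            return None
    return tree[1] if tree[0] == LEAF else None


def signed_message(cert_tree) -> bytes:
    """Bytes the BLS signature is expected to cover (verification not shown)."""
    name = b"ic-state-root"
    return bytes([len(name)]) + name + reconstruct(cert_tree)


def certified_data_matches(cert_tree, canister_id: bytes, app_witness) -> bool:
    """Check that the application's witness rolls up to certified_data."""
    certified = lookup(cert_tree, [b"canister", canister_id, b"certified_data"])
    return certified is not None and certified == reconstruct(app_witness)
```

A real verifier would additionally check the signature over `signed_message(cert_tree)` against the root (or delegated subnet) public key and check the `time` entry for freshness.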

Yes, such a tutorial/explainer would be useful. The pieces are all there (in the Interface Specification), but hard to put together.

There is also a talk on response certification which was recorded a good while ago, and will hopefully soon appear at … oh, has been published without me knowing!

(A bit more high level than what you wanted to know, but still helpful for background information.)

I think I thought of a decent use case:

Canister A holds some state that is important to a decision on Canister B. Canister B could query Canister A for that state during an update call, but that will result in the cycle cost of an XCanister call and a round trip waiting for consensus.

Instead it would be nice if A could push the certified data to B once and then the client could deliver the certificate with the update call with a witness and validate any number of pieces of data on A without the XCanister call.

Example: A is a ledger of Tokens X. B awards a Token Y if you burn tokens X. A can update B on the root and all burns → mint can be executed without a XCanister call.

Maybe a bit contrived, but sure, I’ll take it. Now we just need volunteers to write the corresponding libraries for Motoko :slight_smile:

What are those for the certificate?

I think it is cbor encoded so we’d need to parse a subset of cbor values. What are the fields and types?

What crypto library do we need for validating the signature?