Agent-JS 0.20.0 is released! (replica signed query edition)

Significant new features in this release!

Most notably, @dfinity/agent version 0.20.0 now handles query signature verification. All queries will automatically check for signatures in the state tree signed by the node that replies. This enables the client to verify that there have been no man-in-the-middle attacks without a developer needing to implement certified variables or use update calls.

Important note: if you are using a version of dfx lower than 0.15.2, you will need to set {verifyQuerySignatures: false} in your HttpAgent options, since the signatures were not present in earlier replica versions.
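For readers on an older replica, the shape of that option looks roughly like this (a minimal sketch: the interface is simplified from the real HttpAgentOptions type, and the host value is a placeholder for a local dfx replica):

```typescript
// Simplified stand-in for the relevant fields of agent-js's HttpAgent options.
interface AgentOptions {
  host?: string;
  verifyQuerySignatures?: boolean;
}

// For replicas older than dfx 0.15.2, disable the check explicitly:
const options: AgentOptions = {
  host: "http://127.0.0.1:4943", // placeholder local replica address
  verifyQuerySignatures: false,
};

// You would then pass these options to `new HttpAgent(options)`.
console.log(options.verifyQuerySignatures); // → false
```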

Additionally, this change replaces the package tweetnacl with @noble/curves for Ed25519 curve signatures and validation.

What’s Changed

Full notes available at: Release v0.20.0 · dfinity/agent-js · GitHub


While this is undoubtedly a nice feature, I assume it might impact performance. Do you have any benchmarks to share?

Additionally, how can a developer verify that the signature verification was successful?

As a quick follow-up, 0.20.1 is now released. It has new retry and cache pruning logic for the case where you have a cached set of subnet keys, but the actual node you have queried has recently been updated or replaced.
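The prune-and-retry idea can be sketched as follows (purely illustrative logic, not the actual agent-js implementation; all names here are stand-ins): on a verification failure, drop the cached subnet keys, refetch them, and verify once more before giving up.

```typescript
// Illustrative sketch of "prune cache and retry once" on verification failure.
type Keys = Set<string>;

const subnetKeyCache = new Map<string, Keys>();

async function fetchSubnetKeys(canisterId: string): Promise<Keys> {
  // Stand-in for a state-tree read; here the replacement node's key is present.
  return new Set(["node-key-2"]);
}

function verify(signerKey: string, keys: Keys): boolean {
  return keys.has(signerKey);
}

async function verifyWithRetry(canisterId: string, signerKey: string): Promise<boolean> {
  let keys = subnetKeyCache.get(canisterId) ?? (await fetchSubnetKeys(canisterId));
  subnetKeyCache.set(canisterId, keys);
  if (verify(signerKey, keys)) return true;
  // The node may have been replaced since the keys were cached:
  subnetKeyCache.delete(canisterId);        // prune the stale entry
  keys = await fetchSubnetKeys(canisterId); // refetch fresh keys
  subnetKeyCache.set(canisterId, keys);
  return verify(signerKey, keys);           // second and final attempt
}

// Simulate a stale cache: the node signing replies was recently replaced.
subnetKeyCache.set("canister-a", new Set(["node-key-1"]));
verifyWithRetry("canister-a", "node-key-2").then((ok) => console.log(ok)); // → true
```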


I don’t currently have performance benchmarks for the agent. The query certificates are present regardless of whether we check them (they have already been on mainnet), so the main new overhead is checking the subnet status to gather the valid public keys of nodes that are allowed to generate query responses. That result is also cached for subsequent requests, and the request_status call is made in parallel to the query to reduce blocking time.
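The "in parallel" point can be sketched with Promise.all (illustrative only; fetchSubnetKeys and performQuery are invented stand-ins, not agent-js functions):

```typescript
// Run the subnet-key fetch and the query concurrently, so key retrieval
// does not add its full latency on top of the query round trip.
async function fetchSubnetKeys(canisterId: string): Promise<Set<string>> {
  return new Set(["node-key-1"]); // stand-in for a state-tree read
}

async function performQuery(
  canisterId: string
): Promise<{ signerKey: string; reply: string }> {
  return { signerKey: "node-key-1", reply: "hello" }; // stand-in query call
}

async function verifiedQuery(canisterId: string): Promise<string> {
  const [keys, response] = await Promise.all([
    fetchSubnetKeys(canisterId),
    performQuery(canisterId),
  ]);
  if (!keys.has(response.signerKey)) {
    throw new Error("Invalid signature from replica");
  }
  return response.reply;
}

verifiedQuery("canister-a").then((reply) => console.log(reply)); // → hello
```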

The validation is done automatically unless you disable the feature, and failed checks will throw errors, but the process for checking lives here:

You can also use the CanisterStatus.request utility and request the subnet path to get access to the subnet / node info for a given canister ID and run the check yourself. I admit that inserting a mid-flight hook into the query method (and updates, for that matter) to inspect the call and run custom code is a worthwhile improvement we should look into as well.
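As an illustration of what "run the check yourself" amounts to at the crypto level, here is a self-contained Ed25519 sign/verify round trip using Node's built-in crypto module. This is not agent-js code: in the real flow, the node's public key would come from the subnet state tree rather than being generated locally.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Stand-in for a node's Ed25519 key pair; in reality the public key is
// published in the subnet state tree and fetched by the agent.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Stand-in for a query response body the node signs.
const responseBody = Buffer.from("example query reply");

// Ed25519 takes no separate digest algorithm, hence `null`.
const signature = sign(null, responseBody, privateKey);
const ok = verify(null, responseBody, publicKey, signature);
console.log(ok); // → true

// A tampered response fails verification:
const tampered = Buffer.from("tampered query reply");
console.log(verify(null, tampered, publicKey, signature)); // → false
```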

Thanks for the feedback!

While I really appreciate the addition of this feature, I must admit I would have been quite curious to see some benchmarks to better understand its impact. I of course trust and assume performance will still be great, but I still would have appreciated some numbers rather than “try and see”.

Regarding caching, what strategy is applied to cache the information? In all the apps I create for the foundation or myself, I do not preserve any state on the app side regarding the agent or the actor. In other words, every request recreates a new actor. Is this pattern supported by the caching strategy?

Sounds good. In the code you linked, this particular line throws an Error. Is this expected?

Btw, regarding the same snippet, shouldn’t this particular line throw an error as well? I assume it can never be reached, or do we assume that a response without a signature is a valid response?


@Shuo would you be able to run some benchmarks with an old agent and replica and see what the real world impact is?

Is this pattern supported by the caching strategy?

No, you don’t get any caching benefit with this strategy. I intend to externalize the cache as an enhancement, but currently the cache is local to the HttpAgent instance.
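The consequence can be illustrated with a toy model (all names here are invented; this is not the agent-js internals): a cache stored on the instance disappears whenever a fresh instance is created.

```typescript
// Toy model: the subnet-key cache lives on the agent instance, so a fresh
// instance starts with an empty cache and must refetch.
class ToyAgent {
  private cache = new Map<string, Set<string>>();
  fetchCount = 0;

  async subnetKeys(canisterId: string): Promise<Set<string>> {
    let keys = this.cache.get(canisterId);
    if (!keys) {
      this.fetchCount += 1; // simulated network round trip
      keys = new Set(["node-key-1"]);
      this.cache.set(canisterId, keys);
    }
    return keys;
  }
}

async function demo(): Promise<[number, number]> {
  // Reusing one instance: one fetch, then cache hits.
  const shared = new ToyAgent();
  await shared.subnetKeys("canister-a");
  await shared.subnetKeys("canister-a");

  // A new instance per call: a fetch every time.
  let freshFetches = 0;
  for (let i = 0; i < 2; i++) {
    const fresh = new ToyAgent();
    await fresh.subnetKeys("canister-a");
    freshFetches += fresh.fetchCount;
  }
  return [shared.fetchCount, freshFetches];
}

demo().then(([sharedFetches, freshFetches]) => {
  console.log(sharedFetches, freshFetches); // → 1 2
});
```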

As for the errors, I just got back from PTO, and I’ll add them to my list of things to review!


So, if I re-create an HttpAgent every time I create an actor, I get zero caching at the moment, and the certificate is fetched for every query?

Just asking to be sure I get it right.


I mean, if every single query or update you make creates a new Actor and HttpAgent instance, then yes, you will have no caching benefits.

I don’t know exactly why someone would build their app that way - one Actor instance per canister seems pretty reasonable to me - but yeah, that’d be really inefficient.

edit: I think I misread. Subsequent queries from an Actor will use a cache of the subnet status as long as the keys for that canister ID have been fetched

Why not? It’s not documented anywhere that agent-js is not stateless and requires persistent state.

Thanks for the confirmation. It means it’s going to be quite some work to migrate any of the projects I’ve been involved in. We also have to think about whether and how we should enforce, at the ic-js level, that developers use a single instance of the HttpAgent.

It’s not documented anywhere that agent-js is not stateless

Fair point! My intuition as a dev is that classes generally hold some degree of state, but not everyone necessarily thinks that way!

Would you be interested in a set of purely functional exports to use instead of the Actor and Agent model?

that’d be really inefficient

I probably shouldn’t speak too glibly about this. My intuition is that it would be somewhat inefficient, more so now that we’ve introduced the cache. I do not have profiling that indicates that creating a new Actor for every request takes up a problematic amount of processing or memory in practice.


I personally would definitely prefer pure functions. The most important thing, I believe, is to have one single approach that is well-documented and that the entire community can refer to.

Well noted. Nevertheless, seems that treating HttpAgent as a singleton tends to be the expected pattern. I need to think a bit about it.
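One way to nudge callers toward a single instance is a module-level lazy singleton (a generic sketch, not an ic-js or agent-js API; createAgent here stands in for whatever constructor you use, e.g. new HttpAgent(...)):

```typescript
// Generic lazy-singleton sketch: every call returns the same instance,
// so per-instance state (like the subnet-key cache) is shared.
interface AgentLike {
  readonly id: number;
}

let constructions = 0;
let sharedAgent: AgentLike | undefined;

function createAgent(): AgentLike {
  constructions += 1; // stands in for `new HttpAgent(...)`
  return { id: constructions };
}

function getAgent(): AgentLike {
  if (!sharedAgent) {
    sharedAgent = createAgent();
  }
  return sharedAgent;
}

console.log(getAgent() === getAgent()); // → true: one shared instance
```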

Thanks for the explanations and feedback!

Sorry to interrupt, but in which file should we change the HttpAgent setting and replace the package tweetnacl with @noble/curves? I’ve come across the same problem. Thanks!

Linking the duplicate question of @JJ_2100: Invalid signature from replica - #7 by JJ_2100

If you use the latest release, tweetnacl is no longer a dependency

I just upgraded Azle to "@dfinity/agent": "^1.0.0" and now we're intermittently getting `Invalid certificate: Signature verification failed` errors in our tests.

Here are the tests; without digging in too deeply, they seem to happen randomly/intermittently. We are using dfx 0.16.1, and these are from our property tests, which essentially execute very basic query/update calls over and over again with random inputs. Is there some kind of issue with the icx-proxy or replica where the keys are thrown out after a while or something?

I’ve been rerunning those tests to try and see if it’s just a fluke, so some of those might now be passing. Here’s the main run which still has some of these tests failing: testing mac · demergent-labs/azle@be8406a · GitHub

Interesting - I have mainnet tests that fire off 20 queries in my e2e suite and I haven’t encountered this yet. I’ll have to scale those up, and I’ll also see what I can do to identify where this might be going wrong.
