After upgrading agent-js to v1.2 we started to observe the following error while fetching the ICP token balance:
Timestamp failed to pass the watermark after retrying the configured 3 times. We cannot guarantee the integrity of the response since it could be a replay attack.
Other tokens mostly work fine - we can usually fetch their balances without this error, but it occasionally shows up for them as well. The error seems to appear roughly 95% of the time for ICP and around 10% of the time for other tokens.
Maybe something is wrong with our latency expectations. How do we fix that?
Interesting! It may have something to do with the volume of transactions coming through the ICP canister. This is great feedback, and there are a few ways we can handle this.
Things you can do right now:
You could wait a second and retry the query if it fails the watermark check
You could set a higher retry count (the agent's retryTimes option) - see the sketch below for both
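A minimal sketch of both options, assuming the retryTimes option on HttpAgent and using a placeholder fetchIcpBalance query for your own balance call:

```typescript
import { HttpAgent } from '@dfinity/agent';

// Hypothetical helper (not part of agent-js): re-run an async query a few
// times, sleeping between attempts so a lagging node can catch up or a
// different node handles the next attempt.
async function queryWithRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only retry the watermark failure; rethrow anything else.
      if (!String(err).includes('watermark')) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}

// Raising the agent's built-in retry count (retryTimes defaults to 3):
const agent = new HttpAgent({ host: 'https://icp-api.io', retryTimes: 5 });

// Usage (fetchIcpBalance is a placeholder for your own balance query):
// const balance = await queryWithRetry(() => fetchIcpBalance(agent));
```

The string match on "watermark" is just a crude way to avoid retrying unrelated failures; match on whatever error shape your code actually receives.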
Things I can do:
Add additional / exponential delay to the retries (sketched after this list)
Test against the ICP ledger on mainnet and set a more reliable default
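For illustration, the extra / exponential delay between retries could look roughly like this; it's a sketch of the idea only, not the actual agent-js polling internals:

```typescript
// Sketch of exponential backoff with jitter between watermark retries
// (illustrative only; not the actual agent-js code).
async function backoffDelay(attempt: number, baseMs = 500, capMs = 8000): Promise<void> {
  const delay = Math.min(baseMs * 2 ** attempt, capMs);
  // Jitter avoids many clients retrying in lockstep under load.
  const jitter = Math.random() * 0.3 * delay;
  await new Promise((resolve) => setTimeout(resolve, delay + jitter));
}

// e.g. attempt 0 -> ~0.5 s, attempt 1 -> ~1 s, attempt 2 -> ~2 s, ...
```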
What is this "watermark protection against replay attacks / stale data"? Is it documented somewhere? I would like to understand what causes the error and what throws it. Is it the gateway, a boundary node, or the replica? And under what circumstances exactly?
I see two potential places where this problem could happen.
Here blsVerify is passed instead of the actual request. TypeScript doesn't catch that, because the request is of type any in the definition of pollForResponse.
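To illustrate why the compiler stays quiet (a simplified stand-in, not the real pollForResponse signature):

```typescript
// Simplified illustration, not the actual agent-js signature: because the
// request parameter is typed `any`, TypeScript accepts a function (the
// blsVerify callback) in that position, so the mix-up compiles cleanly.
function pollForResponseLike(request: any, blsVerify?: () => boolean): void {
  // ... polling logic would live here ...
}

const verify = () => true;
// Meant to be pollForResponseLike(someRequest, verify), but this compiles too:
pollForResponseLike(verify);
```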
This is an agent error. The check was introduced to stop stale data from getting through when its timestamp is older than the last known block the agent saw returned from a call.
This protects against both ordinary stale data and a malicious MITM replay attack. Since a node can fall behind, a valid canister signature may still come back, but we know the state may have changed in a more recent block.
In theory, another request or two with a slight delay should hit a different node, or allow the behind node to catch up.
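Conceptually, the check works along these lines; this is a rough sketch of the idea, not the actual agent code:

```typescript
// Conceptual sketch of the time watermark (not the actual agent-js code).
// The agent remembers the newest certified time it has seen from a call,
// and refuses read/poll responses certified before that time.
class TimeWatermark {
  private watermarkMs = 0;

  // Record the certified time from an update call response.
  noteCallTime(certifiedTimeMs: number): void {
    this.watermarkMs = Math.max(this.watermarkMs, certifiedTimeMs);
  }

  // Validate the certified time of a later read/poll response. A node that
  // has fallen behind can return validly signed but older state, so such a
  // response is rejected (and retried) rather than trusted.
  assertFresh(certifiedTimeMs: number): void {
    if (certifiedTimeMs < this.watermarkMs) {
      throw new Error(
        'Timestamp failed to pass the watermark; the response may be stale or replayed.',
      );
    }
  }
}
```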
Despite the security advantages of this feature, it has been leading to more client errors and degrading the user experience.
@timo also called this out to me. It's on my radar and important, but I have to get a couple of other things taken care of before I can investigate fully. It's possible this is happening more frequently under higher load, but I'll also hunt for a flaw in the logic.