Http_request: is there a cache somewhere? can it be busted?

Is there a cache somewhere or above canisters that implements http_request?

Can the cache be busted with a call or else?

Context

I have implemented my own storage to publish my users content. In a process, my app upload to the IC files such as robots.txt or index.html.

As a result and while testing, I often notice that the files I just uploaded are not updated straight away if I request these through the browser (incognito mode included). Spent quite some times this morning to debug the all process and can confirm that the content is correctly uploaded to my canisters.

Likewise just noticed the same effect while deleting content. Began to debug to understand why a file was still provided through http even though I had deleted it, find nothing. Finally 5-10 minutes later I tried to fetch it again and the file was successfully gone.

3 Likes

Have you tried setting cache control headers?

1 Like

Thanks. Yes, I do set HTTP headers information regarding cache such as max-age. I currently specify a one hour (max-age=3600) for the content, the index.html.

I can try to remove or shorten such header but, I reproduce my issue with incognito tabs. If close all tabs, even close the browser, reopen it, open a brand new incognito tab, then I land on some cached content too.

That’s why I am thinking there is another cache somewhere.

I think boundary nodes do some caching, or at least it’s on their roadmap to do caching.

Boundary nodes do cache query calls. The ttl is one second

2 Likes

Do you mind linking that code (if it’s open source)? Didn’t know it was so short.

The boundary node code is not open sourced. It’s on the roadmap along with the boundary node decentralization proposal

Currently there is no way for forcefully purging the boundary node caches.

The key for the cache is a hash of the cbor request, so a trivial way to invalidate is to generate a unique cbor request when contents change

Code for deriving the cache keys

2 Likes

It’s interesting, thanks for the feedback. The cache issue I notice, least more than a 1 second, like a couple of minutes :man_shrugging:.

Hello Peter.

I would recommend adding a cache buster to your query. Any random data, will do, or in this case I think the timestamp would be fine. This will ensure that the query does not match any historical queries, so won’t hit a cache.

Best wishes, Max

That can / would work for some assets I am holding in the canister (like images), good idea @bitdivine thx :+1:

To the contrary, not sure that it can be applied for the rest of my use case I would say spontaneously.

Users are publishing content, like blog posts or presentations, in their canisters. This material is shipped as index.html and then served through urls such as https://zhx6p-.....w.ic0.app/d/converting-svg-to-image-in-javascript.

So adding a token to the queries would also mean having to update the urls everywhere once shared.

I agree that this seems problematic for queries made via http_request and icx-proxy.

Does anyone know if update calls behave differently?

I’m wondering if that could be a way to work around this, albeit a more expensive one.

It would rely on this PR though: feat: upgrade HTTP calls upon canister's request by paulyoung · Pull Request #6 · dfinity/icx-proxy · GitHub

1 Like