Boundary node HTTP response headers

Nope. They block the header for some reason.

I thought every boundary node (that every request must go through) runs icx-proxy. Can you clarify what you mean by a “regular IC subdomain”?

I think @lastmjs is referring to boundary nodes, whereas I think @skilesare is saying that Origyn is running their own instance of icx-proxy somewhere.

Yes, this is what I mean. Although I’m still confused about whether, when we run icx-proxy and point it at ic0.app, we are ourselves using a boundary node. Are we proxying to a proxy?

Which command line flags do you use when calling icx-proxy?

I assume --dns-alias, hopefully not --fetch-root-key, but what do you provide to --address?

--replica "https://ic0.app" --address 0.0.0.0:5000

Sorry, yes, I meant --replica, not --address.

Now I’m really confused :thinking:

My understanding is that the DNS record for ic0.app points to IP addresses of boundary nodes (the closest one by geographical proximity).

I didn’t know you could bypass a BN.

I don’t know that you can…that is what I’m asking. I don’t know that I’m bypassing it. What I do know is that if I use my own proxy then the chunk requests aren’t stripped off.

On the use of icx-proxy:

The requests which are CBOR encoded do not go through icx-proxy.
HTTP requests are not CBOR encoded and require protocol conversion (HTTP<>CBOR). This conversion is the main job of icx-proxy. icx-proxy’s services can be invoked on the boundary node or in the service worker. You can run any number of reverse proxies in series, each adding value-added services. @skilesare is using his own icx-proxy in proxy mode to serve content, but from a different domain, nft.origyn

Let’s list the types of requests handled by the Internet Computer (IC) and go through the path each request takes.

Invariant: canisters ultimately understand only CBOR[1] encoded requests. Talking to them using any other protocol requires marshaling and unmarshaling to and from that protocol, e.g. HTTP<>CBOR.

Types of requests

1. CBOR encoded requests. (No ICX proxy)

 These requests interact with the canisters by sending an HTTP POST request to REST endpoints of the form

  http://<canisterid>.ic0.app/api/v2/canister/<canisterid>/[call | query]

These requests originate from stand-alone non-browser clients like dfx and quill, which use the underlying agent-rs or agent-js libraries. Handling these requests is relatively simple. An HTTP POST request with CBOR-encoded data as the request body is sent to the closest boundary node resolved via ic0.app. The boundary node forwards the message to the destination replica. Thus, in this path there is no need for the boundary node routing infra to invoke icx-proxy, as the request/response is already in the IC-native CBOR format.
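As a rough illustration of this path, the sketch below builds (but does not send) the kind of POST request an agent would issue. It assumes the `/api/v2/canister/<canisterid>/<call|query>` path layout and the `application/cbor` content type from the public IC interface specification. The helper name `build_ic_request`, the placeholder canister ID, and the dummy body are hypothetical; a real agent library produces the signed CBOR envelope itself.

```python
from urllib.request import Request


def build_ic_request(canister_id: str, kind: str, cbor_body: bytes,
                     host: str = "ic0.app") -> Request:
    """Build (without sending) a POST that carries an already CBOR-encoded body."""
    if kind not in ("call", "query"):
        raise ValueError(f"unsupported request kind: {kind!r}")
    # Assumed path layout; check the IC interface specification before relying on it.
    url = f"https://{host}/api/v2/canister/{canister_id}/{kind}"
    return Request(url, data=cbor_body,
                   headers={"Content-Type": "application/cbor"},
                   method="POST")


# Dummy bytes (an empty CBOR map) stand in for the envelope an agent would produce.
req = build_ic_request("aaaaa-aa", "query", b"\xa0")
print(req.get_method(), req.full_url)
```

No icx-proxy appears anywhere in this sketch because the body is already CBOR by the time it leaves the client.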

2. Non-CBOR encoded request. (ICX proxy or service worker path)

You can interact with canisters using any legacy protocol. One of these protocols is HTTP(S). An HTTP request looks like a text block of the form

GET /index.html HTTP/1.1
Host: www.<canisterid>.ic0.app

Such HTTP requests originate from browsers. Now the HTTP request above has to be converted to CBOR before sending it to the replica (see invariant above), and the CBOR response has to be marshaled back into the HTTP response on its way back to the browser. These non-CBOR requests are further categorized into 2 types.

2.a Non-CBOR plain request. Example: https://canisterid.ic0.app/index.html

These plain non-CBOR requests (aka HTTP requests) are intercepted in the browser itself by the service worker. The service worker converts them to CBOR, and then the CBOR request is handed to the boundary nodes. After this, the path for the request is the same as for CBOR requests.

2.b Non-CBOR raw request. Example: https://canisterid.raw.ic0.app/index.html

These non-CBOR requests bypass the service worker and go on to the boundary node as HTTP requests. Here the boundary node handles the CBOR<>HTTP conversion by employing a locally run icx-proxy. That is, each boundary node runs an icx-proxy locally.
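The split between 2.a and 2.b can be summarised as a toy classifier over the hostname. This is only a sketch of the routing rule described above, not how boundary nodes are implemented; the function name `conversion_point` and the returned strings are made up for illustration.

```python
def conversion_point(host: str) -> str:
    """Say where the HTTP<>CBOR conversion happens for a given hostname."""
    labels = host.lower().rstrip(".").split(".")
    # A "raw" label after the canister ID marks the raw path,
    # e.g. <canisterid>.raw.ic0.app.
    if "raw" in labels[1:]:
        return "icx-proxy on the boundary node"
    return "service worker in the browser"


for host in ("canisterid.ic0.app", "canisterid.raw.ic0.app"):
    print(host, "->", conversion_point(host))
```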

Other interesting interactions.

IC local development uses an icx-proxy that routes requests to the locally run replica. Here, locally means the developer’s laptop :slight_smile:

@skilesare scenario - Any of the above-mentioned paths can be augmented by placing yet another icx-proxy in front of the boundary nodes.

We are constantly trying to simplify the paths, but we are limited by past decisions, which dictate that certain paths be maintained for backward compatibility. Some of the raw interactions also exist because of certificate verification limitations, such as the lack of streaming in the service worker and memory restrictions in the browser that prohibit certificate checks for large assets.

[1] CBOR - Wikipedia


Ok…my guess is that this is what is happening when I request https://nft.origyn.network/-/nftforgood_uffc/-/ogy.nftforgood_uffc.1

  1. The browser reaches nft.origyn.network and is answered by the icx-proxy running there.
  2. The icx-proxy there converts the request to CBOR and sends it to canister.ic0.app as specified by our internal translation table. Since our icx-proxy does not strip any headers, the range request header gets encoded into the CBOR.
  3. The boundary node that answers asks the IC for the query.
  4. A node answers.
  5. The boundary node sends the CBOR-encoded data to the icx-proxy.
  6. The icx-proxy converts it back to an HTTP response and sends it to the browser.

:slight_smile: Yes, but I wouldn’t count on that working forever.

All icx-proxy copies come from the same code base, so there is no reason to believe that the icx-proxy run on nft.origyn behaves any differently from the one that runs on boundary nodes.

Any behavioral difference is accidental and not intentionally engineered. So it may break as things reconcile.


Also looks like this is not doing range queries
https://nft.origyn.network/-/nftforgood_uffc/-/ogy.nftforgood_uffc.1

Most likely the icx-proxy on nft.origyn is downloading the entire video. Can’t say for sure, but try with a very large video of 1 GB or so. I think it will not start streaming to the browser until it has downloaded the whole video onto the nft.origyn server.

IMHO nft.origyn is not solving the streaming issue. Please try with a large video asset. With range queries working correctly, streaming should start immediately.
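One way to check this is to send a GET with a header such as `Range: bytes=0-1023` and inspect what comes back: a server that honours the range answers `206 Partial Content` with a `Content-Range` header, while one that ignores it answers `200` with the full body. The sketch below only interprets a status code and headers, so it runs offline; the helper names and the sample values are hypothetical.

```python
import re
from typing import Optional, Tuple

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Parse 'bytes first-last/total' into (first, last, total); total may be unknown."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def served_as_range(status: int, headers: dict) -> bool:
    """True if the response is a 206 carrying a well-formed Content-Range header."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return status == 206 and parse_content_range(lowered.get("content-range", "")) is not None


# Sample responses (made-up values): the first honours the range, the second ignores it.
print(served_as_range(206, {"Content-Range": "bytes 0-1023/146515"}))
print(served_as_range(200, {"Content-Length": "146515"}))
```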

Thank you for the great explanation.

A couple of questions:

  1. It seems like there is no way to directly talk to replica nodes. Everything must go through boundary nodes, even when the request is already CBOR-encoded. Is that right?

  2. Why does the service worker handle the CBOR encoding/decoding for non-raw requests but can’t do so for raw requests?

  3. How do boundary nodes discover the IP addresses of replica nodes? Where do they look up the mapping from canister ID to replica IPs?


Pretty sure they query the NNS, which keeps a list of all the IC’s canisters/nodes/IPs.

What happens if I do that query myself and directly call the replica? Is that a security risk?


If this is a fork, why would it change unless we wanted it to?

Also looks like this is not doing range queries

Try with Safari. It is the one that is pissy about range requests, and it was hell to get it working. With Chrome you can just stream the file down, and that is what we do at the moment.

I think the issue here is that we are using our own NGINX proxy and we’re not stripping headers, so they make it through to our icx-proxy, which then turns the request into CBOR and forwards it along. The regular boundary nodes have NGINX set up to strip the range request header. The range response is allowed.
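To make the difference concrete, here is a toy model of the two hops. It is not the actual NGINX configuration on either side, just a sketch that assumes the only difference is whether the `Range` header is dropped before the request reaches icx-proxy; `forward_headers` and the sample header values are hypothetical.

```python
def forward_headers(headers: dict, stripped: frozenset = frozenset()) -> dict:
    """Return the headers a proxy hop passes on, dropping any named in `stripped`."""
    return {name: value for name, value in headers.items()
            if name.lower() not in stripped}


incoming = {"Host": "nft.origyn.network", "Range": "bytes=0-1023"}

# Assumed boundary-node behaviour: the Range header is stripped before icx-proxy sees it.
via_boundary_node = forward_headers(incoming, frozenset({"range"}))
# Assumed behaviour of our own proxy: nothing is stripped, so Range reaches icx-proxy.
via_own_proxy = forward_headers(incoming)

print("Range" in via_boundary_node, "Range" in via_own_proxy)
```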

We are working on certifying chunks so they come back through with certified headers.
