Content Filtering via Boundary Nodes

Hi Folks,

We saw a few comments on social media about how content filtering on boundary nodes work, both its design intent and current implementation. There is a bit of information to share, so I think it’s best to not be clever or fancy, but start with the most concrete user experience and grab the bull by the horns.

This (very long) post has three sections:

  1. From a User’s POV
  2. How it works now & How it we think it should eventually work
  3. Specific Lessons from Canister vrnyn-miaaa-aaaao-aaaba-cai

I appreciate everyone’s input and I expect to get questions and feedback, all welcome, of course!

13 Likes

1. From a User’s POV

Some users in the US and Switzerland went to https://vrnyn-miaaa-aaaao-aaaba-cai.ic0.app and saw “HTTP 451 response code - Unavailable For Legal Reasons”:

Fig 1.

What exactly happened here? Why does the canister URL show this page? Why does it link to a “Code of Conduct” for boundary nodes? How are boundary nodes different from consensus nodes?

I will use this particular case for informational purposes and then broaden the information:

In simple, user-facing terms:

  1. When a user inputs https://vrnyn-miaaa-aaaao-aaaba-cai.ic0.app into the browser, the browser asks its nearest boundary node, “please serve the content from this canister”.

  2. Typically, the boundary node knows “canister vrnyn-miaaa-aaaao-aaaba-cai lives on subnet X, lets serve the content living in subnet X”.

  3. Before the boundary node serves the content to user, the boundary node checks two things:

  4. If the canister in question has any restrictions for geography

  5. Where the user’s request comes from

(Why the BN does this is a very important point we will return to in part #2 below)

  1. In the case of canister vrnyn-miaaa-aaaao-aaaba-cai , the BN saw that this canister had been flagged as one that should not be served to US or Swiss users. So requests from US or Switzerland are served the page in Fig 1. Users who visited the canister outside those locations or who used VPN were served the canister’s UI from the Boundary node.

  2. Slightly more technical…Not only was the BN not serving US and Swiss users the website UI from vrnyn-miaaa-aaaao-aaaba-cai , the BN also did not serve the canister’s API endpoints. This had the consequence that the US-based developer who created the canister could not access their own canister’s cycles (if developer used a VPN or was located elsewhere their experience would be different). Please note that this API denying is being changed for future cases (see question/answer 3.5).

12 Likes

2. How it works now & How it we think it should eventually work

The first thing to note is that this experience is the current state of the world. The current user and developer experience are a mix of:

  1. short term implementations that are knowingly imperfect
  2. long term design needs necessary for the health of the IC.

In short, there is a gap between the long term design and the implementation. Let’s tease those apart.

2.1 What is Content Filtering?

Content Filtering is when a Boundary Node does not serve requests relating to a canister. The canister’s code and state remains on-chain and intact.

Examples of content filtering:

  • BN in Germany does not serve a gambling dapp to Switzerland-based users because online gambling is illegal in Switzerland, but it does serve the gambling dapp to Netherlands-based users where online gambling is legal.
  • It does not matter whether the BN is in Germany, Switzerland, Netherlands, Mexico, India, US etc… What matters is where the user requests are from.

Content Filtering does not:

  • Delete or modify the canister in question. It remains on-chain with its state preserved.

2.2 Why does the BN care about what content it serves and to whom (e.g. “content filtering”)?

There are two levels to this question:

  1. Why the IC needs content filtering in the first place
  2. Why does the content filtering live at the boundary node level

Let’s go into each of these:

2.2.1 Why the IC needs content filtering options

There are detailed threads on content filtering such as in this forum post, but here is a slight simplification:

In today’s world, servers (or data centers) get takedown notices from entities (e.g. countries, companies, etc). If a server does not comply, the server is typically removed from a data center and/or the domain in question is blocked by ISPs. For example: Swiss ISPs may block a server in the US if it considers it problematic (e.g., because it is a gambling site)… eliminating access to Swiss users of that server. Rather, than naively have an “all or nothing” strategy for content on the infrastructure edge of the IC, it is healthier to address the problem directly by:

  • Enabling boundary nodes to choose what content they serve. This ensures that the decision to comply (or not) with the local laws belongs solely to the boundary node operators. The ones that choose to comply may have higher probability to remain online or accessible in the environment they are in or for the communities they want to serve. This maximizes the number of servers online and number of users getting access to IC content.
  • Making it easier for people to run their own servers so that there are enough variety of policies and nodes that users can find access to the canister of their choice. This maximizes the probability of any given canister being served.

2.2.2 Why does the IC’s content filtering live at the boundary node level

As mentioned in this post from February 2022, the design intent is (emphasis my own):

At a high level, the plan is to make boundary node operators responsible for deciding whether to forward requests to specific canister smart contracts. This will enable them to filter access to canisters, for example in response to takedown notices that legally oblige them to stop serving content or face sanctions and fines. Since boundary nodes generally serve specific geographies and jurisdictions, this makes it possible that canisters will be accessible in some places, but not others, depending upon where legal action occurs. Having boundary nodes limit access to canisters, on a case-by-case basis, is seen as superior from a decentralization perspective to having node operators attempting to remove or freeze canisters by submitting proposals to the Network Nervous System (a special DAO that automates the management and governance of the Internet Computer blockchain network by the community).

Each operator of boundary nodes will be responsible for defining their own policy and practices. As a boundary node operator, DFINITY is developing its own policy, which will be made public in the coming days and defines in what circumstances it may decide to refuse to route HTTP requests to a canister on the blockchain. Currently, while subnet blockchain node machines are run by independent node providers from data centers around the world, boundary nodes are run exclusively by the DFINITY Foundation, which has allowed it to help protect the blockchain.”

Since then, DFINITY’s Boundary Node policy has been made public.

Worth noting that post is a result of a significant brainstorming with the IC community where the community proposed choosing Boundary Nodes as the way to handle Content Filtering. This was what the IC community wanted to do (as a matter of fact it was against DFINITY’s original proposal).

2.3 What are the main differences between the current implementation and the long term design?

The boundary node roadmap goes into further detail, but I will lay out the basics:

Soon, “boundary nodes” will be split into two::

  1. “HTTP gateways”: provide an endpoint to the users that terminates TLS, serves the service worker and translates the users’ HTTP requests to API canister calls. Anybody will be able to run any HTTP gateway they want without needing any NNS proposal. These nodes will NOT be rewarded with ICP.

  2. “API boundary nodes”: provide an endpoint that handles API canister calls by routing them to the correct subnet and replica node, and provides caching and rate-limiting to protect the IC. These nodes will be allowed into the IC via NNS proposal and will be rewarded with ICP for their work (like consensus nodes).

A clear and immediate consequence of this upcoming work: If this were complete, in the recent example of canister vrnyn-miaaa-aaaao-aaaba-cai, even if DFINITY’s HTTP gateways did not serve canister vrnyn-miaaa-aaaao-aaaba-cai to US or Swiss users, the canister developer (or anyone really) could spin up their own HTTP gateway and serve the content vrnyn-miaaa-aaaao-aaaba-cai themselves.

2.4 Which entities run Boundary Nodes currently?

While consensus nodes are fairly decentralized (with increasing number of nodes and node providers), as of January 2023, only DFINITY runs BNs. Worth stressing again that the BNs do not affect anything on-chain.

The reason has to do with security: until the boundary node roadmap to replace “boundary node” architecture is complete, it is not secure (yet) to have more entities running boundary nodes (I am skipping the technical details for simplicity).

And yes, DFINITY as the only entity running BNs is an intermediate phase that will soon be over (ETA Q2 2023) and anyone in the community will be able to run their own HTTP gateways or API boundary nodes

2.5 How will more “HTTP gateways” and “API Boundary Node providers” help?

Few obvious benefits:

  1. The thesis is that in a world with hundreds or thousands of HTTP gateways providers, if a user is looking for canister X, there will be some HTTP gateway provider that is willing to serve that canister (perhaps it’s the canister’s developer themselves).

  2. As an additional measure, users will be able target directly an API boundary node with an IC-native client (e.g., browser extension), which doesn’t exist yet, but could be built by any community member. Using a browser extension, one will be able to access a canister under icp://canisterid.

  3. iOS or android apps will be able to directly communicate with the API boundary nodes, so they will not need to trust the gateways nor the boundary nodes but can verify canister data directly using the ICP public key.

2.6 If a canister developer does not trust other HTTP Gateway providers to serve a canister, would it not be easier for the developer to run their app on a single server instead of deploying to the IC and run their own HTTP gateway?

Simple: a canister developer can rely on the tamperproof qualities of the IC and its consensus… while relying on their own HTTP Gateway to guarantee access.

As an example, users may trust a lottery or defi smart contract on the IC because it is executed on multiple consensus nodes, but they may not trust it if it ran on one machine entirely in control of the developer.

2.7 How close is the IC to the new architecture?

As of January 2023, the Boundary roadmap is here and the IC is months (not weeks, not years) from having the new architecture composed of HTTP gateways and API boundary nodes.

Q1 2023:

  • The team starts implementing the new boundary node architecture that decomposes boundary nodes into “HTTP gateways” and “API Boundary Nodes”.

Q2 2023:

  • New architecture with HTTP Gateway and API Boundary Nodes is shipped

Q3 2023

  • Architecture will be managed by the NNS (API Boundary nodes are approved by NNS DAO).

2.8 If I want to create a dapp that is always accessible no matter the policies of BNs, what do I have to do? Are there any temporary solutions?

The obvious answer is “wait for the new architecture that does it right” then a developer can roll out your own HTTP Gateway (Or you can also provide an IC-native client that directly targets the API Boundary Nodes). Then you would not need to rely on anyone but yourself. In fact… unlike consensus nodes, which are rather beefy machines running in data centers to keep up with consensus, the design intent is that HTTP Gateways be so simple they could run from people’s homes.

However, the DFINITY R&D has come up with a few temporary workaround (and are looking into some more):

  1. A user could use something like a browser-extension. One can always directly target a boundary node with an IC-native client (e.g., browser extension), which doesn’t exist yet, but could be built by any community member.
  2. A developer who wants to go the extra mile could use IC Front (GitHub - dfinity/icfront ) and set up a custom domain so they do not go through ic0.app (DSCVR does this for example).

And we are looking into others…

12 Likes

3. Specific Lessons from Canister vrnyn-miaaa-aaaao-aaaba-cai

Even though the architecture will change soon, there are some lessons worth sharing about this canister because they help illustrate the way Internet infrastructure (not just ICP) works.

Now, I want to be clear:

This canister was flagged as a “gambling dapp”, but the canister’s developer has said the dapp is just a static website (which is reasonable), so we in DFINITY have reached out and asked the developer to formally appeal, and I have as well. To be fair to the developer, I will talk about “gambling dapps” further in these questions, but I do not mean to contradict that it is a static page. I am using the example for illustration or explanatory purposes of what happened.

3.1 How did vrnyn-miaaa-aaaao-aaaba-cai violate DFINITY’s boundary node policy?

It was flagged as “online gambling” so the standard DFINITY process is to review the dapp and make sure that users from the US and Switzerland are not served the content.

3.2 Is there a way for people to appeal to DFINITY that there was a mistake?

Absolutely! And people do! DFINITY’s intent is to have a policy that protects the IC from ISP blocking while we work on the new architecture, not police Web3.

In fact, we encouraged this particular developer to appeal.

But this is Web3, so the intent is that ANYONE can appeal for anything (not just the controller of a dapp).

(For those wondering, why we ask a developer appeals, one reason is because yes we have seen the social media and other communications the developer has, but we are deliberately not counting tweets as what the developer’s intent is. We want to give the developer a chance to provide their official stance, to give them the best opportunity)

3.3 How much does the location of a BN matter?

It honestly matters less than one may assume in some cases.

For example:

Suppose a user is in Switzerland (where online gambling is illegal) and their closest BN is in Malta (which allows online gambling). Even though online gambling is legal in Malta, it’s common practice that a server decides on what to serve based on where the user is,not where the server is.

This may sound counterintuitive so it’s easiest to explain why by assuming the Malta BN owner thinks, “I don’t care what Swiss law is, it’s legal in Malta so my machine will serve Swiss users.” The BN owner will probably be right that Swiss law has little to no reach for them, but in practice it’s possible that Swiss ISPs decide to block all content from that machine (thereby blocking all Swiss users from that machine). If we extend this a bit further into the IC, imagine how it would affect the IC how an “all or nothing” strategy would affect it: if BNs could not choose what to serve, they may find themselves being inaccessible to ISPs all over the world.

This is why this tweet by a user has a bit of an accidental red herring:

  • It is a red herring that the user is seeing their nearest BN is in Nevada (a state that allows online gambling). In practice, what matters is where the user is at. A BN (or any machine) in Nevada that serves online gambling indiscriminately to users in places where it is illegal, may find itself blocked entirely by other ISPs.

3.4 There are mentions of countries, but not of regions like states, provinces, cantons, etc… could BNs or HTTP Gateways be more fine grained?

BNs operated by DFINITY are currently only granular at the country level when it comes to deciding what content to serve. Hence users from a US state like Nevada (where online gambling is legal) are not being served gambling dapps. .

Could it be more fine grained? Absolutely. It’s just more work, but very possible.

3.5 Why did the developer of the vrnyn-miaaa-aaaao-aaaba-cai canister say they lost access to the cycles of their dapp?

Because when a canister is not served by a BN, it does not only deny access to the UI, it also denies access to the underlying APIs. To be honest, boundary nodes do not need to do this and DFINITY wants to filter as little as possible so DFINITY submitted a pull request to change this, based on this particular experience.

3.6 Why does DFINITY’s “code of conduct” for its BNs care about online gambling at all?

Simply, DFINITY sees itself as a steward of the IC so it has a very conservative attitude to make sure that the IC is not blocked by local ISPs before the new architecture is rolled out. Once the new architecture is rolled out, and others join as HTTP gateways and API Boundary Nodes, I expect to see diversity in risk calculus across owners.

19 Likes

This is a great post Diego!

Thank you for sharing this in the forum! (RT’d on Twitter)

I’m going to share the Twitter Space link (just discussed today)

We had a great discussion with leaders in the IC Ecosystem! Thank you to Diego, Andra, Jan, Lomesh, Jordan Last, theguy.icp, IC Biebert (Aaron) and my Co-Host Daniel for taking the time to speak on this topic!

The Swop Twitter Space - IC Boundary Nodes & Censorship

4 Likes

Wow epic post. If only you had these answers last year when all the maxis were popping off about how we shouldn’t ban mario cart. Such an elegant design. Bravo.

2023 is the year of the IC!! Ahoy Diego!!

6 Likes

Will Boundary Node providers need to get paid more than typical consensus node providers? It seems like quite a bit of work to be a boundary node provider with the need for constant monitoring if canisters are violating some policy. Once the IC replaces AWS how can BN providers reasonably scan the entire IC for canisters that are violating their policy? Even for the most conservative BN providers it feels like a never ending game of whackomole.Over and above this, as a conservative BN provider I would need to educate myself as to all the laws all around the world!

@diegop Since the Http gateways are being split off, what protocol will the API Boundry Nodes use for API calls? Will it still use HTTP/CBOR or something different?

Few users in the US and Switzerland who cannot see the canister asked that I share what the website looks like, so a friend in India was able to screenshot it for me.

Here is what users outside the US and Switzerland would see:

2 Likes

I am not aware of any change, so I have asked @raymondk (Sr. Engineering Manager of the boundary nodes team) to reply.

2 Likes

What about countries where you need a gambling license in order to operate?

Hi @Gekctek, there are some details in this boundary node roadmap forum post: Boundary Node Roadmap

The short answer is that the API boundary nodes will continue using the current http/cbor API that is used today to talk to the ic0.app/api endpoint.
The http gateway’s primary function will be to terminate TLS and serve the service worker.

4 Likes

@dfisher Clarification…when you say “boundary nodes”, do you mean “http gateways” (which wont get remunerated) or “API Boundary nodes” (which are remunerated)?

I assume the latter.

But to answer your question, I am not aware what the remuneration table for those will be. I expect some supply/demand to be a part of it, but thats me speculating.

For http gateways, i expect many people to run them for their own access… but if there are not enough, I would also expect the design is revisited and iterated upon.

1 Like

agreed! Great Twitter Space

1 Like

For the non technical, do you mind explaining the difference EL5 between http gateways and API Boundary Nodes?

Do you mind also explaining how it will be possible to keep up with all the illicit canisters? Is that a difficult problem to solve?

2 Likes

This article is great. Sorry for the trouble. I know this project is heading in the right direction. Also I was able to use dfx without a vpn this morning thanks to your upgrades. I’ve also cleaned up the canister in question to avoid any future problems.

2 Likes

Fwiw, I am a strong believer that cases like this allow for a strengthening of the network. Allows community to come together and improve the network. My classic example is standup comedians. I believe standup comedians get INSTANT feedback. Not laughing is feedback.

The best standup comedians are borne from thousands of joke iterations, so you can always tell the ones who worked at small clubs all the way to the HBO special. Each set and joke that bombed is another opportunity for iteration.

Similarly, each case where community has to decide improves the network for the big time.

Thank you for taking the time to layout the issue.
Presently, a BN has implemented enforcing the national policy (e.g. no gambling apps accessible to Switerland and US). You have said this architecture was selected to protect the blockchain.

If the number of BN increases (e.g. home boundary nodes) there will be a wide view of the risk calculus. Meaning, some one will permit serving the app. Would this not mean that the block chain would be put at even greater risk?

Good question. In the new architecture, the risk would be on the HTTP Gateways being blocked themselves. The intent of the design is that there CAN BE many risk calculi, but maximizing access to the blockchain* is not at risk. If a HTTP Gateway were blocked, more would be spun up trivially.

The current BN architecture implements a filtering policy that permits the IC blockchain to operate and keep regulators happy because it implements a single policy that complies with state regulation/law.

If multiple BNs serving a region do not implement the same policy the regulators will quickly realize that engaging a BN is ineffective. It is likely that they will then pursue other options that you were attempting to avoid.

1 Like