Same for me here.
yes, we are currently experiencing a surge in traffic that is causing the dashboard and other dapps hosted on the IC to load slowly. The IC is working as always, but the boundary nodes cannot keep up with the increased traffic and route all requests to the subnets. We are working on improving the performance and increasing capacity.
From my side it looked like a complete Major Outage for about 10 minutes and before this services became barely responsive(dashboard, Internet Identity auth wasn’t working - timeout)
We need to address this asap, brings bad reputation…
Could DFinity team elaborate a bit and provide some technical details so we as a community could get a better grip on what’s going on?
This is NOT the first time, I’ve seen at least 4 partial outages for the past 2 months…
Absolutely! Usually the team publishes a postmortem with all the details after the incident is resolved and all the root causes are understood. The first priority right now is to fix and then discuss later.
You can find some previous postmortems by searching “incident retrospective” in this forum.
Is this why I have been unable to get on the nns for the past two day? Every time I try to log in it just loads forever.
This issue has been resolved if it persists kindly follow on Twitter to rectify this (@Tiannreed)
I do not have a twitter
Nns is still just loading for me
This is an unrelated issue that has been identified and a fix is on the way as posted here:
To maintain its reputation for reliability, the number of boundary nodes in the $ICP should be increased tenfold, from the current count of 20 to 200. This enhancement is crucial because if users are unable to access services on the chain, they will perceive the $ICP as unreliable, regardless of whether the chain itself is functioning properly.
I recognize the rationale behind optimizing based on usage. Nevertheless, the boundary nodes represent a significant vulnerability. Whenever there’s a surge in usage, issues emerge due to the boundary nodes’ inability to cope with the influx of visits.
Imagine a “world computer” while assuming that a mere 20 “webservers” will be adequate for everyone. This misconception must be rectified prior to any significant adoption.
Beyond subnet rental, an alternative model could involve services exclusively renting a stack of boundary nodes, thus enhancing the reliability of their service.
The most straightforward method to disrupt ICP is by DDoS attack on its boundary nodes.
I agree with you. This is a preventable problem. Let’s fix it.
Yea, having just 20 boundary nodes is kind of hilarious tbh
Even a minor DDOS can kill such a small network easily
I’m also curious what happens to the UX when some boundary nodes become overloaded, is there some kind of balancer in front of boundary nodes to exclude overloaded/dead boundary nodes?
Is there already an official explanation of the incident?
You are right about this and there has long been an awareness of the limitations of the current configuration of IC boundary nodes which has been in place since genesis of the IC mainnet. There has been an ongoing project by Dfinity to rebuild the boundary node architecture and topology as documented in the Boundary Node Roadmap thread. A recent summary of progress was given in November:
So over the next few months we should have new ic-boundary API Nodes being auto-deployed across IC datacentre locations around the world by NNS proposals utilising spare IC Node hardware servers already registered with the NNS. This will expand coverage across the internet for accessing canisters using secure and decentralized API services.
And ic-gateway services which convert HTTP internet requests into ic-boundary API calls will be hosted by node providers, project teams or individual developers and users on their own computers or virtual servers.
The new architecture for these IC boundary services will both improve decentralization and offer much more flexibility to setup more public, protected and private entry points to the Internet Computer. So a solution to blocking or avoiding DDOS attacks and heavy load peaks is on the way soon.
@rbirkner feel free to correct me if anything above is incorrect or misleading