Looking for Better Rate Limiting / DDoS Protection Strategies on ICP

Hey all,

I’ve implemented a basic rate limiter on my canister with a 1h window and a 12h block if the limit is exceeded — but it has issues:

  • It’s per-function, so users can abuse other functions.
  • Users can wait out the window and burst requests again.
  • Registering with new principals bypasses limits.
  • Limiter is called late, so attackers can still burn cycles.
  • Restrict map grows but entries aren’t cleaned up.

I’m looking for better alternatives:

  • Any patterns/tools you’d recommend to stop spam or DDoS? I’m specifically looking for something like an API rate limiter.

  1. Token Bucket: MMOs and Micro-Burst Fairness
    What it is: Imagine a bucket that fills at 1 token/minute. Each request costs a token. If your bucket has 10 tokens, you can burst 10 requests — but it refills slowly.

Why it works: Allows occasional burst while enforcing long-term fairness.

Web2 Example: MMOs like WoW throttle chat messages this way — you can spam a bit, but then cooldown kicks in.

Apply to IC: Track tokens + last refill timestamp per principal/IP. Smooths traffic and discourages burst spam.
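
A minimal sketch of the token bucket described above, in plain Rust with only the standard library. The names (`TokenBucketLimiter`, `try_acquire`) are made up for illustration, the caller is keyed by a string such as the principal's text form, and the current time is passed in as nanoseconds (inside a canister you would take it from the system time API).

```rust
use std::collections::HashMap;

/// Per-caller state: tokens left and the time the bucket was last refilled.
struct Bucket {
    tokens: u64,
    last_refill_ns: u64,
}

/// Hypothetical token-bucket limiter keyed by a caller id.
pub struct TokenBucketLimiter {
    capacity: u64,           // maximum burst size
    refill_interval_ns: u64, // one token is added per interval
    buckets: HashMap<String, Bucket>,
}

impl TokenBucketLimiter {
    pub fn new(capacity: u64, refill_interval_ns: u64) -> Self {
        assert!(refill_interval_ns > 0, "refill interval must be non-zero");
        Self { capacity, refill_interval_ns, buckets: HashMap::new() }
    }

    /// Spends one token and returns true if the caller may proceed at `now_ns`.
    pub fn try_acquire(&mut self, caller: &str, now_ns: u64) -> bool {
        let capacity = self.capacity;
        // New callers start with a full bucket.
        let b = self.buckets.entry(caller.to_string()).or_insert(Bucket {
            tokens: capacity,
            last_refill_ns: now_ns,
        });
        // Add one token for every whole interval that has elapsed.
        let intervals = now_ns.saturating_sub(b.last_refill_ns) / self.refill_interval_ns;
        if intervals > 0 {
            b.tokens = b.tokens.saturating_add(intervals).min(capacity);
            b.last_refill_ns += intervals * self.refill_interval_ns;
        }
        if b.tokens > 0 {
            b.tokens -= 1;
            true
        } else {
            false
        }
    }
}
```

With `TokenBucketLimiter::new(10, 60_000_000_000)` a caller can burst 10 requests and then gets one more per minute, which matches the numbers in the description.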

  2. Sliding Window: Netflix-Style Consistency
    What it is: Instead of resetting at exact intervals (e.g., every hour), track timestamps of recent requests and count how many happened in the last 60 minutes.

Why it works: Prevents bursts right after a reset.

Web2 Example: Netflix monitors API usage per developer via a sliding log. Keeps rate smooth and fair even if usage is irregular.

Apply to IC: Keep a list of timestamps per principal and prune on access. Slightly more storage cost, but way more precise.
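
The sliding log can be sketched like this. It is an illustration, not a drop-in implementation: `SlidingWindowLimiter` and `allow` are hypothetical names, callers are keyed by a string id, and time is passed in as nanoseconds.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical sliding-window limiter: keeps a log of accepted request
/// timestamps per caller and prunes it on every access.
pub struct SlidingWindowLimiter {
    window_ns: u64,
    max_requests: usize,
    log: HashMap<String, VecDeque<u64>>,
}

impl SlidingWindowLimiter {
    pub fn new(max_requests: usize, window_ns: u64) -> Self {
        Self { window_ns, max_requests, log: HashMap::new() }
    }

    /// Returns true (and records the request) if the caller has made fewer
    /// than `max_requests` accepted requests in the last `window_ns`.
    pub fn allow(&mut self, caller: &str, now_ns: u64) -> bool {
        let stamps = self.log.entry(caller.to_string()).or_default();
        // Drop timestamps that have slid out of the window.
        while let Some(&oldest) = stamps.front() {
            if now_ns.saturating_sub(oldest) >= self.window_ns {
                stamps.pop_front();
            } else {
                break;
            }
        }
        if stamps.len() < self.max_requests {
            stamps.push_back(now_ns);
            true
        } else {
            false
        }
    }
}
```

Storage per caller is bounded by `max_requests` timestamps, since rejected requests are not logged. Callers who never return still leave an (empty or stale) entry behind, so this needs to be paired with some cleanup.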

  3. Pre-flight Checks: Early Rejection (Firewall Mentality)
    What it is: Reject bad requests before doing any heavy processing.

Why it works: Stops cycle-wasting DDoS.

Web2 Example: SaaS APIs check auth tokens and quota before routing to the main app server. Edge firewalls (like Cloudflare) even block at the TCP level.

Apply to IC: Add a lightweight can_i_call() check or separate “guard” function that runs before any logic-heavy call.
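
A rough sketch of the guard idea in plain Rust. Everything here is illustrative: `can_i_call`, `do_thing`, `Reject`, the 1 KiB payload cap, and passing the remaining quota in as a number are all assumptions made for the example. The only IC-specific fact used is that the anonymous principal's text form is `2vxsx-fae`.

```rust
/// Reasons a request is turned away before any real work happens.
#[derive(Debug, PartialEq)]
pub enum Reject {
    Anonymous,
    PayloadTooLarge,
    RateLimited,
}

/// Text form of the anonymous principal.
const ANONYMOUS: &str = "2vxsx-fae";
/// Assumed payload cap for this example.
pub const MAX_PAYLOAD_BYTES: usize = 1024;

/// Cheap checks only: no hashing, no large allocations, no state writes.
pub fn can_i_call(caller: &str, payload: &[u8], quota_left: u32) -> Result<(), Reject> {
    if caller == ANONYMOUS {
        return Err(Reject::Anonymous);
    }
    if payload.len() > MAX_PAYLOAD_BYTES {
        return Err(Reject::PayloadTooLarge);
    }
    if quota_left == 0 {
        return Err(Reject::RateLimited);
    }
    Ok(())
}

/// The guarded endpoint: the guard runs first, the heavy part only on success.
pub fn do_thing(caller: &str, payload: &[u8], quota_left: u32) -> Result<u64, Reject> {
    can_i_call(caller, payload, quota_left)?;
    Ok(expensive_work(payload))
}

/// Stand-in for the logic-heavy part of the call.
fn expensive_work(payload: &[u8]) -> u64 {
    payload.iter().map(|&b| b as u64).sum()
}
```

The point of the ordering is that a rejected request costs only a few comparisons. A guard inside an update method still runs after the message has been accepted, so it limits the damage rather than making rejected calls free.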

  4. Sybil Resistance: Per-IP/Device Limits like Web2 Logins
    What it is: Don’t limit by principal only. Use:

  • IP address (if in HTTP context)
  • Device fingerprint (via frontend)
  • Hash of caller ID with some salt

Why it works: Stops spam from mass-creating principals.

Web2 Example: Every Web2 login form limits failed attempts per IP or device, not per account.

Apply to IC: Combine identity + request fingerprinting + history into a per-entity usage score.
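
One possible way to read "don't limit by principal only" is to count usage under every identifier that is available and reject when any of them is over the limit. The sketch below assumes that interpretation. `RequestIdentity` and `MultiKeyLimiter` are hypothetical names, the IP and fingerprint are optional because a canister often will not have them, and the counters never reset here, so in practice this would sit on top of a window or bucket.

```rust
use std::collections::HashMap;

/// Whatever identifiers are known for a request. Only the principal is
/// always present; the others depend on the calling context.
pub struct RequestIdentity<'a> {
    pub principal: &'a str,
    pub ip: Option<&'a str>,
    pub fingerprint: Option<&'a str>,
}

/// Hypothetical limiter that counts usage per identifier, not per principal.
pub struct MultiKeyLimiter {
    limit: u32,
    counts: HashMap<String, u32>,
}

impl MultiKeyLimiter {
    pub fn new(limit: u32) -> Self {
        Self { limit, counts: HashMap::new() }
    }

    /// Allows the request only if every known identifier is under the limit.
    pub fn allow(&mut self, id: &RequestIdentity) -> bool {
        let mut keys = vec![format!("principal:{}", id.principal)];
        if let Some(ip) = id.ip {
            keys.push(format!("ip:{}", ip));
        }
        if let Some(fp) = id.fingerprint {
            keys.push(format!("fp:{}", fp));
        }
        // A fresh principal does not help if the IP or device is exhausted.
        if keys.iter().any(|k| self.counts.get(k).copied().unwrap_or(0) >= self.limit) {
            return false;
        }
        for k in keys {
            *self.counts.entry(k).or_insert(0) += 1;
        }
        true
    }
}
```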

  5. Memory Decay: Garbage Collection for Spammers
    What it is: Track activity with timestamps or usage scores that decay over time. Prune old/stale entries regularly.

Why it works: Keeps your memory from exploding with millions of one-time users.

Web2 Example: CRMs and analytics platforms age out old users after inactivity to keep things fast and lean.

Apply to IC: Use decay factors or TTLs to remove stale data periodically — either on access or via heartbeat.
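
A small TTL-based sketch of the pruning step. `UsageTable`, `touch`, and `prune` are illustrative names; the table only stores a last-seen timestamp per caller, and `prune` could be called on access or from whatever periodic hook the canister uses.

```rust
use std::collections::HashMap;

/// Hypothetical table of last-seen timestamps with a time-to-live.
pub struct UsageTable {
    ttl_ns: u64,
    last_seen: HashMap<String, u64>,
}

impl UsageTable {
    pub fn new(ttl_ns: u64) -> Self {
        Self { ttl_ns, last_seen: HashMap::new() }
    }

    /// Records activity for a caller.
    pub fn touch(&mut self, caller: &str, now_ns: u64) {
        self.last_seen.insert(caller.to_string(), now_ns);
    }

    /// Removes entries idle for at least `ttl_ns`; returns how many were dropped.
    pub fn prune(&mut self, now_ns: u64) -> usize {
        let before = self.last_seen.len();
        let ttl = self.ttl_ns;
        self.last_seen.retain(|_, seen| now_ns.saturating_sub(*seen) < ttl);
        before - self.last_seen.len()
    }

    pub fn len(&self) -> usize {
        self.last_seen.len()
    }
}
```

A full `retain` pass touches every entry, so for a very large map it may be better to prune in batches instead of all at once.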

  6. Progressive Proof-of-Work: Anti-Bot Login Pages
    What it is: Require the caller to solve a challenge (like hashing a nonce) before allowing access.

Why it works: Makes abuse expensive — not worth it for attackers.

Web2 Example: Some login endpoints (or email verifications) enforce computational cost when they detect abuse patterns.

Apply to IC: Require a hashcash-like nonce, especially for anonymous calls or during high load.
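
A hashcash-style sketch of the check. Important assumption: to keep the example dependency-free it uses FNV-1a, which is NOT a cryptographic hash and is far too cheap to invert for real use; a real deployment would use something like SHA-256 from a crate. The function names, the 8-bit base difficulty, and the "2 extra bits per strike" rule are all invented for illustration.

```rust
/// FNV-1a, 64-bit. NOT cryptographic: a stand-in so the sketch needs only std.
fn fnv1a64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash of the server-issued challenge followed by the caller's nonce.
fn pow_hash(challenge: &[u8], nonce: u64) -> u64 {
    let mut buf = challenge.to_vec();
    buf.extend_from_slice(&nonce.to_le_bytes());
    fnv1a64(&buf)
}

/// Canister side: one hash, accept only if it has `bits` leading zero bits.
pub fn verify(challenge: &[u8], nonce: u64, bits: u32) -> bool {
    pow_hash(challenge, nonce).leading_zeros() >= bits
}

/// Client side: brute-force search, expected cost about 2^bits hashes.
pub fn solve(challenge: &[u8], bits: u32, max_tries: u64) -> Option<u64> {
    (0..max_tries).find(|&n| verify(challenge, n, bits))
}

const BASE_BITS: u32 = 8;
const MAX_BITS: u32 = 20;

/// "Progressive" part: callers with more recent strikes get a harder puzzle.
pub fn required_bits(strikes: u32) -> u32 {
    BASE_BITS.saturating_add(strikes.saturating_mul(2)).min(MAX_BITS)
}
```

The asymmetry is the whole idea: verification costs the canister one hash, while each extra required bit doubles the caller's expected work.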

Optional Setup: Central Rate-Limit Canister
If you’re running multiple canisters or want global rate limits, consider building a separate “RateLimiter” canister that:

  • Tracks usage across all endpoints
  • Responds to can_i_call("do_thing") requests
  • Is shared across services

Why? Easier to maintain, test, and apply consistent logic.
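
The state such a limiter might keep can be sketched as below. This is an assumption-heavy illustration: `RateLimiter`, `set_method_limit`, and the exact shape of `can_i_call` are made up, the caller and current time are passed in explicitly, and it uses a simple fixed window for brevity. It does show one way to combine a per-method limit with a global per-caller limit, so a caller cannot dodge the limit by switching functions.

```rust
use std::collections::HashMap;

/// Usage of one caller within the current window.
struct CallerUsage {
    window_start_ns: u64,
    total: u32,
    per_method: HashMap<String, u32>,
}

/// Hypothetical shared limiter: per-method limits plus a global per-caller cap.
pub struct RateLimiter {
    window_ns: u64,
    global_limit: u32,
    method_limits: HashMap<String, u32>,
    usage: HashMap<String, CallerUsage>,
}

impl RateLimiter {
    pub fn new(window_ns: u64, global_limit: u32) -> Self {
        Self { window_ns, global_limit, method_limits: HashMap::new(), usage: HashMap::new() }
    }

    pub fn set_method_limit(&mut self, method: &str, limit: u32) {
        self.method_limits.insert(method.to_string(), limit);
    }

    /// True if `caller` may invoke `method` now; counts the call if so.
    pub fn can_i_call(&mut self, caller: &str, method: &str, now_ns: u64) -> bool {
        let u = self.usage.entry(caller.to_string()).or_insert_with(|| CallerUsage {
            window_start_ns: now_ns,
            total: 0,
            per_method: HashMap::new(),
        });
        // Fixed window: reset all counters once the window has elapsed.
        if now_ns.saturating_sub(u.window_start_ns) >= self.window_ns {
            u.window_start_ns = now_ns;
            u.total = 0;
            u.per_method.clear();
        }
        // Methods without their own limit fall back to the global cap.
        let method_limit = self.method_limits.get(method).copied().unwrap_or(self.global_limit);
        let used = u.per_method.get(method).copied().unwrap_or(0);
        if u.total >= self.global_limit || used >= method_limit {
            return false;
        }
        u.total += 1;
        *u.per_method.entry(method.to_string()).or_insert(0) += 1;
        true
    }
}
```

If this lives in a separate canister, every check becomes an inter-canister call, which adds latency and its own cycle cost, so it is worth weighing against embedding the same logic as a shared library.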

Thanks for sharing so many prevention techniques.

As far as I know, the Rust CDK provides an #[inspect_message] macro. It lets the canister inspect an ingress message on the replica that receives it, before the message is accepted for execution, so you can reject it early, for example by filtering out malicious function arguments.

One rate-limiting solution that was commonly used before required the user to provide a proof-of-work (PoW) with each request. The inspect_message handler checks whether the PoW meets the required threshold and only accepts the request if it does. inspect_message can also read the canister state, which makes it quite flexible.

However, its limitation is that inspect_message only applies to ingress messages, i.e. update calls coming from outside the IC. If the call is from one canister to another (canister-to-canister), inspect_message will not be invoked.

For now I am using the sliding window approach; I think it will meet my requirements. I am also using inspect_message, but I’m going to look into the approach you are suggesting.