08/05/2026 06:52am

JS2GO EP.43 Rate Limiting and Throttling in Go and Node.js
#Rate Limiting
#Throttling
#Go
#Token Bucket
#Node.js
Ensure system stability, prevent DDoS, and keep your API healthy even under tens of thousands of requests per second.
Rate Limiting is a core technique that answers one question for your system: "How many requests can one user send within a specific timeframe?" It helps prevent your backend from being overwhelmed or exploited.
Rate Limiting protects you from:
- API crashes caused by excessive traffic
- Hacker brute-force login attempts
- Bot spam & automated scripts
- Backend overload (DB/Cache/Microservices)
- Unnecessary server cost spikes
- Degraded Quality of Service (QoS)
In this EP, you will learn production-grade rate limiting patterns for both Go and Node.js, with code that is nearly drop-in ready.
1) Why Rate Limiting is Essential in Production
If your system does not implement rate limits, it will fail under pressure.
| Problem | Impact |
|---|---|
| User accidentally triggers thousands of requests (infinite loop) | API becomes unresponsive |
| Bots sending requests repeatedly | CPU spikes to 100% |
| Hackers brute-force login | Security breach |
| API Gateway overloaded | Latency skyrockets |
| DB flooded with too many queries | System-wide failure |
| Microservices calling each other without limits | Cascade failure |
Rate Limiting = the safety shield of your API
2) Three Production-Proven Techniques
Token Bucket (Most Popular)
Concept:
- A bucket contains tokens.
- Tokens refill over time.
- Each request consumes 1 token.
- If the bucket is empty, the request is blocked or delayed.
Best for:
- Burst traffic
- Public APIs
- Microservices communication
Why it's great:
- Allows bursts
- Simple & efficient
- Works well for distributed rate limiting
Leaky Bucket
Concept:
Requests enter the bucket at any speed, but exit at a fixed rate, ensuring stable throughput.
Best for:
- Logging systems
- Streaming workloads
- Background job queues
Benefits:
- Smooth & predictable traffic
- Protects downstream services
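The leaky-bucket behavior described above can be sketched in Go as a "water level" that drains at a fixed rate. This is a minimal illustration under our own naming (`LeakyBucket`, `Allow` are not from any library), the inverse of a token bucket: instead of consuming tokens, each request adds to the level, and the level leaks away over time.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// LeakyBucket tracks a "water level" of queued requests that
// drains (leaks) at a fixed rate, smoothing output traffic.
type LeakyBucket struct {
	mu       sync.Mutex
	level    float64 // current amount of queued work
	capacity float64 // maximum the bucket can hold
	leakRate float64 // units drained per second
	lastLeak time.Time
}

func NewLeakyBucket(capacity, leakRate float64) *LeakyBucket {
	return &LeakyBucket{capacity: capacity, leakRate: leakRate, lastLeak: time.Now()}
}

// Allow reports whether a new request fits in the bucket.
func (lb *LeakyBucket) Allow() bool {
	lb.mu.Lock()
	defer lb.mu.Unlock()

	now := time.Now()
	// Drain whatever has leaked out since the last check.
	lb.level -= now.Sub(lb.lastLeak).Seconds() * lb.leakRate
	if lb.level < 0 {
		lb.level = 0
	}
	lb.lastLeak = now

	if lb.level+1 > lb.capacity {
		return false // bucket is full: shed the request
	}
	lb.level++
	return true
}

func main() {
	lb := NewLeakyBucket(3, 1) // holds 3 requests, drains 1/second
	for i := 0; i < 5; i++ {
		// First three fit; the rest are shed while the bucket drains.
		fmt.Println(lb.Allow())
	}
}
```

A queue-plus-ticker variant (goroutine draining a buffered channel) gives true fixed-rate *delivery* rather than just shedding, which is what you'd want for the background-job use case above.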
Sliding Window
Concept:
Track requests in the most recent time window (e.g., last 1 minute).
Best for:
- Login systems
- Anti brute-force protection
- Public APIs requiring precision
Benefits:
- More accurate than fixed windows
- Reduces request spikes
3) Production-Ready Go (Golang) Implementation
Token Bucket Middleware (Go)
```go
package main

import (
	"net/http"
	"sync"
	"time"
)

type TokenBucket struct {
	tokens     float64
	capacity   float64
	fillRate   float64 // tokens per second
	lastFilled time.Time
	mu         sync.Mutex
}

func NewTokenBucket(capacity, fillRate float64) *TokenBucket {
	return &TokenBucket{
		tokens:     capacity,
		capacity:   capacity,
		fillRate:   fillRate,
		lastFilled: time.Now(),
	}
}

func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()

	// Refill tokens based on the time elapsed since the last call.
	now := time.Now()
	elapsed := now.Sub(tb.lastFilled).Seconds()
	tb.tokens += elapsed * tb.fillRate
	if tb.tokens > tb.capacity {
		tb.tokens = tb.capacity
	}
	tb.lastFilled = now

	if tb.tokens >= 1 {
		tb.tokens--
		return true
	}
	return false
}

func RateLimit(tb *TokenBucket) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if !tb.Allow() {
				http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

func main() {
	tb := NewTokenBucket(10, 5) // capacity=10, fillRate=5 tokens/s
	mux := http.NewServeMux()
	mux.HandleFunc("/api", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("OK"))
	})
	http.ListenAndServe(":8080", RateLimit(tb)(mux))
}
```
This implementation:
- Supports burst traffic
- Thread-safe with mutex
- Middleware-ready
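Note that this bucket is global: every caller shares one limit. For "requests per user", you want one bucket per client key (IP or API key). Here is a minimal sketch of such a registry; the `ClientLimiter` name and the compact `bucket` type are our own, with the same refill logic as the middleware above:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a compact token bucket (same refill logic as above).
type bucket struct {
	tokens, capacity, fillRate float64
	last                       time.Time
}

func (b *bucket) allow(now time.Time) bool {
	b.tokens += now.Sub(b.last).Seconds() * b.fillRate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// ClientLimiter lazily creates one bucket per client key.
// (A production version would also evict idle keys to bound memory.)
type ClientLimiter struct {
	mu                 sync.Mutex
	buckets            map[string]*bucket
	capacity, fillRate float64
}

func NewClientLimiter(capacity, fillRate float64) *ClientLimiter {
	return &ClientLimiter{
		buckets:  make(map[string]*bucket),
		capacity: capacity,
		fillRate: fillRate,
	}
}

func (cl *ClientLimiter) Allow(key string) bool {
	cl.mu.Lock()
	defer cl.mu.Unlock()
	b, ok := cl.buckets[key]
	if !ok {
		b = &bucket{tokens: cl.capacity, capacity: cl.capacity,
			fillRate: cl.fillRate, last: time.Now()}
		cl.buckets[key] = b
	}
	return b.allow(time.Now())
}

func main() {
	cl := NewClientLimiter(2, 1) // burst of 2, refills 1 token/s per client
	fmt.Println(cl.Allow("1.2.3.4"), cl.Allow("1.2.3.4"), cl.Allow("1.2.3.4")) // true true false
	fmt.Println(cl.Allow("5.6.7.8"))                                           // true: separate bucket
}
```

In the middleware, you would call `cl.Allow(clientKey(r))` instead of `tb.Allow()`.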
Sliding Window (Go)
```go
var (
	requests = make(map[string][]time.Time)
	mu       sync.Mutex
	limit    = 5
	window   = 10 * time.Second
)

func SlidingWindow(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Note: RemoteAddr includes the port; in production, key on
		// the client IP or an API key instead.
		ip := r.RemoteAddr
		now := time.Now()

		mu.Lock()
		history := requests[ip]
		// Keep only timestamps still inside the window.
		valid := history[:0]
		for _, t := range history {
			if now.Sub(t) <= window {
				valid = append(valid, t)
			}
		}
		if len(valid) >= limit {
			mu.Unlock()
			http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
			return
		}
		requests[ip] = append(valid, now)
		mu.Unlock()

		next.ServeHTTP(w, r)
	})
}
```
This pattern provides:
- High accuracy
- Great for security endpoints
4) Production-Ready Node.js Implementation
Token Bucket (Node.js)
```javascript
class TokenBucket {
  constructor(capacity, fillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.fillRate = fillRate;
    this.lastRefill = Date.now();
  }

  allow() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.fillRate
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

function rateLimit(bucket) {
  return (req, res, next) => {
    if (!bucket.allow()) {
      return res.status(429).send("Too Many Requests");
    }
    next();
  };
}
```
Works great for:
- Microservices
- CPU-sensitive endpoints
- API Gateway layers
Sliding Window (Node.js)
```javascript
const windowMs = 10 * 1000;
const limit = 5;
const history = new Map();

function slidingWindow(req, res, next) {
  const ip = req.ip;
  const now = Date.now();
  const reqLog = history.get(ip) || [];
  const filtered = reqLog.filter(t => now - t <= windowMs);
  if (filtered.length >= limit) {
    return res.status(429).send("Too Many Requests");
  }
  filtered.push(now);
  history.set(ip, filtered);
  next();
}
```
Perfect for:
- Login rate limits
- Fraud prevention
- Abuse detection
5) Production Tips (VERY Important)
Use Redis for Multi-Instance Environments
If you run multiple pods/containers, in-memory counters won't sync.
Apply Multi-Layer Rate Limiting
- API Gateway (Cloudflare, Nginx, Traefik)
- Load Balancer
- Application-level
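At the gateway layer, for example, Nginx ships a leaky-bucket-style limiter out of the box. A sketch (the zone name `api_limit` and the `backend` upstream are placeholders for your own config):

```nginx
# Shared zone keyed by client IP: 10 MB of state,
# steady rate of 10 requests/second per IP.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 80;

    location /api/ {
        # Allow short bursts of 20 extra requests, served without delay;
        # anything beyond that is rejected.
        limit_req zone=api_limit burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }
}
```

The gateway layer absorbs the bulk of abusive traffic cheaply, so your application-level limiter only has to handle what gets through.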
Sliding Window: Best for Public APIs
Accurate and security-friendly.
Token Bucket: Best for Microservices
Burst-friendly and efficient.
Use exponential backoff for retries
Protects system stability.
Log & Monitor
Expose metrics for:
- Rejected requests
- Token usage
- CPU/Memory load
Return clear error responses
```json
{
  "error": "Too Many Requests",
  "retry_after": 3
}
```
Final Summary
Rate Limiting is the core defensive layer of any API in production.
It protects your system from:
- Traffic spikes
- Bots and attackers
- Resource overload
- Cost escalation
- Unstable user experience
Both Go and Node.js handle rate limiting effectively, but with different strengths.
NEXT EP: JS2GO EP.44
Database Connections: SQL & NoSQL in Go and JavaScript
You will learn:
- PostgreSQL / MySQL / MongoDB / Redis connections
- Example code (Go Fiber + Node.js Express)
- Recommended ORM/Query Builders
- Connection Pool Best Practices
- Production Do's & Don'ts