08/05/2026 06:52am

JS2GO EP.43 Rate Limiting and Throttling in Go and Node.js
#Rate Limiting
#Throttling
#Go
#Token Bucket
#Node.js
Ensure system stability, prevent DDoS, and keep your API healthy even under tens of thousands of requests per second.
Rate Limiting is a core technique that answers one question for your system: "How many requests can one user send within a specific timeframe?" It helps prevent your backend from being overwhelmed or exploited.
Rate Limiting protects you from:
- API crashes caused by excessive traffic
- Hacker brute-force login attempts
- Bot spam & automated scripts
- Backend overload (DB/Cache/Microservices)
- Unnecessary server cost spikes
- Degraded Quality of Service (QoS)
In this EP, you will learn production-grade rate limiting patterns for both Go and Node.js, with code that is nearly drop-in ready.
1) Why Rate Limiting is Essential in Production
If your system does not implement rate limits, it will fail under pressure.
| Problem | Impact |
|---|---|
| User accidentally triggers thousands of requests (infinite loop) | API becomes unresponsive |
| Bots sending requests repeatedly | CPU spikes to 100% |
| Hackers brute-force login | Security breach |
| API Gateway overloaded | Latency skyrockets |
| DB flooded with too many queries | System-wide failure |
| Microservices calling each other without limits | Cascade failure |
Rate Limiting = the safety shield of your API
2) Three Production-Proven Techniques
Token Bucket (Most Popular)
Concept:
- A bucket contains tokens.
- Tokens refill over time.
- Each request consumes 1 token.
- If the bucket is empty, the request is blocked or delayed.
Best for:
- Burst traffic
- Public APIs
- Microservices communication
Why it's great:
- Allows bursts
- Simple & efficient
- Works well for distributed rate limiting
Leaky Bucket
Concept:
Requests enter the bucket at any speed, but exit at a fixed rate, ensuring stable throughput.
Best for:
- Logging systems
- Streaming workloads
- Background job queues
Benefits:
- Smooth & predictable traffic
- Protects downstream services
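The leaky-bucket behavior described above can be sketched in Go as a "water level" that drains at a fixed rate. This is a minimal illustration under our own naming (`LeakyBucket`, `Allow` are not from any library), the inverse of a token bucket: instead of consuming tokens, each request adds to the level, and the level leaks away over time.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// LeakyBucket tracks a "water level" of queued requests that
// drains (leaks) at a fixed rate, smoothing output traffic.
type LeakyBucket struct {
	mu       sync.Mutex
	level    float64 // current amount of queued work
	capacity float64 // maximum the bucket can hold
	leakRate float64 // units drained per second
	lastLeak time.Time
}

func NewLeakyBucket(capacity, leakRate float64) *LeakyBucket {
	return &LeakyBucket{capacity: capacity, leakRate: leakRate, lastLeak: time.Now()}
}

// Allow reports whether a new request fits in the bucket.
func (lb *LeakyBucket) Allow() bool {
	lb.mu.Lock()
	defer lb.mu.Unlock()

	now := time.Now()
	// Drain whatever has leaked out since the last check.
	lb.level -= now.Sub(lb.lastLeak).Seconds() * lb.leakRate
	if lb.level < 0 {
		lb.level = 0
	}
	lb.lastLeak = now

	if lb.level+1 > lb.capacity {
		return false // bucket is full: shed the request
	}
	lb.level++
	return true
}

func main() {
	lb := NewLeakyBucket(3, 1) // holds 3 requests, drains 1/second
	for i := 0; i < 5; i++ {
		// First three fit; the rest are shed while the bucket drains.
		fmt.Println(lb.Allow())
	}
}
```

A queue-plus-ticker variant (goroutine draining a buffered channel) gives true fixed-rate *delivery* rather than just shedding, which is what you'd want for the background-job use case above.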
Sliding Window
Concept:
Track requests in the most recent time window (e.g., last 1 minute).
Best for:
- Login systems
- Anti brute-force protection
- Public APIs requiring precision
Benefits:
- More accurate than fixed windows
- Reduces request spikes
3) Production-Ready Go (Golang) Implementation
Token Bucket Middleware (Go)
```go
package main

import (
	"net/http"
	"sync"
	"time"
)

type TokenBucket struct {
	tokens     float64
	capacity   float64
	fillRate   float64 // tokens per second
	lastFilled time.Time
	mu         sync.Mutex
}

func NewTokenBucket(capacity, fillRate float64) *TokenBucket {
	return &TokenBucket{
		tokens:     capacity,
		capacity:   capacity,
		fillRate:   fillRate,
		lastFilled: time.Now(),
	}
}

func (tb *TokenBucket) Allow() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()

	// Refill tokens based on the time elapsed since the last call.
	now := time.Now()
	elapsed := now.Sub(tb.lastFilled).Seconds()
	tb.tokens += elapsed * tb.fillRate
	if tb.tokens > tb.capacity {
		tb.tokens = tb.capacity
	}
	tb.lastFilled = now

	if tb.tokens >= 1 {
		tb.tokens--
		return true
	}
	return false
}

func RateLimit(tb *TokenBucket) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if !tb.Allow() {
				http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

func main() {
	tb := NewTokenBucket(10, 5) // capacity=10, fillRate=5 tokens/s
	mux := http.NewServeMux()
	mux.HandleFunc("/api", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("OK"))
	})
	http.ListenAndServe(":8080", RateLimit(tb)(mux))
}
```
This implementation:
- Supports burst traffic
- Thread-safe with mutex
- Middleware-ready
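Note that this bucket is global: every caller shares one limit. For "requests per user", you want one bucket per client key (IP or API key). Here is a minimal sketch of such a registry; the `ClientLimiter` name and the compact `bucket` type are our own, with the same refill logic as the middleware above:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a compact token bucket (same refill logic as above).
type bucket struct {
	tokens, capacity, fillRate float64
	last                       time.Time
}

func (b *bucket) allow(now time.Time) bool {
	b.tokens += now.Sub(b.last).Seconds() * b.fillRate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// ClientLimiter lazily creates one bucket per client key.
// (A production version would also evict idle keys to bound memory.)
type ClientLimiter struct {
	mu                 sync.Mutex
	buckets            map[string]*bucket
	capacity, fillRate float64
}

func NewClientLimiter(capacity, fillRate float64) *ClientLimiter {
	return &ClientLimiter{
		buckets:  make(map[string]*bucket),
		capacity: capacity,
		fillRate: fillRate,
	}
}

func (cl *ClientLimiter) Allow(key string) bool {
	cl.mu.Lock()
	defer cl.mu.Unlock()
	b, ok := cl.buckets[key]
	if !ok {
		b = &bucket{tokens: cl.capacity, capacity: cl.capacity,
			fillRate: cl.fillRate, last: time.Now()}
		cl.buckets[key] = b
	}
	return b.allow(time.Now())
}

func main() {
	cl := NewClientLimiter(2, 1) // burst of 2, refills 1 token/s per client
	fmt.Println(cl.Allow("1.2.3.4"), cl.Allow("1.2.3.4"), cl.Allow("1.2.3.4")) // true true false
	fmt.Println(cl.Allow("5.6.7.8"))                                           // true: separate bucket
}
```

In the middleware, you would call `cl.Allow(clientKey(r))` instead of `tb.Allow()`.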
Sliding Window (Go)
```go
var (
	requests = make(map[string][]time.Time)
	mu       sync.Mutex
	limit    = 5
	window   = 10 * time.Second
)

func SlidingWindow(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Note: RemoteAddr includes the port; in production, key on
		// the client IP or an API key instead.
		ip := r.RemoteAddr
		now := time.Now()

		mu.Lock()
		history := requests[ip]
		// Keep only timestamps still inside the window.
		valid := history[:0]
		for _, t := range history {
			if now.Sub(t) <= window {
				valid = append(valid, t)
			}
		}
		if len(valid) >= limit {
			mu.Unlock()
			http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
			return
		}
		requests[ip] = append(valid, now)
		mu.Unlock()

		next.ServeHTTP(w, r)
	})
}
```
This pattern provides:
- High accuracy
- Great for security endpoints
4) Production-Ready Node.js Implementation
Token Bucket (Node.js)
```javascript
class TokenBucket {
  constructor(capacity, fillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.fillRate = fillRate;
    this.lastRefill = Date.now();
  }

  allow() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsed * this.fillRate
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

function rateLimit(bucket) {
  return (req, res, next) => {
    if (!bucket.allow()) {
      return res.status(429).send("Too Many Requests");
    }
    next();
  };
}
```
Works great for:
- Microservices
- CPU-sensitive endpoints
- API Gateway layers
Sliding Window (Node.js)
```javascript
const windowMs = 10 * 1000;
const limit = 5;
const history = new Map();

function slidingWindow(req, res, next) {
  const ip = req.ip;
  const now = Date.now();
  const reqLog = history.get(ip) || [];
  const filtered = reqLog.filter(t => now - t <= windowMs);
  if (filtered.length >= limit) {
    return res.status(429).send("Too Many Requests");
  }
  filtered.push(now);
  history.set(ip, filtered);
  next();
}
```
Perfect for:
- Login rate limits
- Fraud prevention
- Abuse detection
5) Production Tips (VERY Important)
Use Redis for Multi-Instance Environments
If you run multiple pods/containers, in-memory counters won't sync.
Apply Multi-Layer Rate Limiting
- API Gateway (Cloudflare, Nginx, Traefik)
- Load Balancer
- Application-level
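At the gateway layer, for example, Nginx ships a leaky-bucket-style limiter out of the box. A sketch (the zone name `api_limit` and the `backend` upstream are placeholders for your own config):

```nginx
# Shared zone keyed by client IP: 10 MB of state,
# steady rate of 10 requests/second per IP.
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    listen 80;

    location /api/ {
        # Allow short bursts of 20 extra requests, served without delay;
        # anything beyond that is rejected.
        limit_req zone=api_limit burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }
}
```

The gateway layer absorbs the bulk of abusive traffic cheaply, so your application-level limiter only has to handle what gets through.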
Sliding Window: Best for Public APIs
Accurate and security-friendly.
Token Bucket: Best for Microservices
Burst-friendly and efficient.
Use exponential backoff for retries
Protects system stability.
Log & Monitor
Expose metrics for:
- Rejected requests
- Token usage
- CPU/Memory load
Return clear error responses
```json
{
  "error": "Too Many Requests",
  "retry_after": 3
}
```
Final Summary
Rate Limiting is the core defensive layer of any API in production.
It protects your system from:
- Traffic spikes
- Bots and attackers
- Resource overload
- Cost escalation
- Unstable user experience
Both Go and Node.js handle rate limiting effectively, but with different strengths.
NEXT EP: JS2GO EP.44
Database Connections: SQL & NoSQL in Go and JavaScript
You will learn:
- PostgreSQL / MySQL / MongoDB / Redis connections
- Example code (Go Fiber + Node.js Express)
- Recommended ORM/Query Builders
- Connection Pool Best Practices
- Production Do's & Don'ts