12/04/2026 18:16

Golang The Series EP 126: Implementing DDoS Protection and Rate Limiting for High Availability
#Token Bucket
#DDoS Protection
#Rate Limiting
#Golang
#Go
Welcome back, fellow Gophers, to the most intensive series on Go development!
In EP 125, we laid the foundation of secure communication using TLS and WSS. It was like building a "Great Wall" around your city—strong, sturdy, and excellent at keeping secrets.
However... even the strongest wall can crumble if millions of people try to rush through the city gates simultaneously. This is what we call a DDoS (Distributed Denial of Service) attack. In the world of Backend development, it doesn't matter how optimized your code is; if you lack a proper traffic management system, your service will go down the moment a botnet starts flooding it with requests.
Today, we are going to build a "Smart Immigration Gate" by implementing Rate Limiting and DDoS Protection at the Application Layer (Layer 7) using Go.
1. Why Does Your System Need Rate Limiting? (The Why)
If you allow every request to access your resources (Database, CPU, Memory) freely, you are exposed to three major risks:
- Resource Exhaustion: A single malicious user (or a buggy client) could write a loop that fires requests until your Database connections are maxed out.
- Cost Management: If you use serverless or cloud services (like AWS Lambda or APIs billed per request), a bot attack can literally drain your bank account overnight.
- Security Risks: Allowing unlimited login attempts is a massive vulnerability for Brute-force attacks.
2. Deep Dive into Rate Limiting Algorithms: Which One to Choose?
Before we jump into the code, we need to understand the logic behind "throttling." There are several popular methods in the Go ecosystem:
- Token Bucket (Most Popular in Go): Imagine a bucket filled with tokens (coins) that are added at a constant rate. Every time a request comes in, it must take one token. If the bucket is empty, the request must wait or be rejected. This is great because it allows for "Bursts" (temporary spikes in usage); a minimal sketch follows this list.
- Leaky Bucket: Imagine a bucket with a hole at the bottom. Water (requests) enters at any speed but leaks out at a constant rate. If the bucket overflows, the extra water is discarded. This is ideal for systems that require a very smooth, constant traffic flow.
- Fixed Window Counter: This counts requests within a specific timeframe (e.g., 100 requests per minute). However, it suffers from the "Edge Case" problem where a user could send 100 requests at 11:59:59 and another 100 at 12:00:01, effectively doubling the limit in a two-second window.
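To make the Token Bucket idea concrete before we reach the standard library helper, here is a tiny hand-rolled sketch. Everything in it (the TokenBucket type, its Allow method, the numbers in main) is illustrative only; the production-ready implementation we will actually use lives in golang.org/x/time/rate.
Go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket is a toy illustration: capacity is the burst size and refillRate
// is how many tokens are added back per second.
type TokenBucket struct {
	mu         sync.Mutex
	tokens     float64
	capacity   float64
	refillRate float64
	lastRefill time.Time
}

func NewTokenBucket(capacity, refillRate float64) *TokenBucket {
	return &TokenBucket{tokens: capacity, capacity: capacity, refillRate: refillRate, lastRefill: time.Now()}
}

// Allow refills the bucket based on elapsed time, then tries to spend one token.
func (b *TokenBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()

	now := time.Now()
	b.tokens += now.Sub(b.lastRefill).Seconds() * b.refillRate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.lastRefill = now

	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	bucket := NewTokenBucket(5, 2) // burst of 5, refilled at 2 tokens per second

	for i := 1; i <= 8; i++ {
		fmt.Printf("request %d allowed: %v\n", i, bucket.Allow())
	}
}
With a rapid loop like this, the first five calls typically succeed (the burst) and the rest are rejected until the refill catches up.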
3. Implementation: Using golang.org/x/time/rate
The Go team provides a powerful implementation of the Token Bucket algorithm in the golang.org/x/time/rate package (one of the official golang.org/x sub-repositories).
The Core: NewLimiter(r, b)
- r (Limit): The rate at which tokens are added to the bucket (tokens per second).
- b (Burst): The maximum number of tokens the bucket can hold (the maximum capacity for simultaneous requests).
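Before the full middleware, here is a quick, self-contained sketch of the three ways a rate.Limiter hands out tokens; the specific numbers (2 per second, burst of 5) are just example values.
Go
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

func main() {
	// r = 2 tokens per second, b = 5: rate.Limit(2) and rate.Every(500*time.Millisecond) are equivalent.
	limiter := rate.NewLimiter(rate.Limit(2), 5)

	// Allow: non-blocking check, perfect for HTTP middleware that rejects immediately.
	fmt.Println("allowed now?", limiter.Allow())

	// Wait: blocks until a token is available (or the context is cancelled);
	// useful for outbound clients that should slow down instead of failing.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	if err := limiter.Wait(ctx); err == nil {
		fmt.Println("got a token after waiting")
	}

	// Reserve: reports how long you would have to wait, without blocking.
	fmt.Println("next token in:", limiter.Reserve().Delay())
}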
Advanced Code Example: IP-Based Middleware
Instead of hardcoding limits inside our handlers, we will create a Middleware to make our protection reusable across any endpoint.
Go
package main

import (
	"log"
	"net"
	"net/http"
	"sync"
	"time"

	"golang.org/x/time/rate"
)

// IPlimiter stores the Limiter state for each unique IP
type IPlimiter struct {
	ips map[string]*rate.Limiter
	mu  sync.RWMutex
}

func NewIPlimiter() *IPlimiter {
	return &IPlimiter{
		ips: make(map[string]*rate.Limiter),
	}
}

// GetLimiter finds or creates a new Limiter for a specific IP
func (i *IPlimiter) GetLimiter(ip string) *rate.Limiter {
	i.mu.Lock()
	defer i.mu.Unlock()

	limiter, exists := i.ips[ip]
	if !exists {
		// Allow 2 requests per second with a Burst capacity of 5
		limiter = rate.NewLimiter(rate.Every(500*time.Millisecond), 5)
		i.ips[ip] = limiter
	}
	return limiter
}

func limitMiddleware(next http.Handler, iplimiter *IPlimiter) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Strip the port from "host:port" so every connection from the same IP shares one limiter
		// (Note: check X-Forwarded-For if behind a Proxy/Load Balancer)
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}

		limiter := iplimiter.GetLimiter(ip)
		if !limiter.Allow() {
			w.Header().Set("X-RateLimit-Limit", "2")
			w.Header().Set("X-RateLimit-Remaining", "0")
			http.Error(w, "Too Many Requests: Please slow down, our servers are breathing.", http.StatusTooManyRequests)
			return
		}

		next.ServeHTTP(w, r)
	})
}

func main() {
	iplimiter := NewIPlimiter()

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Welcome to Superdev Academy API!"))
	})

	// Wrap the mux with our rate-limiting Middleware
	log.Fatal(http.ListenAndServe(":8080", limitMiddleware(mux, iplimiter)))
}
Why do we need sync.RWMutex? In Go, the http.Server handles each request in its own Goroutine. If multiple requests from different IPs arrive at the same time and write to the map without a lock, we trigger a Race Condition: the Go runtime detects the concurrent map writes and crashes the whole program with a fatal error. The lock serializes access so every Goroutine sees a consistent map.
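If you want to see that failure mode for yourself, here is a tiny standalone program (completely separate from our server) that writes to a plain map from many Goroutines without a lock; it will typically die with "fatal error: concurrent map writes", and running it with go run -race pinpoints the race explicitly.
Go
package main

import "sync"

func main() {
	m := make(map[string]int)
	var wg sync.WaitGroup

	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			m["visits"] = n // unsynchronized write: the runtime usually aborts here
		}(i)
	}
	wg.Wait()
}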
4. Elevating DDoS Protection at the Application Level (L7)
Rate limiting by count alone might not be enough to stop sophisticated DDoS attacks designed to exhaust resources. We should add these strategies:
A. Strict Server Timeouts
Some DDoS attacks, like Slowloris, keep connections open as long as possible by sending request data extremely slowly, until the server runs out of connections and can't accept new clients. We must configure http.Server strictly:
Go
server := &http.Server{
	Addr:              ":8080",
	ReadHeaderTimeout: 2 * time.Second,  // Request headers must arrive within 2 seconds (the Slowloris killer)
	ReadTimeout:       2 * time.Second,  // The entire request (headers + body) must be read within 2 seconds
	WriteTimeout:      5 * time.Second,  // The response must be written within 5 seconds
	IdleTimeout:       30 * time.Second, // Close keep-alive connections that sit idle
}
B. Distributed Rate Limiting (The Redis Approach)
The code above has one weakness: it stores state in memory. If you run 3 server instances (Scaling), each will count requests independently. A user could theoretically send 3x more requests than intended.
The Solution: Use Redis as a centralized counter. Use libraries like go-redis or the redis-cell module to maintain a global state across all your instances.
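As a minimal sketch of that idea, the snippet below implements a fixed-window counter with go-redis (v9). The key layout, the allowRequest helper, and the limit of 100 requests per minute are assumptions for illustration; smoother behavior (closer to the Token Bucket) usually means the redis-cell module or a Lua script executed atomically on the Redis side.
Go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"

	"github.com/redis/go-redis/v9"
)

// allowRequest is a hypothetical helper implementing a fixed-window counter in Redis:
// every instance INCRements the same per-IP key, so the limit is enforced globally.
func allowRequest(ctx context.Context, rdb *redis.Client, ip string, limit int64) (bool, error) {
	// One key per IP per one-minute window, e.g. "ratelimit:203.0.113.7:29876543"
	key := fmt.Sprintf("ratelimit:%s:%d", ip, time.Now().Unix()/60)

	count, err := rdb.Incr(ctx, key).Result()
	if err != nil {
		return false, err
	}
	if count == 1 {
		// First request in this window: expire the key when the window ends.
		rdb.Expire(ctx, key, time.Minute)
	}
	return count <= limit, nil
}

func main() {
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// In production, strip the port and honor X-Forwarded-For as discussed above.
		ok, err := allowRequest(r.Context(), rdb, r.RemoteAddr, 100)
		if err != nil || !ok {
			// Failing closed on Redis errors keeps the sketch short; choose your own policy here.
			http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
			return
		}
		w.Write([]byte("Welcome back, globally rate-limited Gopher!"))
	})
	http.ListenAndServe(":8080", nil)
}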
5. Best Practices & Caveats
- Don't Block Legitimate Users: Setting limits too strictly can break the experience for real users (e.g., a page loading many images at once). Always tune your Burst settings.
- Graceful Response: When blocking a request, always send HTTP Status 429 Too Many Requests and include a Retry-After header to tell the client when they can try again (see the sketch after this list).
- Whitelisting: Don't forget to whitelist internal services or trusted partners so they don't get caught in your "immigration gate."
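To make the Retry-After advice concrete, here is a sketch that swaps the plain Allow() check for Reserve(), which reports how long the client would have to wait. The rejectWithRetryAfter helper and the single shared limiter in main are illustrative assumptions; in the IP-based middleware you would call it with the per-IP limiter instead.
Go
package main

import (
	"math"
	"net/http"
	"strconv"
	"time"

	"golang.org/x/time/rate"
)

// rejectWithRetryAfter rejects the request with 429 + Retry-After when the
// limiter has no token available right now; it returns true if it wrote a response.
func rejectWithRetryAfter(w http.ResponseWriter, limiter *rate.Limiter) bool {
	res := limiter.Reserve()
	if !res.OK() {
		// The limiter can never satisfy this request (e.g. a burst of 0): reject outright.
		http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
		return true
	}

	delay := res.Delay()
	if delay <= 0 {
		return false // A token is available right now: let the request through.
	}

	// We are rejecting instead of waiting, so hand the reserved token back.
	res.Cancel()
	w.Header().Set("Retry-After", strconv.Itoa(int(math.Ceil(delay.Seconds()))))
	http.Error(w, "Too Many Requests", http.StatusTooManyRequests)
	return true
}

func main() {
	// A single shared limiter keeps the example short; per-IP limiters work the same way.
	limiter := rate.NewLimiter(rate.Every(500*time.Millisecond), 5)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		if rejectWithRetryAfter(w, limiter) {
			return
		}
		w.Write([]byte("Within the limit, welcome!"))
	})
	http.ListenAndServe(":8080", nil)
}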
Conclusion
In EP 126, we've built a defense system that lets our server "breathe" during traffic spikes. Understanding the Token Bucket algorithm and managing IP-level state are the skills that separate "coders" from "system architects."
But... blocking traffic is just the beginning. What happens when your internal services (like other Microservices) start slowing down or failing? How do we prevent the whole architecture from collapsing like a house of cards?
Get ready, because in EP 127, we will dive into Connection Management & Circuit Breakers—the automatic "kill switches" that will save your system from a domino effect failure. Stay tuned!
Article by: Superdev Academy "Empowering you to master code through real-world challenges." If you found this article helpful, don't forget to share it with your fellow devs and follow us on all platforms!