Connection Pooling
Connection Pooling Evolution – Step by Step
Step 1: The Problem – Traditional Database Connections
Without Connection Pooling:
HTTP Request → API Server → Database
1. Client sends POST; server opens a new connection; database accepts it
2. Server fires the query; database processes it
3. Server receives the response; database returns it
4. Server and database close the connection
ASCII Flow Diagram:
Client → API Server → Database
| | |
| [New TCP Conn] |
| | |
| [3-way handshake]|
| | |
| [Fire Query] |
| | |
| [Get Response] |
| | |
| [4-way teardown] |
| | |
Pros:
- Simple implementation
- No resource management needed
- Straightforward debugging
Cons:
- High overhead for micro-operations
- TCP handshake/teardown for every request
- Database connection limits quickly exhausted
- Poor performance under load
Performance Impact Example:
SQL Query Time: 2ms
Connection Setup: 1ms
Total Time: 3ms
Overhead: 50% (1ms/2ms)
vs.
SQL Query Time: 100ms
Connection Setup: 1ms
Total Time: 101ms
Overhead: 1% (1ms/100ms)
Step 2: Understanding the Bottleneck
Connection Overhead Analysis:
Micro-operations (2ms query):
┌─────────────┬──────────┐
│ Setup (1ms) │Query(2ms)│ = 50% overhead
└─────────────┴──────────┘
Large operations (100ms query):
┌─┬────────────────────────────────────────────────────────┐
│S│ Query (100ms) │ = 1% overhead
└─┴────────────────────────────────────────────────────────┘
Connection Limits Issue:
Database Max Connections: 100
Concurrent Requests: 1000
Result: ERROR - "too many connections"
Timeline:
Request 1-100: ✓ Connected
Request 101: ✗ Connection refused
Request 102: ✗ Connection refused
...
Request 1000: ✗ Connection refused
Step 3: Connection Pool Concept
Basic Pool Architecture:
Connection Pool
┌─────────────────────────────┐
│ [Conn1] [Conn2] [Conn3] │
│ [Conn4] [Conn5] ... │
└─────────────────────────────┘
↑ ↓
Get() Put()
↑ ↓
API Requests
Pool Workflow:
1. Server Startup:
Pool = [Conn1, Conn2, Conn3] (Pre-established)
2. Request Arrives:
connection = pool.Get() // Takes Conn1
Pool = [Conn2, Conn3]
3. Execute Query:
result = connection.query("SELECT ...")
4. Return Connection:
pool.Put(connection) // Returns Conn1
Pool = [Conn2, Conn3, Conn1]
Step 4: Implementation Details
Blocking Queue Implementation:
Pool Structure:
┌─────────────────────────────────┐
│ Connections Array: │
│ [DB_Conn1, DB_Conn2, DB_Conn3] │
│ │
│ Channel (blocking queue): │
│ [signal1, signal2, signal3] │
│ │
│ Mutex Lock for thread safety │
└─────────────────────────────────┘
Get() Operation:
func (p *Pool) Get() Connection {
    <-p.ch                            // wait for a signal (blocks while pool is empty)
    p.lock.Lock()                     // exclusive access to the slice
    conn := p.connections[0]          // take the first available connection
    p.connections = p.connections[1:] // remove it from the array
    p.lock.Unlock()
    return conn
}
Put() Operation:
func (p *Pool) Put(conn Connection) {
    p.lock.Lock()
    p.connections = append(p.connections, conn) // add it back
    p.lock.Unlock()
    p.ch <- struct{}{}                // signal that a connection is available
}
Step 5: Blocking Behavior Example
Scenario: Pool Size = 3, Requests = 5
Time T0: Pool = [C1, C2, C3]
Requests: [R1, R2, R3, R4, R5] arrive
Time T1:
R1 gets C1, R2 gets C2, R3 gets C3
Pool = []
R4, R5 are BLOCKED waiting
Time T2:
R1 completes, returns C1
Pool = [C1]
R4 gets C1, continues
R5 still BLOCKED
Time T3:
R2 completes, returns C2
Pool = [C2]
R5 gets C2, continues
Step 6: Performance Comparison
Benchmark Results from Transcript:
Without Connection Pool:
10 goroutines: 43ms ✓
100 goroutines: 57ms ✓
1000 goroutines: ERROR - "too many connections" ✗
With Connection Pool:
10 goroutines: 24ms ✓
100 goroutines: 30ms ✓
1000 goroutines: Works! ✓ (but slower due to queuing)
Performance Benefits:
┌─────────────────┬──────────┬──────────────┐
│ Metric │ No Pool │ With Pool │
├─────────────────┼──────────┼──────────────┤
│ Connection Time │ Every req│ One-time │
│ Resource Usage │ High │ Controlled │
│ Scalability │ Limited │ Excellent │
│ Error Rate │ High │ Low │
└─────────────────┴──────────┴──────────────┘
Step 7: Advanced Configuration
Min/Max Pool Settings:
Pool Configuration:
┌─────────────────────────────┐
│ minConnections: 3 │
│ maxConnections: 10 │
│ idleTimeout: 5 minutes │
│ createOnDemand: true │
└─────────────────────────────┘
Startup: Pool = [C1, C2, C3] (min connections)
High Load: Pool = [C1...C10] (scales up to max)
Idle Period: Pool = [C1, C2, C3] (scales down, closes idle)
Connection Lifecycle:
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Created │───▶│ In Pool │───▶│ In Use │
│ │ │ (Available)│ │ │
└─────────────┘ └─────────────┘ └──────┬──────┘
▲ ▲ │
│ └─────────────────┘
│ (Returned)
│
┌─────────────┐
│ Closed │◀─── (Idle timeout or shutdown)
│ (Removed) │
└─────────────┘
Step 8: Real-World Examples
Popular Connection Pool Libraries:
Go:
// pgx for PostgreSQL (pgxpool.Connect in pgx v4; pgxpool.New in v5)
pool, err := pgxpool.Connect(ctx, "postgres://...")
// database/sql with the go-sql-driver/mysql driver
db, err := sql.Open("mysql", "user:password@tcp(localhost:3306)/db")
db.SetMaxOpenConns(10)
db.SetMaxIdleConns(5)
Java:
// HikariCP
HikariConfig config = new HikariConfig();
config.setMaximumPoolSize(10);
config.setMinimumIdle(3);
config.setIdleTimeout(300000); // 5 minutes
Python:
# SQLAlchemy
engine = create_engine(
    'postgresql://user:pass@localhost/db',
    pool_size=10,
    max_overflow=5,
    pool_pre_ping=True
)
Industry Usage Examples:
Netflix: Hystrix connection pools for microservices
Uber: Connection pooling for high-frequency ride matching
Amazon RDS: Built-in connection pooling with RDS Proxy
Google: Connection pools in Cloud SQL
Facebook: TAO system uses connection pooling for social graph
Step 9: Common Pitfalls & Solutions
Scaling Pitfall:
Before Scaling:
1 server × 10 min connections = 10 DB connections ✓
After Auto-scaling:
100 servers × 10 min connections = 1000 DB connections ✗
Result: Database crashes!
Solution: Adjust min connections when scaling
100 servers × 2 min connections = 200 DB connections ✓
Configuration Guidelines:
Database Max Connections: 1000
Number of Servers: 10
Average Load per Server: 50 connections
Recommended Settings per Server:
┌─────────────────────┬─────────┐
│ Min Connections │ 10 │ (50/5)
│ Max Connections │ 75 │ (50 × 1.5)
│ Idle Timeout │ 5 min │
│ Create on Demand │ true │
└─────────────────────┴─────────┘
Total DB Connections:
Min: 10 × 10 = 100 (safe startup)
Max: 75 × 10 = 750 (within DB limit of 1000)
Step 10: Monitoring & Optimization
Key Metrics to Track:
Pool Health Dashboard:
┌─────────────────────────────────────┐
│ Active Connections: █████ 5/10 │
│ Idle Connections: ███░░ 3/10 │
│ Queue Wait Time: 2ms avg │
│ Connection Create Rate: 0.5/sec │
│ Connection Close Rate: 0.3/sec │
│ Pool Exhaustion Events: 0 │
└─────────────────────────────────────┘
Optimization Strategies:
Problem: High queue wait times
Solution: Increase max connections
Problem: Too many idle connections
Solution: Decrease idle timeout
Problem: Frequent connection creation
Solution: Increase min connections
Problem: Database overload
Solution: Implement circuit breaker pattern
Summary: Evolution Benefits
Final Architecture:
Client Requests → Load Balancer → API Servers (with pools) → Database
↓
┌─────────────────────────────────────┐
│ Connection Pool │
│ ┌───┐ ┌───┐ ┌───┐ ┌───┐ ┌───┐ │
│ │C1 │ │C2 │ │C3 │ │C4 │ │C5 │ │
│ └───┘ └───┘ └───┘ └───┘ └───┘ │
│ Managed & Reused │
└─────────────────────────────────────┘
Key Improvements:
- Performance: per-request setup overhead (50% on a 2ms query) amortized to near zero
- Reliability: No more “too many connections” errors
- Scalability: Handles 1000+ concurrent requests gracefully
- Resource Control: Predictable database load
- Auto-scaling Safe: Configurable limits prevent DB overload