
Sharding vs Replication vs Partitioning

A comprehensive guide to database scaling strategies with practical examples and decision frameworks.

Overview

These three strategies solve different problems in distributed systems:

  • Replication: Copying the same data across multiple servers
  • Partitioning: Splitting data logically within a database
  • Sharding: Distributing data across multiple physical databases

🔄 Replication

What is it?

Replication creates multiple copies of the same dataset across different servers. Each replica contains the full dataset (or a subset in partial replication).

Architecture Patterns

Master-Slave Replication

One master accepts all writes; one or more read-only slaves copy its changes and serve reads.

Master-Master Replication

Two or more masters each accept writes and replicate to one another, which requires conflict resolution when the same data changes in two places.

Key Characteristics

| Aspect | Description |
| --- | --- |
| Purpose | High availability & read scalability |
| Data Distribution | Full copy on each server |
| Consistency | Eventual or strong (depending on sync method) |
| Failure Impact | Redundancy protects against server failure |

Replication Strategies

Synchronous Replication

  • Write confirmed only after all replicas acknowledge
  • Strong consistency
  • Higher latency
  • Lower availability (all replicas must be online)

Asynchronous Replication

  • Write confirmed immediately
  • Eventual consistency
  • Lower latency
  • Higher availability
  • Risk of data loss on master failure
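
The staleness risk of asynchronous replication can be seen in a toy model (illustrative Python only, not a real replication protocol): the primary acknowledges a write before any replica applies it, so a read from a replica can return stale data until it catches up.

```python
# Toy model of asynchronous replication. The primary acknowledges a
# write immediately; replicas apply the log later.
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered write log shipped to replicas

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))  # acknowledged immediately


class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # position in the primary's log

    def catch_up(self, primary):
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)


primary, replica = Primary(), Replica()
primary.write("stock", 5)
print(replica.data.get("stock"))  # None: the replica is lagging
replica.catch_up(primary)
print(replica.data.get("stock"))  # 5: replicas converge eventually
```

Synchronous replication closes this window by not acknowledging the write until replicas have applied it, at the cost of the extra latency described above.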

Use Cases

✅ When to use Replication:

  • High read-to-write ratio (e.g., 95% reads)
  • Need for high availability and fault tolerance
  • Geographic distribution for low latency
  • Analytics workloads on read replicas
  • Disaster recovery requirements

โŒ When NOT to use:

  • Write-heavy workloads (replication doesn't scale writes: every write must still be applied on every replica)
  • Storage constraints (full copy per replica)
  • Complex conflict resolution required

Example Scenario

E-commerce Product Catalog

Master DB: Handles product updates (rare)
Replica 1 (US East): Serves customer browsing
Replica 2 (US West): Serves customer browsing
Replica 3 (Europe): Serves customer browsing

📊 Partitioning

What is it?

Partitioning divides a large table into smaller pieces within the same database instance. Each partition contains a subset of rows but shares the same schema.

Partitioning Types

Horizontal Partitioning (Row-based)

Rows are split across partitions; every partition keeps the full schema.

Vertical Partitioning (Column-based)

Columns are split into separate tables, e.g. moving large or rarely read columns out of the hot path.

Partitioning Strategies

Range Partitioning

-- By date
Partition P1: created_at BETWEEN '2023-01-01' AND '2023-12-31'
Partition P2: created_at BETWEEN '2024-01-01' AND '2024-12-31'

-- By ID
Partition P1: user_id BETWEEN 1 AND 1000000
Partition P2: user_id BETWEEN 1000001 AND 2000000

Hash Partitioning

-- Distribute evenly using hash function
Partition P1: HASH(user_id) % 4 = 0
Partition P2: HASH(user_id) % 4 = 1
Partition P3: HASH(user_id) % 4 = 2
Partition P4: HASH(user_id) % 4 = 3

List Partitioning

-- By discrete values
Partition P1: region IN ('US', 'Canada')
Partition P2: region IN ('UK', 'Germany', 'France')
Partition P3: region IN ('India', 'Singapore')
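
The three strategies can be mimicked in application code (a toy Python sketch; real databases route rows themselves, and a plain modulo stands in for HASH()):

```python
# Toy routing functions mirroring the three partitioning strategies
# above (illustrative only; in PostgreSQL or MySQL the database itself
# routes each row to its partition).
def range_partition(user_id):
    # Range: contiguous ID bands
    return "P1" if user_id <= 1_000_000 else "P2"

def hash_partition(user_id, num_partitions=4):
    # Hash: spread rows evenly; modulo stands in for HASH()
    return f"P{user_id % num_partitions + 1}"

def list_partition(region):
    # List: discrete value groups
    groups = {
        "P1": {"US", "Canada"},
        "P2": {"UK", "Germany", "France"},
        "P3": {"India", "Singapore"},
    }
    for name, members in groups.items():
        if region in members:
            return name
    raise ValueError(f"no partition for region {region!r}")

print(range_partition(42))         # P1
print(hash_partition(12345))       # P2 (12345 % 4 = 1)
print(list_partition("Germany"))   # P2
```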

Key Characteristics

| Aspect | Description |
| --- | --- |
| Purpose | Query performance & manageability |
| Scope | Single database instance |
| Transparency | Often transparent to the application |
| Scalability | Limited by single-server resources |

Use Cases

✅ When to use Partitioning:

  • Large tables slowing down queries
  • Time-series data with predictable access patterns
  • Easy data archival/deletion (drop old partitions)
  • Maintenance operations on specific data ranges
  • Query patterns that filter on partition key

โŒ When NOT to use:

  • Small tables (< 1GB)
  • Queries don't align with partition key
  • Need to scale beyond single server
  • Uniform access across all data

Example Scenario

Analytics Platform with Time-Series Data

logs_2024_01: January data (can archive after 90 days)
logs_2024_02: February data
logs_2024_03: March data (hot data, frequently queried)
logs_2024_04: April data (hot data, frequently queried)

🗂️ Sharding

What is it?

Sharding distributes data across multiple independent database servers (shards). Each shard contains a unique subset of the data and operates autonomously.

Sharding Architecture

A routing layer (in the application or a proxy) maps each request to the shard that owns its data.

Sharding with Replication

Each shard runs its own replication (e.g. one master plus replicas), combining write scalability with high availability.

Sharding Strategies

Range-Based Sharding

Shard 1: user_id 1 to 1,000,000
Shard 2: user_id 1,000,001 to 2,000,000
Shard 3: user_id 2,000,001 to 3,000,000

Hash-Based Sharding

Shard = HASH(user_id) % number_of_shards

user_id 12345 → HASH → 2 → Shard 2
user_id 67890 → HASH → 1 → Shard 1

Geographic Sharding

Shard US: users where region = 'US'
Shard EU: users where region = 'EU'
Shard ASIA: users where region = 'ASIA'

Entity/Directory-Based Sharding

Shard mapping stored in lookup table:
tenant_id: 1001 → Shard 1
tenant_id: 1002 → Shard 3
tenant_id: 1003 → Shard 1
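
A directory-based router can be as simple as a mutable lookup table (hypothetical Python sketch; the class and method names are illustrative):

```python
# Hypothetical directory-based shard router: the lookup table maps each
# tenant to its shard explicitly, so a tenant can be moved by updating
# one mapping entry (plus copying its data, which is not shown here).
class ShardDirectory:
    def __init__(self, mapping):
        self._mapping = dict(mapping)  # tenant_id -> shard name

    def shard_for(self, tenant_id):
        try:
            return self._mapping[tenant_id]
        except KeyError:
            raise KeyError(f"no shard assigned for tenant {tenant_id}")

    def move_tenant(self, tenant_id, new_shard):
        self._mapping[tenant_id] = new_shard


# The mapping from the example above
directory = ShardDirectory({1001: "shard_1", 1002: "shard_3", 1003: "shard_1"})
print(directory.shard_for(1002))   # shard_3
directory.move_tenant(1002, "shard_2")
print(directory.shard_for(1002))   # shard_2
```

The flexibility comes at a price: the directory itself becomes a critical, highly available component on the read path.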

Key Characteristics

| Aspect | Description |
| --- | --- |
| Purpose | Horizontal scalability for writes & storage |
| Scope | Multiple independent database servers |
| Complexity | High (routing, joins, transactions) |
| Scalability | Nearly unlimited (add more shards) |

Challenges

Cross-Shard Queries

-- This query spans multiple shards
SELECT COUNT(*) FROM users WHERE created_at > '2024-01-01'
-- Must query all shards and aggregate results
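
A scatter-gather sketch of how a router answers such a query (toy Python with in-memory "shards"; a real implementation runs the filtered count on each shard's database, usually in parallel):

```python
from datetime import date

# Scatter-gather for the cross-shard COUNT above. Each "shard" is
# simulated as a list of (user_id, created_at) rows.
def count_users_since(shards, cutoff):
    total = 0
    for rows in shards:  # scatter: ask every shard for its partial count
        total += sum(1 for _, created_at in rows if created_at > cutoff)
    return total         # gather: combine the partial results


shard_1 = [(1, date(2023, 6, 1)), (2, date(2024, 3, 1))]
shard_2 = [(3, date(2024, 2, 10)), (4, date(2022, 1, 5))]
print(count_users_since([shard_1, shard_2], date(2024, 1, 1)))  # 2
```

Simple aggregates (COUNT, SUM, MAX) combine cleanly; operations like global ORDER BY with LIMIT or percentiles require more careful merging.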

Cross-Shard Joins

-- Users on Shard 1, Orders on Shard 2
SELECT users.name, orders.total
FROM users
JOIN orders ON users.id = orders.user_id
-- Very expensive or impossible

Distributed Transactions

  • Maintaining ACID across shards is complex
  • Often requires 2-phase commit
  • Can use eventual consistency instead
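
The two-phase commit mentioned above can be sketched as follows (a toy Python model; real coordinators also need durable logs, timeouts, and crash recovery):

```python
# Drastically simplified two-phase commit across shards.
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local transaction can commit
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: every participant must vote yes
    if all(p.prepare() for p in participants):
        # Phase 2: unanimous yes, so tell everyone to commit
        for p in participants:
            p.commit()
        return True
    # Any "no" vote aborts the whole transaction on every shard
    for p in participants:
        p.abort()
    return False


print(two_phase_commit([Participant("shard_1"), Participant("shard_2")]))  # True
```

The blocking nature of phase 2 (a crashed coordinator leaves participants holding locks) is exactly why many systems prefer eventual consistency here.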

Rebalancing

  • Adding/removing shards requires data migration
  • Can cause downtime or performance issues
  • Hash-based sharding makes this harder
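
Consistent hashing is the standard mitigation for the last point: instead of HASH(key) % N, shards own points on a ring, so adding a shard remaps only the keys adjacent to its new points. A minimal Python sketch (the vnode count and use of md5 are illustrative assumptions):

```python
import bisect
import hashlib

# Sketch of a consistent-hashing ring. Each shard is hashed to many
# "virtual node" points; a key belongs to the first shard point at or
# after its own hash, wrapping around the ring.
class HashRing:
    def __init__(self, shards, vnodes=100):
        pairs = []
        for shard in shards:
            for i in range(vnodes):
                pairs.append((self._point(f"{shard}-vn{i}"), shard))
        pairs.sort()
        self._points = [p for p, _ in pairs]
        self._shards = [s for _, s in pairs]

    @staticmethod
    def _point(key):
        # md5 gives an even, deterministic spread (not used for security)
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16)

    def shard_for(self, key):
        i = bisect.bisect(self._points, self._point(key)) % len(self._points)
        return self._shards[i]


before = HashRing(["shard_0", "shard_1", "shard_2"])
after = HashRing(["shard_0", "shard_1", "shard_2", "shard_3"])
# With HASH(key) % N, going from 3 to 4 shards would move ~75% of keys;
# here only the keys the new shard takes over (roughly 1/4) move.
moved = sum(before.shard_for(k) != after.shard_for(k) for k in range(1000))
print(moved)
```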

Use Cases

✅ When to use Sharding:

  • Massive data volume (TBs/PBs)
  • High write throughput requirements
  • Single database can't handle load
  • Multi-tenant applications (shard by tenant)
  • Data sovereignty requirements (geographic sharding)

โŒ When NOT to use:

  • Can scale with replication/partitioning
  • Frequent cross-shard queries needed
  • Complex transactions across entities
  • Small team without devops expertise

Example Scenario

Social Media Platform

Shard 1: Users with last_name A-F (10M users)
Shard 2: Users with last_name G-M (10M users)
Shard 3: Users with last_name N-S (10M users)
Shard 4: Users with last_name T-Z (10M users)

Each shard has its own master + 2 replicas

🎯 Comparison Matrix

| Feature | Replication | Partitioning | Sharding |
| --- | --- | --- | --- |
| Data Distribution | Full copy per node | Subset per partition | Subset per shard |
| Scalability | Read scalability | Single-server limit | Near-unlimited horizontal |
| Write Performance | No improvement | Improved for specific queries | Near-linear improvement |
| Read Performance | Linear improvement | Improved for partition-aware queries | Improved |
| Complexity | Low-Medium | Low | High |
| Storage Cost | High (full copies) | Same as original | Distributed |
| Query Complexity | Simple | Simple | Complex (cross-shard) |
| Failure Impact | Other replicas available | Single point of failure | Only affected shard |
| Setup Difficulty | Easy | Easy | Difficult |
| Maintenance | Medium | Easy | Difficult |

🎨 Combined Strategies

Real-world systems often combine these approaches:

Example: E-commerce Platform

Strategy:

  • Sharding by geographic region (data locality)
  • Replication within each shard (high availability + read scaling)
  • Partitioning by year (query performance + easy archival)

🚦 Decision Tree


๐Ÿ“ Quick Decision Guideโ€‹

Start with Replication if:โ€‹

  • ๐ŸŽฏ Your reads outnumber writes 10:1 or more
  • ๐ŸŽฏ You need high availability
  • ๐ŸŽฏ You want geographic distribution
  • ๐ŸŽฏ Your data fits comfortably on one server
  • ๐ŸŽฏ You're starting out and need simple scaling

Add Partitioning if:

  • 🎯 Single tables are becoming very large (>100GB)
  • 🎯 Queries have predictable access patterns (time-based, range-based)
  • 🎯 You need to archive/delete old data regularly
  • 🎯 Query performance is degrading despite indexes
  • 🎯 You're still on a single database server

Move to Sharding when:

  • 🎯 You've exhausted vertical scaling (bigger servers)
  • 🎯 Replication doesn't help (write bottleneck)
  • 🎯 Data exceeds single server capacity (multi-TB)
  • 🎯 You need linear write scalability
  • 🎯 You have a multi-tenant architecture
  • 🎯 Data sovereignty requires geographic separation

💡 Best Practices

Replication

  • Monitor replication lag
  • Use read replicas for analytics
  • Implement retry logic for failover
  • Consider async for performance, sync for consistency

Partitioning

  • Choose partition key carefully (used in WHERE clauses)
  • Keep partitions relatively equal in size
  • Plan for growth (add partitions in advance)
  • Use partition pruning in queries

Sharding

  • Choose a stable shard key (won't change)
  • Distribute data evenly (avoid hot shards)
  • Plan for rebalancing from day one
  • Minimize cross-shard operations
  • Use consistent hashing for dynamic sharding
  • Keep shard mapping external (not hardcoded)

🔧 Implementation Examples

Replication Setup (PostgreSQL)

-- On Master
CREATE PUBLICATION my_publication FOR ALL TABLES;

-- On Replica
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=master_host dbname=mydb user=repl_user'
PUBLICATION my_publication;

Partitioning Setup (PostgreSQL)

-- Create partitioned table
CREATE TABLE orders (
    order_id BIGSERIAL,
    created_at TIMESTAMP,
    customer_id INTEGER,
    total DECIMAL
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE orders_2023 PARTITION OF orders
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

Sharding Logic (Application Level)

def get_shard(user_id, num_shards=4):
    """Hash-based sharding.

    Note: Python's built-in hash() is stable for integers but randomized
    per process for strings; use a deterministic hash (e.g. hashlib) if
    the shard key is a string.
    """
    shard_id = hash(user_id) % num_shards
    return f"shard_{shard_id}"

def get_connection(user_id):
    """Get the database connection for a user's shard."""
    shard = get_shard(user_id)
    # connection_pool maps shard name -> connection (setup not shown)
    return connection_pool[shard]

# Usage
user_id = 12345
db = get_connection(user_id)
user = db.query("SELECT * FROM users WHERE id = ?", user_id)

📚 Summary

Replication = Same data, multiple places → High availability & read scaling

Partitioning = Split data logically, same server → Query performance & manageability

Sharding = Split data physically, multiple servers → Near-unlimited horizontal scaling

Start simple (replication + partitioning), move to sharding only when necessary. Most applications never need sharding.


🎓 Further Reading

  • PostgreSQL Replication Documentation
  • MySQL Sharding Strategies
  • MongoDB Sharding Architecture
  • Vitess (Sharding layer for MySQL)
  • Citus (Distributed PostgreSQL)
  • Consistent Hashing Algorithms
  • CAP Theorem and Distributed Systems