# Sharding Architecture

This document describes how General Bots distributes data across multiple database shards for horizontal scaling.

## Overview
Sharding enables General Bots to scale beyond single-database limits by distributing data across multiple database instances. Each shard contains a subset of tenants, and data never crosses shard boundaries during normal operations.
## Shard Configuration

### Shard Config Table

The `shard_config` table defines all available shards:

```sql
CREATE TABLE shard_config (
    shard_id          SMALLINT PRIMARY KEY,
    region_code       CHAR(3) NOT NULL,      -- ISO 3166-1 alpha-3: USA, BRA, DEU
    datacenter        VARCHAR(32) NOT NULL,  -- e.g., 'us-east-1', 'eu-west-1'
    connection_string TEXT NOT NULL,         -- Encrypted connection string
    is_primary        BOOLEAN DEFAULT false,
    is_active         BOOLEAN DEFAULT true,
    min_tenant_id     BIGINT NOT NULL,
    max_tenant_id     BIGINT NOT NULL,
    created_at        TIMESTAMPTZ DEFAULT NOW()
);
```
### Example Configuration

```sql
-- Americas
INSERT INTO shard_config VALUES
    (1, 'USA', 'us-east-1', 'postgresql://shard1.db:5432/gb', true,  true, 1,       1000000),
    (2, 'USA', 'us-west-2', 'postgresql://shard2.db:5432/gb', false, true, 1000001, 2000000),
    (3, 'BRA', 'sa-east-1', 'postgresql://shard3.db:5432/gb', false, true, 2000001, 3000000);

-- Europe
INSERT INTO shard_config VALUES
    (4, 'DEU', 'eu-central-1', 'postgresql://shard4.db:5432/gb', false, true, 3000001, 4000000),
    (5, 'GBR', 'eu-west-2',    'postgresql://shard5.db:5432/gb', false, true, 4000001, 5000000);

-- Asia Pacific
INSERT INTO shard_config VALUES
    (6, 'SGP', 'ap-southeast-1', 'postgresql://shard6.db:5432/gb', false, true, 5000001, 6000000),
    (7, 'JPN', 'ap-northeast-1', 'postgresql://shard7.db:5432/gb', false, true, 6000001, 7000000);
```
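Given these ranges, shard ownership for a tenant ID can be resolved directly from `shard_config`; for example (the tenant ID here is made up):

```sql
-- Tenant 2500001 falls in shard 3's range (2000001-3000000)
SELECT shard_id, region_code, datacenter
FROM shard_config
WHERE 2500001 BETWEEN min_tenant_id AND max_tenant_id
  AND is_active;
```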
## Tenant-to-Shard Mapping

### Mapping Table

```sql
CREATE TABLE tenant_shard_map (
    tenant_id   BIGINT PRIMARY KEY,
    shard_id    SMALLINT NOT NULL REFERENCES shard_config(shard_id),
    region_code CHAR(3) NOT NULL,
    created_at  TIMESTAMPTZ DEFAULT NOW()
);
```
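For illustration, registering a hypothetical tenant and resolving its shard connection might look like this (the tenant ID and shard choice are made up for the example):

```sql
-- Record the tenant's home shard
INSERT INTO tenant_shard_map (tenant_id, shard_id, region_code)
VALUES (12345, 3, 'BRA');

-- Resolve the connection string for routing
SELECT s.connection_string
FROM tenant_shard_map m
JOIN shard_config s ON s.shard_id = m.shard_id
WHERE m.tenant_id = 12345
  AND s.is_active;
```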
### Routing Logic

When a request comes in, the system:

1. Extracts `tenant_id` from the request context
2. Looks up `shard_id` from `tenant_shard_map`
3. Routes the query to the appropriate database connection

```rust
// Rust routing example
pub fn get_shard_connection(tenant_id: i64) -> Result<DbConnection> {
    // Resolve the tenant's shard from the cached routing map
    let shard_id = SHARD_MAP.get(&tenant_id)
        .ok_or_else(|| Error::TenantNotFound(tenant_id))?;

    // Hand back a connection from that shard's pool
    CONNECTION_POOLS.get(shard_id)
        .ok_or_else(|| Error::ShardNotAvailable(*shard_id))
}
```
## Data Model Requirements

### Every Table Includes Shard Keys

All tables must include `tenant_id` and `shard_id` columns:

```sql
CREATE TABLE user_sessions (
    id        UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id BIGINT NOT NULL,    -- Required for routing
    shard_id  SMALLINT NOT NULL,  -- Denormalized for queries
    user_id   UUID NOT NULL,
    bot_id    UUID NOT NULL,
    -- ... other columns
);
```
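Because every query is scoped by `tenant_id`, indexes should lead with the shard keys; a minimal sketch (the index name is illustrative):

```sql
-- tenant_id leads so lookups prune to one tenant's rows first
CREATE INDEX idx_user_sessions_tenant_user
    ON user_sessions (tenant_id, user_id);
```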
### Foreign Keys Within Shard Only

Foreign keys may only reference tables within the same shard:

```sql
-- Good: Same-shard reference
ALTER TABLE message_history
    ADD CONSTRAINT fk_session
    FOREIGN KEY (session_id) REFERENCES user_sessions(id);

-- Bad: Cross-shard reference (never do this)
-- FOREIGN KEY (other_tenant_data) REFERENCES other_shard.table(id)
```
## Snowflake ID Generation

For globally unique, time-sortable IDs across shards:

```sql
-- Requires a global sequence for the per-millisecond counter:
--   CREATE SEQUENCE global_seq;
CREATE OR REPLACE FUNCTION generate_snowflake_id(p_shard_id SMALLINT)
RETURNS BIGINT AS $$
DECLARE
    epoch BIGINT := 1704067200000;  -- 2024-01-01 00:00:00 UTC
    ts    BIGINT;
    seq   BIGINT;
BEGIN
    -- 41 bits: timestamp (milliseconds since epoch)
    ts := (EXTRACT(EPOCH FROM NOW()) * 1000)::BIGINT - epoch;
    -- 10 bits: shard_id (0-1023)
    -- 12 bits: sequence (0-4095)
    seq := nextval('global_seq') & 4095;
    RETURN (ts << 22) | ((p_shard_id & 1023) << 12) | seq;
END;
$$ LANGUAGE plpgsql;
```
### ID Structure

```text
64-bit Snowflake ID

┌─────────────────────┬────────────────┬───────────────────┐
│  41 bits timestamp  │ 10 bits shard  │ 12 bits sequence  │
│  (69 years range)   │ (1024 shards)  │ (4096/ms/shard)   │
└─────────────────────┴────────────────┴───────────────────┘
```
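Since the shard ID is embedded in bits 12-21, any component can recover a row's home shard from the ID alone; a sketch of unpacking the fields in SQL (assumes the function and `global_seq` sequence above exist):

```sql
-- Reverse the bit layout of generate_snowflake_id
SELECT
    ((id >> 22) + 1704067200000) AS unix_millis,  -- add back the custom epoch
    (id >> 12) & 1023            AS shard_id,
    id & 4095                    AS sequence
FROM (VALUES (generate_snowflake_id(3::SMALLINT))) AS t(id);
```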
## Shard Operations

### Creating a New Shard

1. Provision a new database instance
2. Run migrations
3. Add the shard to `shard_config`
4. Update the routing configuration
5. Begin assigning new tenants
```bash
# 1. Run migrations on the new shard
DATABASE_URL="postgresql://new-shard:5432/gb" diesel migration run

# 2. Add shard config
psql -c "INSERT INTO shard_config VALUES (8, 'AUS', 'ap-southeast-2', '...', false, true, 7000001, 8000000);"

# 3. Reload routing
curl -X POST http://localhost:3000/api/admin/reload-shard-config
```
### Tenant Migration Between Shards

Moving a tenant to a different shard (e.g., for data locality):

```sql
-- 1. Set tenant to read-only mode
UPDATE tenants SET settings = settings || '{"read_only": true}' WHERE id = 12345;
```

```bash
# 2. Export tenant data (pg_dump has no row-level filter, so copy rows per table)
psql source_db -c "\copy (SELECT * FROM user_sessions WHERE tenant_id = 12345) TO 'user_sessions_12345.csv' CSV"
psql source_db -c "\copy (SELECT * FROM message_history WHERE tenant_id = 12345) TO 'message_history_12345.csv' CSV"

# 3. Import into the new shard
psql target_db -c "\copy user_sessions FROM 'user_sessions_12345.csv' CSV"
psql target_db -c "\copy message_history FROM 'message_history_12345.csv' CSV"
```

```sql
-- 4. Update routing
UPDATE tenant_shard_map SET shard_id = 5, region_code = 'DEU' WHERE tenant_id = 12345;

-- 5. Remove read-only mode
UPDATE tenants SET settings = settings - 'read_only' WHERE id = 12345;

-- 6. Clean up source shard (after verification)
DELETE FROM user_sessions WHERE tenant_id = 12345;
DELETE FROM message_history WHERE tenant_id = 12345;
```
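Before step 6, it is worth confirming that row counts match on both shards; a sketch to run against source and target:

```sql
-- Counts must be identical on source and target before deleting
SELECT
    (SELECT count(*) FROM user_sessions   WHERE tenant_id = 12345) AS sessions,
    (SELECT count(*) FROM message_history WHERE tenant_id = 12345) AS messages;
```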
## Query Patterns

### Single-Tenant Queries (Most Common)

```sql
-- Efficient: uses shard routing
SELECT * FROM user_sessions
WHERE tenant_id = 12345 AND user_id = 'abc-123';
```
### Cross-Shard Queries (Admin Only)

For global analytics, use a federation layer:

```sql
-- Using postgres_fdw for cross-shard reads
SELECT shard_id, COUNT(*) AS session_count
FROM all_shards.user_sessions
WHERE created_at > NOW() - INTERVAL '1 day'
GROUP BY shard_id;
```
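One way to build that federation layer is `postgres_fdw`; the sketch below wires a single shard into a local schema (the server, schema, and credential names are illustrative, and `all_shards.user_sessions` would then be a `UNION ALL` view over the per-shard foreign tables):

```sql
-- On the analytics database: one foreign server per shard
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard1.db', port '5432', dbname 'gb');

CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
    OPTIONS (user 'analytics', password '...');

-- Expose the shard's tables locally
CREATE SCHEMA shard1_tables;
IMPORT FOREIGN SCHEMA public LIMIT TO (user_sessions)
    FROM SERVER shard1 INTO shard1_tables;
```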
### Scatter-Gather Pattern

For queries that must touch multiple shards:

```rust
async fn get_global_stats() -> Stats {
    // Fan out one stats request per shard
    let futures: Vec<_> = SHARDS.iter()
        .map(|shard| get_shard_stats(shard.id))
        .collect();

    // Await all shards concurrently, then merge the partial results
    let results = futures::future::join_all(futures).await;
    results.into_iter().fold(Stats::default(), |acc, s| acc.merge(s))
}
```
## High Availability

### Per-Shard Replication

Each shard should have:

- 1 Primary (read/write)
- 1-2 Replicas (read-only, failover)
- Async replication with < 1s lag

```text
Shard 1 Architecture:

┌─────────────┐
│   Primary   │◄──── Writes
└──────┬──────┘
       │ Streaming Replication
   ┌───┴───┐
   ▼       ▼
┌──────┐ ┌──────┐
│Rep 1 │ │Rep 2 │◄──── Reads
└──────┘ └──────┘
```
### Failover Configuration

```csv
# config.csv
shard-1-primary,postgresql://shard1-primary:5432/gb
shard-1-replica-1,postgresql://shard1-replica1:5432/gb
shard-1-replica-2,postgresql://shard1-replica2:5432/gb
shard-1-failover-priority,replica-1,replica-2
```
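Promotion itself can lean on PostgreSQL's built-in mechanism (available since version 12); the orchestration around it is assumed to live in the failover tooling:

```sql
-- Run on the replica chosen from the failover priority list
SELECT pg_promote(wait => true);
```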
## Monitoring

### Key Metrics Per Shard
| Metric | Warning | Critical |
|---|---|---|
| Connection pool usage | > 70% | > 90% |
| Query latency p99 | > 100ms | > 500ms |
| Replication lag | > 1s | > 10s |
| Disk usage | > 70% | > 85% |
| Tenant count | > 80% capacity | > 95% capacity |
### Shard Health Check

```sql
-- Run on each shard
SELECT
    current_setting('cluster_name') AS shard,
    pg_is_in_recovery()             AS is_replica,
    pg_last_wal_receive_lsn()       AS wal_position,  -- NULL on the primary
    pg_postmaster_start_time()      AS uptime_since,
    (SELECT count(*) FROM pg_stat_activity) AS connections,
    (SELECT count(*) FROM tenants)          AS tenant_count;
```
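To populate the replication-lag metric from the table above, a replica can report how far its replay lags behind; a minimal sketch (returns NULL when run on the primary):

```sql
-- Approximate replication lag, measured on a replica
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
```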
## Best Practices
- Shard by tenant, not by table - Keep all tenant data together
- Avoid cross-shard transactions - Design for eventual consistency where needed
- Pre-allocate tenant ranges - Leave room for growth in each shard
- Monitor shard hotspots - Rebalance if one shard gets too busy
- Test failover regularly - Ensure replicas can be promoted
- Use connection pooling - PgBouncer or similar for each shard
- Cache shard routing - Don't query `tenant_shard_map` on every request
## Migration from Single Database

To migrate an existing single-database deployment to a sharded one:

1. Add a `shard_id` column to all tables, defaulting to 1 (see the sketch after this list)
2. Deploy shard routing code (disabled)
3. Set up additional shard databases
4. Enable routing for new tenants only
5. Gradually migrate existing tenants during low-traffic windows
6. Decommission the original database when empty
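A minimal sketch of step 1, assuming the legacy database becomes shard 1 (table names reuse the examples above):

```sql
-- Backfill every table with the shard key; the single legacy database
-- becomes shard 1, so the default is correct for all existing rows.
ALTER TABLE user_sessions
    ADD COLUMN IF NOT EXISTS shard_id SMALLINT NOT NULL DEFAULT 1;
ALTER TABLE message_history
    ADD COLUMN IF NOT EXISTS shard_id SMALLINT NOT NULL DEFAULT 1;
```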
See Regional Deployment for multi-region considerations.