DA
Indexing
Database performance v1.0.0
Database Indexing
Overview
Database indexing is the primary mechanism for query performance optimization. Proper index strategy ensures queries complete in milliseconds rather than seconds, but over-indexing degrades write performance and wastes storage. In the full-lifecycle pipeline, the @database-engineer agent generates index recommendations during Phase 5 based on API query patterns identified in Phase 4.
Key Concepts
PostgreSQL Index Types
| Index Type | Use Case | Operators | Example |
|---|---|---|---|
| B-tree (default) | Equality, range, sorting | =, <, >, BETWEEN, ORDER BY | CREATE INDEX idx_users_email ON users (email) |
| Hash | Equality only | = | Rarely used (B-tree handles equality too) |
| GIN | Array, JSONB, full-text | @>, ?, @@, && | CREATE INDEX idx_data ON items USING GIN (metadata) |
| GiST | Geometric, range, proximity | <<, >>, @>, <@ | CREATE INDEX idx_geo ON locations USING GIST (coords) |
| BRIN | Large tables with natural ordering | <, >, = | CREATE INDEX idx_ts ON events USING BRIN (created_at) |
Index Anatomy
┌──────────────────────────────────────────────────┐
│ B-tree Index │
├──────────────────────────────────────────────────┤
│ │
│ Root Node: [ M ] │
│ / \ │
│ Internal: [ D H ] [ R V ] │
│ / | \ / | \ │
│ Leaf: [A-C][D-G] [R-T][V-Z] → table rows │
│ │
│ Leaf nodes are doubly-linked for range scans │
└──────────────────────────────────────────────────┘
Index Strategies
| Strategy | Pattern | When to Use |
|---|---|---|
| Single-column | CREATE INDEX ON t (col) | Simple equality/range on one column |
| Composite | CREATE INDEX ON t (a, b, c) | Multi-column WHERE/ORDER BY (leftmost prefix rule) |
| Partial | CREATE INDEX ON t (col) WHERE active = true | Query only touches a subset of rows |
| Covering | CREATE INDEX ON t (a) INCLUDE (b, c) | Index-only scans (avoid heap access) |
| Expression | CREATE INDEX ON t (LOWER(email)) | Queries use function on column |
| Unique | CREATE UNIQUE INDEX ON t (email) | Enforce uniqueness + fast lookup |
Best Practices
- Index every foreign key — JOINs on un-indexed FKs cause full table scans
- Use composite indexes wisely — Column order matters: high-cardinality first for equality, range column last
- Use partial indexes for hot data —
WHERE is_active = truewhen most rows are inactive - Use CONCURRENTLY in production —
CREATE INDEX CONCURRENTLYavoids table locks - Monitor index usage — Drop unused indexes (
pg_stat_user_indexes.idx_scan = 0) - Keep indexes under 5 per table for write-heavy tables — Each index adds write overhead
- Use EXPLAIN ANALYZE — Always verify index usage before and after
Code Examples
✅ Good: Strategic Index Design
-- Orders table with well-designed indexes
CREATE TABLE orders (
id BIGSERIAL PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
status VARCHAR(20) NOT NULL DEFAULT 'PENDING',
total_amount NUMERIC(19,4) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
-- FK index (mandatory)
CREATE INDEX idx_orders_customer_id ON orders (customer_id);
-- Partial index: only active orders queried frequently
CREATE INDEX idx_orders_pending ON orders (customer_id, created_at)
WHERE status = 'PENDING';
-- Covering index for dashboard query (index-only scan)
CREATE INDEX idx_orders_status_amount ON orders (status)
INCLUDE (total_amount, created_at);
-- EXPLAIN ANALYZE verification
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT customer_id, total_amount, created_at
FROM orders
WHERE status = 'PENDING' AND customer_id = 42;
-- Expected: Index Scan using idx_orders_pending
❌ Bad: Index Anti-Patterns
-- Redundant index (composite already covers single-column)
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_customer_status ON orders (customer_id, status);
-- First index is redundant — composite covers single-column queries
-- Index on low-cardinality boolean (not selective)
CREATE INDEX idx_orders_is_deleted ON orders (is_deleted);
-- Only 2 values — use partial index instead
-- Too many indexes on write-heavy table
CREATE INDEX idx1 ON orders (col1);
CREATE INDEX idx2 ON orders (col2);
CREATE INDEX idx3 ON orders (col3);
CREATE INDEX idx4 ON orders (col4);
CREATE INDEX idx5 ON orders (col5);
CREATE INDEX idx6 ON orders (col6);
-- 6 indexes = 6x write amplification
Anti-Patterns
- Index everything — Over-indexing kills write performance and wastes storage
- Missing FK indexes — Implicit sequential scans on JOINs
- Wrong column order in composite — Equality columns first, range column last
- Not using CONCURRENTLY — Table locks during index creation cause downtime
- Never checking index usage — Unused indexes waste space and slow writes
- Indexing low-cardinality columns — Boolean/status columns with 2-3 values (use partial)
Testing Strategies
- EXPLAIN ANALYZE in tests — Assert expected index usage in integration tests
- Index-only scan verification — Check
Heap Fetches: 0for covering indexes - Performance benchmarks — Measure query time with and without index on realistic data
- Write performance impact — Benchmark INSERT/UPDATE throughput with index count changes
- pg_stat_user_indexes — Monitor
idx_scancounts in staging/production