Skip to content
Home / Skills / Database / Schema Design
DA

Schema Design

Database core v1.0.0

Schema Design

Overview

Database schema design transforms domain models and API contracts into relational database structures. Good schema design balances data integrity (normalization) with query performance (strategic denormalization), using appropriate data types, constraints, and relationships. In the full-lifecycle pipeline, schema design in Phase 5 takes API resource models from Phase 4 and produces the data foundation for Phase 6 backend implementation.


Key Concepts

Domain-to-Schema Mapping

┌─────────────────────────────────────────────────────────────┐
│              Domain Model → Schema Mapping                   │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  Domain Entity    →   Database Table                         │
│  Value Object     →   Embedded columns or lookup table       │
│  Aggregate Root   →   Primary table with owned children      │
│  Association      →   Foreign key or join table              │
│  Enum             →   VARCHAR + CHECK or reference table     │
│  Collection       →   Child table with FK (1:N)              │
│  Bi-directional   →   FK on owning side only                │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Normalization Guide

Normal FormRuleExample Violation
1NFAtomic values, no repeating groupstags: "java,spring" → split to join table
2NFNo partial dependencies on composite PKOrder item stores product name → separate
3NFNo transitive dependenciesOrder stores customer city → use FK to customer
BCNFEvery determinant is a candidate keyRare — apply only if 3NF has anomalies

Relationship Patterns

RelationshipSchema PatternJPA Annotation
1:1FK with UNIQUE on child@OneToOne
1:NFK on the “many” side@OneToMany / @ManyToOne
M:NJoin table with two FKs@ManyToMany (or explicit join entity)
Self-referentialFK to same table@ManyToOne on same entity
PolymorphicSingle table with discriminator OR table-per-type@Inheritance

Best Practices

  1. Start with 3NF — Normalize first, denormalize only when query patterns demand it
  2. Use BIGSERIAL for internal PKs — 8 bytes, sequential, index-friendly; UUID for distributed/exposed IDs
  3. Always include audit columnscreated_at, updated_at, created_by, version
  4. Use TIMESTAMPTZ, never TIMESTAMP — Always store timezone-aware timestamps
  5. Use NUMERIC(19,4) for money — Never FLOAT or DOUBLE for financial data
  6. Add CHECK constraints — Enforce business rules at database level
  7. Prefer VARCHAR(n) over TEXT for bounded fields — Explicit limits catch bugs
  8. Design for soft delete only when audit trail is required — Otherwise, hard delete with GDPR compliance

Code Examples

✅ Good: Production Schema

-- Audit function (reusable)
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = NOW();
    RETURN NEW;
END;
$$ language 'plpgsql';

-- Users table
CREATE TABLE users (
    id              BIGSERIAL PRIMARY KEY,
    email           VARCHAR(255) NOT NULL,
    display_name    VARCHAR(100) NOT NULL,
    role            VARCHAR(50) NOT NULL DEFAULT 'CUSTOMER',
    is_active       BOOLEAN NOT NULL DEFAULT true,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    version         INTEGER NOT NULL DEFAULT 0,
    
    CONSTRAINT uq_users_email UNIQUE (email),
    CONSTRAINT chk_users_role CHECK (role IN ('CUSTOMER', 'AGENT', 'ADMIN'))
);

CREATE TRIGGER trg_users_updated_at
    BEFORE UPDATE ON users
    FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();

CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_role ON users (role) WHERE is_active = true;

❌ Bad: Common Mistakes

-- No constraints, wrong types, no audit
CREATE TABLE users (
    id INT,                    -- Should be BIGSERIAL, no PK
    email TEXT,                -- No UNIQUE, no NOT NULL
    name TEXT,                 -- No length limit
    created TIMESTAMP,         -- Missing timezone (TZ)
    balance FLOAT              -- Never FLOAT for money
);

Anti-Patterns

  1. God Table — One table with 50+ columns (split into related tables)
  2. EAV (Entity-Attribute-Value) — Storing everything as key-value pairs (use JSONB for flexible schema)
  3. No Foreign Keys — “We’ll enforce it in the application” (data integrity belongs in the DB)
  4. Implicit Enums — Using integers for status without documentation (use VARCHAR + CHECK)
  5. Missing Indexes on FKs — Every foreign key column should have an index
  6. Premature Denormalization — Optimizing for performance before measuring

Testing Strategies

  • Migration Tests — Run Flyway migrations against TestContainers PostgreSQL
  • Constraint Tests — Verify NOT NULL, UNIQUE, CHECK constraints with invalid data
  • Referential Integrity Tests — Verify FK cascades and restrictions
  • Schema Drift Detection — Compare schema in DB vs. migration scripts

References