Pythonic Patterns
Python patterns v1.0.0
Pythonic patterns represent idiomatic Python code that leverages language features for clarity, efficiency, and maintainability. This skill covers list comprehensions for concise data transformation, generators for memory-efficient iteration, context managers for resource management, decorators for cross-cutting concerns, and duck typing for flexible interfaces. Mastering these patterns enables writing elegant, performant Python code that follows community conventions and best practices.
Key Concepts
- List Comprehensions - Concise syntax for creating lists from iterables with optional filtering
- Generator Expressions - Memory-efficient lazy evaluation for large sequences
- Generator Functions - Functions using yield to produce values on-demand
- Context Managers - Protocol for resource acquisition and cleanup using the with statement
- Decorators - Functions that modify or enhance other functions without changing their code
- Duck Typing - Type compatibility based on behavior (“if it walks like a duck…”)
- Protocols - Structural subtyping for defining interfaces without inheritance
- Iterator Protocol - Implementing __iter__ and __next__ for custom iteration
- EAFP vs LBYL - "Easier to ask forgiveness than permission" vs "Look before you leap"
- Pythonic Idioms - Community-accepted patterns like enumerate, zip, and unpacking
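The iterator protocol listed above is the one concept not demonstrated later in this skill, so here is a minimal sketch; `Countdown` and `CountdownIterator` are hypothetical names chosen for illustration:

```python
class CountdownIterator:
    """Iterator: implements both __iter__ and __next__."""

    def __init__(self, current: int):
        self.current = current

    def __iter__(self) -> "CountdownIterator":
        # Iterators return themselves so they work directly in for-loops
        return self

    def __next__(self) -> int:
        if self.current <= 0:
            raise StopIteration  # Signals end of iteration
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable: returns a fresh iterator each time, so it can be looped twice."""

    def __init__(self, start: int):
        self.start = start

    def __iter__(self) -> CountdownIterator:
        return CountdownIterator(self.start)


print(list(Countdown(3)))  # [3, 2, 1]
```

Separating the iterable from the iterator is what lets `for` loops, `list()`, and unpacking all consume the same object repeatedly.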
Best Practices
- Prefer comprehensions over map/filter - List/dict/set comprehensions are more readable than map/filter chains
- Use generators for large sequences - Replace list comprehensions with generators for memory efficiency
- Always use context managers - Use the with statement for files, locks, and resources to ensure cleanup
- Decorate for cross-cutting concerns - Apply decorators for logging, timing, caching, and validation
- Embrace duck typing - Focus on object capabilities rather than explicit type checks
- Use enumerate() and zip() - Avoid manual indexing; use built-in iteration helpers
- Unpack sequences - Use tuple unpacking for multiple assignment and function returns
- Prefer EAFP over LBYL - Try operations and handle exceptions rather than checking preconditions
- Write generator functions - Use yield to stream data and avoid building large intermediate lists
- Define protocols for interfaces - Use typing.Protocol for structural typing without inheritance
Code Examples
List Comprehensions and Generator Expressions
# ✅ GOOD: List comprehension for data transformation
from typing import Sequence
def process_users(users: Sequence[dict]) -> list[dict]:
"""Transform user data using list comprehension."""
return [
{
"id": user["id"],
"name": user["name"].upper(),
"email": user["email"].lower(),
"active": user.get("is_active", True)
}
for user in users
if user.get("verified", False)
]
# Dictionary comprehension
def create_user_index(users: list[dict]) -> dict[int, str]:
"""Create ID to name mapping."""
return {
user["id"]: user["name"]
for user in users
}
# Set comprehension
def extract_unique_domains(emails: list[str]) -> set[str]:
"""Extract unique email domains."""
return {
email.split("@")[1]
for email in emails
if "@" in email
}
# Generator expression for memory efficiency
def sum_large_sequence(n: int) -> int:
"""Sum large sequence without creating list."""
return sum(x**2 for x in range(n))
# Constant memory vs list comprehension's O(n) memory
# Nested comprehension
def flatten_matrix(matrix: list[list[int]]) -> list[int]:
"""Flatten 2D matrix to 1D list."""
return [
item
for row in matrix
for item in row
]
# ❌ BAD: Verbose loops instead of comprehensions
def process_users_verbose(users: list[dict]) -> list[dict]:
result = []
for user in users:
if user.get("verified", False):
processed = {
"id": user["id"],
"name": user["name"].upper(),
"email": user["email"].lower(),
"active": user.get("is_active", True)
}
result.append(processed)
return result
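The memory claim above can be made concrete with `sys.getsizeof`; the exact byte counts vary by interpreter version, so the comments below only state the order of magnitude:

```python
import sys

# A list comprehension materializes every element up front...
squares_list = [x**2 for x in range(1_000_000)]
# ...while a generator expression stores only its iteration state.
squares_gen = (x**2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a few hundred bytes, regardless of range size

# Consuming the generator yields the same values without the upfront cost
assert sum(squares_gen) == sum(squares_list)
```

Note that a generator can only be consumed once; if you need to iterate twice, either rebuild it or materialize a list deliberately.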
Generator Functions for Lazy Evaluation
# ✅ GOOD: Generator function for streaming data
from typing import Iterator, TypeVar
from collections.abc import Iterable
T = TypeVar('T')
def batch_iterator(
items: Iterable[T],
batch_size: int
) -> Iterator[list[T]]:
"""
Yield items in batches of specified size.
Memory efficient: only holds one batch in memory at a time.
Args:
items: Iterable of items to batch
batch_size: Number of items per batch
Yields:
Lists of items with length up to batch_size
Example:
>>> for batch in batch_iterator(range(100), batch_size=10):
... process_batch(batch)
"""
batch = []
for item in items:
batch.append(item)
if len(batch) == batch_size:
yield batch
batch = []
# Yield remaining items
if batch:
yield batch
# Infinite generator
def fibonacci_generator() -> Iterator[int]:
"""
Generate Fibonacci sequence indefinitely.
Yields:
Next Fibonacci number
Example:
>>> fib = fibonacci_generator()
>>> [next(fib) for _ in range(10)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
"""
a, b = 0, 1
while True:
yield a
a, b = b, a + b
# Generator with state and send()
from typing import Generator

def coroutine_accumulator() -> Generator[int, int, None]:
"""
Coroutine that accumulates sent values.
Yields:
Running total after each send
Example:
>>> acc = coroutine_accumulator()
>>> next(acc) # Prime the coroutine
0
>>> acc.send(10)
10
>>> acc.send(5)
15
"""
total = 0
while True:
value = yield total
if value is not None:
total += value
# Generator for file processing
def read_large_file(file_path: str, chunk_size: int = 8192) -> Iterator[str]:
"""
Read large file in chunks.
Memory efficient: doesn't load entire file into memory.
Args:
file_path: Path to file
chunk_size: Characters to read per iteration (file is opened in text mode)
Yields:
File chunks as strings
"""
with open(file_path, 'r') as file:
while True:
chunk = file.read(chunk_size)
if not chunk:
break
yield chunk
# Pipeline with generators
def process_data_pipeline(file_path: str) -> Iterator[dict]:
"""
Process data using generator pipeline.
Each stage yields results one at a time, avoiding memory overhead.
"""
# Stage 1: Read lines
def read_lines():
with open(file_path) as f:
for line in f:
yield line.strip()
# Stage 2: Parse JSON
import json
def parse_json(lines):
for line in lines:
if line:
yield json.loads(line)
# Stage 3: Filter and transform
def filter_transform(records):
for record in records:
if record.get("active"):
yield {
"id": record["id"],
"value": record["value"] * 2
}
# Chain generators
lines = read_lines()
records = parse_json(lines)
return filter_transform(records)
# ❌ BAD: Loading entire file into memory
def process_large_file_bad(file_path: str) -> list[dict]:
with open(file_path) as f:
lines = f.readlines() # Loads entire file into memory
records = [json.loads(line) for line in lines if line]
filtered = [r for r in records if r.get("active")]
return [{"id": r["id"], "value": r["value"] * 2} for r in filtered]
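Infinite generators like `fibonacci_generator` pair naturally with `itertools.islice`, which takes a bounded slice without materializing the stream; the generator is restated here so the sketch is self-contained:

```python
from itertools import islice
from typing import Iterator


def fibonacci_generator() -> Iterator[int]:
    """Generate the Fibonacci sequence indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice lazily pulls only the first n values from the infinite stream
first_ten = list(islice(fibonacci_generator(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The same technique bounds any generator pipeline, e.g. sampling the first hundred records of `process_data_pipeline` during debugging.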
Context Managers for Resource Management
# ✅ GOOD: Custom context manager using class
from typing import Any, Iterator
from pathlib import Path
import tempfile
class TemporaryDirectory:
"""
Context manager for temporary directory.
Creates directory on enter, removes on exit.
Example:
with TemporaryDirectory() as tmpdir:
file_path = tmpdir / "data.txt"
file_path.write_text("content")
# Directory automatically cleaned up
"""
def __init__(self, prefix: str = "tmp"):
self.prefix = prefix
self.path: Path | None = None
def __enter__(self) -> Path:
"""Create temporary directory."""
self.path = Path(tempfile.mkdtemp(prefix=self.prefix))
return self.path
def __exit__(self, exc_type, exc_val, exc_tb) -> None:
"""Remove temporary directory."""
if self.path and self.path.exists():
import shutil
shutil.rmtree(self.path)
# Context manager using contextlib
from contextlib import contextmanager
import time
@contextmanager
def timing_context(operation_name: str) -> Iterator[None]:
"""
Context manager for timing operations.
Args:
operation_name: Name for timing report
Yields:
None
Example:
with timing_context("data processing"):
process_large_dataset()
# Prints: "data processing took 1.23s"
"""
start = time.perf_counter()
try:
yield
finally:
duration = time.perf_counter() - start
print(f"{operation_name} took {duration:.2f}s")
# Database transaction context manager
@contextmanager
def database_transaction(connection: Any) -> Iterator[Any]:
"""
Context manager for database transactions.
Commits on success, rolls back on exception.
Args:
connection: Database connection
Yields:
Connection object
Example:
with database_transaction(conn) as cursor:
cursor.execute("INSERT INTO users VALUES (?)", (user_data,))
# Automatically commits
"""
cursor = connection.cursor()
try:
yield cursor
connection.commit()
except Exception:
connection.rollback()
raise
finally:
cursor.close()
# Async context manager
from typing import AsyncIterator
import aiohttp
class AsyncAPIClient:
"""Async context manager for HTTP client."""
def __init__(self, base_url: str):
self.base_url = base_url
self.session: aiohttp.ClientSession | None = None
async def __aenter__(self) -> "AsyncAPIClient":
"""Create session on enter."""
self.session = aiohttp.ClientSession(base_url=self.base_url)
return self
async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
"""Close session on exit."""
if self.session:
await self.session.close()
async def get(self, path: str) -> dict:
"""Make GET request."""
if not self.session:
raise RuntimeError("Session not initialized")
async with self.session.get(path) as response:
return await response.json()
# Usage with async with
async def fetch_data():
async with AsyncAPIClient("https://api.example.com") as client:
data = await client.get("/users")
return data
# Session automatically closed
# ❌ BAD: Manual resource management
def process_file_bad(file_path: str) -> str:
file = open(file_path)
content = file.read()
file.close() # Forgotten if exception occurs
return content
# ✅ Fix: Always use context manager
def process_file_good(file_path: str) -> str:
with open(file_path) as file:
return file.read()
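When the number of resources is only known at runtime, `contextlib.ExitStack` generalizes the with statement; this sketch uses in-memory `StringIO` objects as stand-ins for real files:

```python
from contextlib import ExitStack
from io import StringIO


def read_all(sources: list[StringIO]) -> list[str]:
    """Read many file-like resources; all are closed even if one read fails."""
    with ExitStack() as stack:
        # enter_context registers each resource for cleanup on exit
        handles = [stack.enter_context(s) for s in sources]
        return [h.read() for h in handles]


sources = [StringIO("alpha"), StringIO("beta")]
print(read_all(sources))                # ['alpha', 'beta']
print(all(s.closed for s in sources))   # True: ExitStack closed every handle
```

Without `ExitStack`, nesting a variable number of `with` statements would require recursion or manual try/finally bookkeeping.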
Decorators for Cross-Cutting Concerns
# ✅ GOOD: Function decorator with arguments
from functools import wraps
from typing import Callable, TypeVar, ParamSpec, Any
import time
import logging
P = ParamSpec('P')
T = TypeVar('T')
def retry(
max_attempts: int = 3,
delay: float = 1.0,
exponential_backoff: bool = True
) -> Callable[[Callable[P, T]], Callable[P, T]]:
"""
Decorator to retry function on exception.
Args:
max_attempts: Maximum number of retry attempts
delay: Initial delay between retries in seconds
exponential_backoff: Double delay after each retry
Returns:
Decorated function with retry logic
Example:
@retry(max_attempts=5, delay=2.0)
def fetch_data():
return requests.get("https://api.example.com/data")
"""
def decorator(func: Callable[P, T]) -> Callable[P, T]:
@wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
current_delay = delay
last_exception: Exception | None = None
for attempt in range(max_attempts):
try:
return func(*args, **kwargs)
except Exception as e:
last_exception = e
if attempt < max_attempts - 1:
time.sleep(current_delay)
if exponential_backoff:
current_delay *= 2
continue
raise
raise last_exception or RuntimeError("All retries failed")
return wrapper
return decorator
# Logging decorator
def log_execution(
logger: logging.Logger | None = None,
log_args: bool = True
) -> Callable[[Callable[P, T]], Callable[P, T]]:
"""
Decorator to log function execution.
Args:
logger: Logger instance (uses default if None)
log_args: Whether to log function arguments
Example:
@log_execution(logger=my_logger, log_args=True)
def process_data(data: list[int]) -> int:
return sum(data)
"""
if logger is None:
logger = logging.getLogger(__name__)
def decorator(func: Callable[P, T]) -> Callable[P, T]:
@wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
func_name = func.__name__
if log_args:
logger.info(f"Calling {func_name} with args={args}, kwargs={kwargs}")
else:
logger.info(f"Calling {func_name}")
try:
result = func(*args, **kwargs)
logger.info(f"{func_name} completed successfully")
return result
except Exception as e:
logger.error(f"{func_name} failed with {type(e).__name__}: {e}")
raise
return wrapper
return decorator
# Class decorator
def singleton(cls: type[T]) -> type[T]:
"""
Decorator to make class a singleton.
Example:
@singleton
class Database:
def __init__(self):
self.connection = create_connection()
db1 = Database()
db2 = Database()
assert db1 is db2 # Same instance
"""
instances: dict[type, Any] = {}
@wraps(cls)
def get_instance(*args: Any, **kwargs: Any) -> T:
if cls not in instances:
instances[cls] = cls(*args, **kwargs)
return instances[cls]
return get_instance # type: ignore
# Property decorator for validation
class User:
"""User with validated email property."""
def __init__(self, email: str):
self._email = ""
self.email = email # Triggers validation
@property
def email(self) -> str:
"""Get email address."""
return self._email
@email.setter
def email(self, value: str) -> None:
"""Set email with validation."""
if "@" not in value:
raise ValueError("Invalid email address")
self._email = value.lower()
# Stacking decorators
@retry(max_attempts=3)
@log_execution(log_args=False)
def fetch_remote_data(url: str) -> dict:
"""
Fetch data with retry and logging.
Decorators are applied bottom-to-top:
1. log_execution wraps fetch_remote_data
2. retry wraps the result of (1)
"""
import requests
response = requests.get(url)
response.raise_for_status()
return response.json()
# ❌ BAD: Duplicate cross-cutting code
def fetch_data1(url: str) -> dict:
logger.info(f"Fetching {url}")
for attempt in range(3):
try:
response = requests.get(url)
response.raise_for_status()
return response.json()
except Exception as e:
if attempt == 2:
raise
time.sleep(1)
def fetch_data2(url: str) -> dict:
logger.info(f"Fetching {url}")
for attempt in range(3): # Duplicated retry logic
try:
response = requests.get(url)
response.raise_for_status()
return response.json()
except Exception as e:
if attempt == 2:
raise
time.sleep(1)
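For the caching concern mentioned in the best practices, the standard library already provides a decorator, `functools.lru_cache`; the counter below exists only to demonstrate that memoization avoids recomputation:

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Naive recursive Fibonacci, made linear-time by memoizing results."""
    global call_count
    call_count += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(30))       # 832040
print(call_count)    # 31: each distinct n is computed exactly once
print(fib.cache_info())  # hit/miss statistics maintained by the decorator
```

Prefer `lru_cache` (or `functools.cache` on Python 3.9+) over hand-rolled memoization decorators for pure functions with hashable arguments.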
Duck Typing and Protocols
# ✅ GOOD: Protocol for structural typing
from typing import Protocol, runtime_checkable
from collections.abc import Iterable
@runtime_checkable
class Drawable(Protocol):
"""
Protocol for drawable objects.
Any class with draw() method satisfies this protocol.
No inheritance required.
"""
def draw(self) -> None:
"""Draw the object."""
...
class Circle:
"""Circle class - no inheritance from Drawable."""
def __init__(self, radius: float):
self.radius = radius
def draw(self) -> None:
"""Draw circle."""
print(f"Drawing circle with radius {self.radius}")
class Rectangle:
"""Rectangle class - also no inheritance."""
def __init__(self, width: float, height: float):
self.width = width
self.height = height
def draw(self) -> None:
"""Draw rectangle."""
print(f"Drawing rectangle {self.width}x{self.height}")
def render_shapes(shapes: Iterable[Drawable]) -> None:
"""
Render all shapes.
Works with any object that has draw() method.
Type checker validates protocol compliance.
"""
for shape in shapes:
shape.draw()
# Usage - type checker validates
shapes: list[Drawable] = [
Circle(5.0),
Rectangle(10.0, 20.0)
]
render_shapes(shapes)
# EAFP (Easier to Ask Forgiveness than Permission)
def get_value_eafp(data: dict, key: str, default: Any = None) -> Any:
"""
Get value from dict using EAFP.
Pythonic: try first, handle exception if it fails.
"""
try:
return data[key]
except KeyError:
return default
# ❌ BAD: LBYL (Look Before You Leap) - not Pythonic
def get_value_lbyl(data: dict, key: str, default: Any = None) -> Any:
"""Non-Pythonic: check before accessing."""
if key in data:
return data[key]
else:
return default
# EAFP is faster when key usually exists (one dict lookup vs two)
# Duck typing with hasattr
def process_any_iterable(data: Any) -> list:
"""
Process any iterable object.
Uses duck typing: if it's iterable, use it.
"""
try:
return list(data)
except TypeError:
# Not iterable, wrap in list
return [data]
# Protocol for file-like objects
@runtime_checkable
class FileLike(Protocol):
"""Protocol for file-like objects."""
def read(self, size: int = -1) -> str:
"""Read from file."""
...
def write(self, data: str) -> int:
"""Write to file."""
...
def process_file(file: FileLike) -> str:
"""
Process any file-like object.
Works with real files, StringIO, custom implementations.
"""
content = file.read()
processed = content.upper()
file.write(processed)
return processed
# ❌ BAD: Explicit type checking instead of duck typing
def render_shapes_bad(shapes: list) -> None:
for shape in shapes:
if isinstance(shape, (Circle, Rectangle)): # Rigid type checking
shape.draw()
else:
raise TypeError(f"Invalid shape type: {type(shape)}")
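Because `Drawable` above is decorated with `@runtime_checkable`, `isinstance` checks become structural: they test for the method's presence, not for inheritance. A self-contained sketch (restating the protocol):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> None: ...


class Circle:
    def draw(self) -> None:
        print("circle")


class Point:
    pass  # no draw() method


# isinstance checks for a draw() method, not for a base class
print(isinstance(Circle(), Drawable))  # True
print(isinstance(Point(), Drawable))   # False
```

Note that runtime protocol checks only verify that the attribute exists, not its signature; full signature checking is the job of the static type checker.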
Pythonic Idioms
# ✅ GOOD: enumerate() instead of manual indexing
def process_with_index(items: list[str]) -> None:
"""Process items with their indices."""
for index, item in enumerate(items, start=1):
print(f"{index}. {item}")
# ❌ BAD: Manual indexing
def process_with_index_bad(items: list[str]) -> None:
for i in range(len(items)):
print(f"{i + 1}. {items[i]}")
# ✅ GOOD: zip() for parallel iteration
def merge_lists(names: list[str], ages: list[int]) -> list[dict]:
"""Merge two lists into list of dicts."""
return [
{"name": name, "age": age}
for name, age in zip(names, ages)
]
# ✅ GOOD: Tuple unpacking
def get_user_info() -> tuple[str, int, str]:
"""Return user information as tuple."""
return "John", 30, "john@example.com"
name, age, email = get_user_info() # Unpacking
# Swap variables
a, b = b, a
# Extended unpacking
first, *middle, last = [1, 2, 3, 4, 5]
# first=1, middle=[2, 3, 4], last=5
# ✅ GOOD: Default dict values with get()
user_data = {"name": "John"}
age = user_data.get("age", 0) # Default 0 if missing
# ✅ GOOD: String joining
words = ["hello", "world"]
sentence = " ".join(words) # "hello world"
# ❌ BAD: String concatenation in loop
sentence = ""
for word in words:
sentence += word + " "
# ✅ GOOD: any() and all() for boolean checks
numbers = [1, 2, 3, 4, 5]
has_even = any(n % 2 == 0 for n in numbers)
all_positive = all(n > 0 for n in numbers)
# ❌ BAD: Manual boolean accumulation
has_even = False
for n in numbers:
if n % 2 == 0:
has_even = True
break
# ✅ GOOD: Walrus operator (Python 3.8+)
def process_data(data: list[int]) -> int:
"""Use walrus operator to avoid duplicate computation."""
if (result := compute_expensive(data)) > 100:
return result * 2
return result
# ✅ GOOD: reversed() for backward iteration
for item in reversed(items):
process(item)
# ❌ BAD: Manual reverse indexing
for i in range(len(items) - 1, -1, -1):
process(items[i])
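A related idiom to the `dict.get` default shown above is grouping with `collections.defaultdict`, which removes the key-existence check entirely; the word list here is hypothetical sample data:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict creates the empty list on first access, so no key checks needed
by_letter: defaultdict[str, list[str]] = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

The non-idiomatic alternative checks `if key not in d: d[key] = []` before every append, which `defaultdict` makes redundant.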
Anti-Patterns
Overly Complex Comprehensions
# ❌ Avoid: Unreadable nested comprehension
result = [
item.upper()
for sublist in data
for item in sublist
if item
if len(item) > 5
if item.startswith("test")
]
# ✅ Fix: Use regular loop for clarity
result = []
for sublist in data:
for item in sublist:
if item and len(item) > 5 and item.startswith("test"):
result.append(item.upper())
Not Using Context Managers
# ❌ Avoid: Manual resource cleanup
file = open("data.txt")
data = file.read()
file.close()
# ✅ Fix: Use context manager
with open("data.txt") as file:
data = file.read()
Modifying List During Iteration
# ❌ Avoid: Modifying list while iterating
items = [1, 2, 3, 4, 5]
for item in items:
if item % 2 == 0:
items.remove(item) # Bug: removal shifts later elements, so the loop skips items
# ✅ Fix: Create new list with comprehension
items = [item for item in items if item % 2 != 0]
Testing Strategies
Testing Custom Context Managers
import pytest
def test_context_manager_cleanup():
"""Test that context manager cleans up resources."""
with TemporaryDirectory() as tmpdir:
assert tmpdir.exists()
test_file = tmpdir / "test.txt"
test_file.write_text("content")
assert test_file.exists()
# After context exit, directory should be gone
assert not tmpdir.exists()
def test_context_manager_exception_handling():
"""Test cleanup happens even on exception."""
tmpdir_path = None
with pytest.raises(ValueError):
with TemporaryDirectory() as tmpdir:
tmpdir_path = tmpdir
raise ValueError("Test error")
# Cleanup should still happen
assert not tmpdir_path.exists()
Testing Decorators
def test_retry_decorator():
"""Test that decorator retries on failure."""
call_count = 0
@retry(max_attempts=3, delay=0.01)
def failing_function():
nonlocal call_count
call_count += 1
if call_count < 3:
raise ValueError("Not yet")
return "success"
result = failing_function()
assert result == "success"
assert call_count == 3
def test_retry_decorator_max_attempts():
"""Test that decorator respects max attempts."""
@retry(max_attempts=2, delay=0.01)
def always_fails():
raise ValueError("Always fails")
with pytest.raises(ValueError, match="Always fails"):
always_fails()
Testing Generators
def test_batch_iterator():
"""Test batch iterator produces correct batches."""
items = list(range(25))
batches = list(batch_iterator(items, batch_size=10))
assert len(batches) == 3
assert len(batches[0]) == 10
assert len(batches[1]) == 10
assert len(batches[2]) == 5
# Verify content
assert batches[0] == list(range(10))
assert batches[2] == list(range(20, 25))
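The coroutine-style generator from earlier can be tested the same way; this sketch restates `coroutine_accumulator` so it runs standalone:

```python
from typing import Generator


def coroutine_accumulator() -> Generator[int, int, None]:
    """Accumulate values sent into the generator."""
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value


def test_coroutine_accumulator():
    acc = coroutine_accumulator()
    assert next(acc) == 0      # priming yields the initial total
    assert acc.send(10) == 10  # each send returns the running total
    assert acc.send(5) == 15
    acc.close()                # explicit cleanup ends the generator


test_coroutine_accumulator()
print("ok")
```

Forgetting the priming `next()` call is the classic mistake here: `send()` on a just-created generator raises TypeError, which is itself worth a test.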
References
- PEP 8 - Style Guide for Python Code
- PEP 343 - The "with" Statement
- PEP 289 - Generator Expressions
- PEP 318 - Decorators for Functions and Methods
- PEP 544 - Protocols: Structural Subtyping (Static Duck Typing)
- Python Design Patterns
- Effective Python
- Fluent Python
Related Skills
- python-best-practices.md - Type hints for protocols
- async-python.md - Async context managers and generators
- python-performance.md - Generator efficiency
- python-testing.md - Testing Pythonic patterns