Pythonic Patterns
Python patterns v1.0.0
Pythonic patterns represent idiomatic Python code that leverages language features for clarity, efficiency, and maintainability. This skill covers list comprehensions for concise data transformation, generators for memory-efficient iteration, context managers for resource management, decorators for cross-cutting concerns, and duck typing for flexible interfaces. Mastering these patterns enables writing elegant, performant Python code that follows community conventions and best practices.
Key Concepts
- List Comprehensions - Concise syntax for creating lists from iterables with optional filtering
- Generator Expressions - Memory-efficient lazy evaluation for large sequences
- Generator Functions - Functions using yield to produce values on-demand
- Context Managers - Protocol for resource acquisition and cleanup using the with statement
- Decorators - Functions that modify or enhance other functions without changing their code
- Duck Typing - Type compatibility based on behavior (“if it walks like a duck…”)
- Protocols - Structural subtyping for defining interfaces without inheritance
- Iterator Protocol - Implementing __iter__ and __next__ for custom iteration
- EAFP vs LBYL - "Easier to ask forgiveness than permission" vs "Look before you leap"
- Pythonic Idioms - Community-accepted patterns like enumerate, zip, and unpacking
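The iterator protocol listed above is the one concept not demonstrated later in this skill, so here is a minimal sketch; `Countdown` and `CountdownIterator` are hypothetical names chosen for illustration:

```python
class CountdownIterator:
    """Iterator: implements both __iter__ and __next__."""

    def __init__(self, current: int):
        self.current = current

    def __iter__(self) -> "CountdownIterator":
        # Iterators return themselves so they work directly in for-loops
        return self

    def __next__(self) -> int:
        if self.current <= 0:
            raise StopIteration  # Signals end of iteration
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable: returns a fresh iterator each time, so it can be looped twice."""

    def __init__(self, start: int):
        self.start = start

    def __iter__(self) -> CountdownIterator:
        return CountdownIterator(self.start)


print(list(Countdown(3)))  # [3, 2, 1]
```

Separating the iterable from the iterator is what lets `for` loops, `list()`, and unpacking all consume the same object repeatedly.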
Best Practices
- Prefer comprehensions over map/filter - List/dict/set comprehensions are more readable than map/filter chains
- Use generators for large sequences - Replace list comprehensions with generators for memory efficiency
- Always use context managers - Use the with statement for files, locks, and resources to ensure cleanup
- Decorate for cross-cutting concerns - Apply decorators for logging, timing, caching, and validation
- Embrace duck typing - Focus on object capabilities rather than explicit type checks
- Use enumerate() and zip() - Avoid manual indexing; use built-in iteration helpers
- Unpack sequences - Use tuple unpacking for multiple assignment and function returns
- Prefer EAFP over LBYL - Try operations and handle exceptions rather than checking preconditions
- Write generator functions - Use yield to stream data and avoid building large intermediate lists
- Define protocols for interfaces - Use typing.Protocol for structural typing without inheritance
Code Examples
List Comprehensions and Generator Expressions
# ✅ GOOD: List comprehension for data transformation
from typing import Sequence
def process_users(users: Sequence[dict]) -> list[dict]:
"""Transform user data using list comprehension."""
return [
{
"id": user["id"],
"name": user["name"].upper(),
"email": user["email"].lower(),
"active": user.get("is_active", True)
}
for user in users
if user.get("verified", False)
]
# Dictionary comprehension
def create_user_index(users: list[dict]) -> dict[int, str]:
"""Create ID to name mapping."""
return {
user["id"]: user["name"]
for user in users
}
# Set comprehension
def extract_unique_domains(emails: list[str]) -> set[str]:
"""Extract unique email domains."""
return {
email.split("@")[1]
for email in emails
if "@" in email
}
# Generator expression for memory efficiency
def sum_large_sequence(n: int) -> int:
"""Sum large sequence without creating list."""
return sum(x**2 for x in range(n))
# Constant memory vs list comprehension's O(n) memory
# Nested comprehension
def flatten_matrix(matrix: list[list[int]]) -> list[int]:
"""Flatten 2D matrix to 1D list."""
return [
item
for row in matrix
for item in row
]
# ❌ BAD: Verbose loops instead of comprehensions
def process_users_verbose(users: list[dict]) -> list[dict]:
result = []
for user in users:
if user.get("verified", False):
processed = {
"id": user["id"],
"name": user["name"].upper(),
"email": user["email"].lower(),
"active": user.get("is_active", True)
}
result.append(processed)
return result
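The memory claim above can be made concrete with `sys.getsizeof`; the exact byte counts vary by interpreter version, so the comments below only state the order of magnitude:

```python
import sys

# A list comprehension materializes every element up front...
squares_list = [x**2 for x in range(1_000_000)]
# ...while a generator expression stores only its iteration state.
squares_gen = (x**2 for x in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes
print(sys.getsizeof(squares_gen))   # a few hundred bytes, regardless of range size

# Consuming the generator yields the same values without the upfront cost
assert sum(squares_gen) == sum(squares_list)
```

Note that a generator can only be consumed once; if you need to iterate twice, either rebuild it or materialize a list deliberately.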
Generator Functions for Lazy Evaluation
# ✅ GOOD: Generator function for streaming data
from typing import Iterator, TypeVar
from collections.abc import Iterable
T = TypeVar('T')
def batch_iterator(
items: Iterable[T],
batch_size: int
) -> Iterator[list[T]]:
"""
Yield items in batches of specified size.
Memory efficient: only holds one batch in memory at a time.
Args:
items: Iterable of items to batch
batch_size: Number of items per batch
Yields:
Lists of items with length up to batch_size
Example:
>>> for batch in batch_iterator(range(100), batch_size=10):
... process_batch(batch)
"""
batch = []
for item in items:
batch.append(item)
if len(batch) == batch_size:
yield batch
batch = []
# Yield remaining items
if batch:
yield batch
# Infinite generator
def fibonacci_generator() -> Iterator[int]:
"""
Generate Fibonacci sequence indefinitely.
Yields:
Next Fibonacci number
Example:
>>> fib = fibonacci_generator()
>>> [next(fib) for _ in range(10)]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
"""
a, b = 0, 1
while True:
yield a
a, b = b, a + b
# Generator with state and send()
from typing import Generator

def coroutine_accumulator() -> Generator[int, int, None]:
"""
Coroutine that accumulates sent values.
Yields:
Running total after each send
Example:
>>> acc = coroutine_accumulator()
>>> next(acc) # Prime the coroutine
0
>>> acc.send(10)
10
>>> acc.send(5)
15
"""
total = 0
while True:
value = yield total
if value is not None:
total += value
# Generator for file processing
def read_large_file(file_path: str, chunk_size: int = 8192) -> Iterator[str]:
"""
Read large file in chunks.
Memory efficient: doesn't load entire file into memory.
Args:
file_path: Path to file
chunk_size: Characters to read per iteration (file is opened in text mode)
Yields:
File chunks as strings
"""
with open(file_path, 'r') as file:
while True:
chunk = file.read(chunk_size)
if not chunk:
break
yield chunk
# Pipeline with generators
def process_data_pipeline(file_path: str) -> Iterator[dict]:
"""
Process data using generator pipeline.
Each stage yields results one at a time, avoiding memory overhead.
"""
# Stage 1: Read lines
def read_lines():
with open(file_path) as f:
for line in f:
yield line.strip()
# Stage 2: Parse JSON
import json
def parse_json(lines):
for line in lines:
if line:
yield json.loads(line)
# Stage 3: Filter and transform
def filter_transform(records):
for record in records:
if record.get("active"):
yield {
"id": record["id"],
"value": record["value"] * 2
}
# Chain generators
lines = read_lines()
records = parse_json(lines)
return filter_transform(records)
# ❌ BAD: Loading entire file into memory
def process_large_file_bad(file_path: str) -> list[dict]:
with open(file_path) as f:
lines = f.readlines() # Loads entire file into memory
records = [json.loads(line) for line in lines if line]
filtered = [r for r in records if r.get("active")]
return [{"id": r["id"], "value": r["value"] * 2} for r in filtered]
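Infinite generators like `fibonacci_generator` pair naturally with `itertools.islice`, which takes a bounded slice without materializing the stream; the generator is restated here so the sketch is self-contained:

```python
from itertools import islice
from typing import Iterator


def fibonacci_generator() -> Iterator[int]:
    """Generate the Fibonacci sequence indefinitely."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice lazily pulls only the first n values from the infinite stream
first_ten = list(islice(fibonacci_generator(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The same technique bounds any generator pipeline, e.g. sampling the first hundred records of `process_data_pipeline` during debugging.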
Context Managers for Resource Management
# ✅ GOOD: Custom context manager using class
from typing import Any, Iterator
from pathlib import Path
import tempfile
class TemporaryDirectory:
"""
Context manager for temporary directory.
Creates directory on enter, removes on exit.
Example:
with TemporaryDirectory() as tmpdir:
file_path = tmpdir / "data.txt"
file_path.write_text("content")
# Directory automatically cleaned up
"""
def __init__(self, prefix: str = "tmp"):
self.prefix = prefix
self.path: Path | None = None
def __enter__(self) -> Path:
"""Create temporary directory."""
self.path = Path(tempfile.mkdtemp(prefix=self.prefix))
return self.path
def __exit__(self, exc_type, exc_val, exc_tb) -> None:
"""Remove temporary directory."""
if self.path and self.path.exists():
import shutil
shutil.rmtree(self.path)
# Context manager using contextlib
from contextlib import contextmanager
import time
@contextmanager
def timing_context(operation_name: str) -> Iterator[None]:
"""
Context manager for timing operations.
Args:
operation_name: Name for timing report
Yields:
None
Example:
with timing_context("data processing"):
process_large_dataset()
# Prints: "data processing took 1.23s"
"""
start = time.perf_counter()
try:
yield
finally:
duration = time.perf_counter() - start
print(f"{operation_name} took {duration:.2f}s")
# Database transaction context manager
@contextmanager
def database_transaction(connection: Any) -> Iterator[Any]:
"""
Context manager for database transactions.
Commits on success, rolls back on exception.
Args:
connection: Database connection
Yields:
Connection object
Example:
with database_transaction(conn) as cursor:
cursor.execute("INSERT INTO users VALUES (?)", (user_data,))
# Automatically commits
"""
cursor = connection.cursor()
try:
yield cursor
connection.commit()
except Exception:
connection.rollback()
raise
finally:
cursor.close()
# Async context manager
from typing import AsyncIterator
import aiohttp
class AsyncAPIClient:
"""Async context manager for HTTP client."""
def __init__(self, base_url: str):
self.base_url = base_url
self.session: aiohttp.ClientSession | None = None
async def __aenter__(self) -> "AsyncAPIClient":
"""Create session on enter."""
self.session = aiohttp.ClientSession(base_url=self.base_url)
return self
async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
"""Close session on exit."""
if self.session:
await self.session.close()
async def get(self, path: str) -> dict:
"""Make GET request."""
if not self.session:
raise RuntimeError("Session not initialized")
async with self.session.get(path) as response:
return await response.json()
# Usage with async with
async def fetch_data():
async with AsyncAPIClient("https://api.example.com") as client:
data = await client.get("/users")
return data
# Session automatically closed
# ❌ BAD: Manual resource management
def process_file_bad(file_path: str) -> str:
file = open(file_path)
content = file.read()
file.close() # Forgotten if exception occurs
return content
# ✅ Fix: Always use context manager
def process_file_good(file_path: str) -> str:
with open(file_path) as file:
return file.read()
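When the number of resources is only known at runtime, `contextlib.ExitStack` generalizes the with statement; this sketch uses in-memory `StringIO` objects as stand-ins for real files:

```python
from contextlib import ExitStack
from io import StringIO


def read_all(sources: list[StringIO]) -> list[str]:
    """Read many file-like resources; all are closed even if one read fails."""
    with ExitStack() as stack:
        # enter_context registers each resource for cleanup on exit
        handles = [stack.enter_context(s) for s in sources]
        return [h.read() for h in handles]


sources = [StringIO("alpha"), StringIO("beta")]
print(read_all(sources))                # ['alpha', 'beta']
print(all(s.closed for s in sources))   # True: ExitStack closed every handle
```

Without `ExitStack`, nesting a variable number of `with` statements would require recursion or manual try/finally bookkeeping.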
Decorators for Cross-Cutting Concerns
# ✅ GOOD: Function decorator with arguments
from functools import wraps
from typing import Callable, TypeVar, ParamSpec, Any
import time
import logging
P = ParamSpec('P')
T = TypeVar('T')
def retry(
max_attempts: int = 3,
delay: float = 1.0,
exponential_backoff: bool = True
) -> Callable[[Callable[P, T]], Callable[P, T]]:
"""
Decorator to retry function on exception.
Args:
max_attempts: Maximum number of retry attempts
delay: Initial delay between retries in seconds
exponential_backoff: Double delay after each retry
Returns:
Decorated function with retry logic
Example:
@retry(max_attempts=5, delay=2.0)
def fetch_data():
return requests.get("https://api.example.com/data")
"""
def decorator(func: Callable[P, T]) -> Callable[P, T]:
@wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
current_delay = delay
last_exception: Exception | None = None
for attempt in range(max_attempts):
try:
return func(*args, **kwargs)
except Exception as e:
last_exception = e
if attempt < max_attempts - 1:
time.sleep(current_delay)
if exponential_backoff:
current_delay *= 2
continue
raise
raise last_exception or RuntimeError("All retries failed")
return wrapper
return decorator
# Logging decorator
def log_execution(
logger: logging.Logger | None = None,
log_args: bool = True
) -> Callable[[Callable[P, T]], Callable[P, T]]:
"""
Decorator to log function execution.
Args:
logger: Logger instance (uses default if None)
log_args: Whether to log function arguments
Example:
@log_execution(logger=my_logger, log_args=True)
def process_data(data: list[int]) -> int:
return sum(data)
"""
if logger is None:
logger = logging.getLogger(__name__)
def decorator(func: Callable[P, T]) -> Callable[P, T]:
@wraps(func)
def wrapper(*args: P.args, **kwargs: P.kwargs) -> T:
func_name = func.__name__
if log_args:
logger.info(f"Calling {func_name} with args={args}, kwargs={kwargs}")
else:
logger.info(f"Calling {func_name}")
try:
result = func(*args, **kwargs)
logger.info(f"{func_name} completed successfully")
return result
except Exception as e:
logger.error(f"{func_name} failed with {type(e).__name__}: {e}")
raise
return wrapper
return decorator
# Class decorator
def singleton(cls: type[T]) -> type[T]:
"""
Decorator to make class a singleton.
Example:
@singleton
class Database:
def __init__(self):
self.connection = create_connection()
db1 = Database()
db2 = Database()
assert db1 is db2 # Same instance
"""
instances: dict[type, Any] = {}
@wraps(cls)
def get_instance(*args: Any, **kwargs: Any) -> T:
if cls not in instances:
instances[cls] = cls(*args, **kwargs)
return instances[cls]
return get_instance # type: ignore
# Property decorator for validation
class User:
"""User with validated email property."""
def __init__(self, email: str):
self._email = ""
self.email = email # Triggers validation
@property
def email(self) -> str:
"""Get email address."""
return self._email
@email.setter
def email(self, value: str) -> None:
"""Set email with validation."""
if "@" not in value:
raise ValueError("Invalid email address")
self._email = value.lower()
# Stacking decorators
@retry(max_attempts=3)
@log_execution(log_args=False)
def fetch_remote_data(url: str) -> dict:
"""
Fetch data with retry and logging.
Decorators are applied bottom-to-top:
1. log_execution wraps fetch_remote_data
2. retry wraps the result of (1)
"""
import requests
response = requests.get(url)
response.raise_for_status()
return response.json()
# ❌ BAD: Duplicate cross-cutting code
def fetch_data1(url: str) -> dict:
logger.info(f"Fetching {url}")
for attempt in range(3):
try:
response = requests.get(url)
response.raise_for_status()
return response.json()
except Exception as e:
if attempt == 2:
raise
time.sleep(1)
def fetch_data2(url: str) -> dict:
logger.info(f"Fetching {url}")
for attempt in range(3): # Duplicated retry logic
try:
response = requests.get(url)
response.raise_for_status()
return response.json()
except Exception as e:
if attempt == 2:
raise
time.sleep(1)
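For the caching concern mentioned in the best practices, the standard library already provides a decorator, `functools.lru_cache`; the counter below exists only to demonstrate that memoization avoids recomputation:

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """Naive recursive Fibonacci, made linear-time by memoizing results."""
    global call_count
    call_count += 1
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(30))       # 832040
print(call_count)    # 31: each distinct n is computed exactly once
print(fib.cache_info())  # hit/miss statistics maintained by the decorator
```

Prefer `lru_cache` (or `functools.cache` on Python 3.9+) over hand-rolled memoization decorators for pure functions with hashable arguments.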
Duck Typing and Protocols
# ✅ GOOD: Protocol for structural typing
from typing import Protocol, runtime_checkable
from collections.abc import Iterable
@runtime_checkable
class Drawable(Protocol):
"""
Protocol for drawable objects.
Any class with draw() method satisfies this protocol.
No inheritance required.
"""
def draw(self) -> None:
"""Draw the object."""
...
class Circle:
"""Circle class - no inheritance from Drawable."""
def __init__(self, radius: float):
self.radius = radius
def draw(self) -> None:
"""Draw circle."""
print(f"Drawing circle with radius {self.radius}")
class Rectangle:
"""Rectangle class - also no inheritance."""
def __init__(self, width: float, height: float):
self.width = width
self.height = height
def draw(self) -> None:
"""Draw rectangle."""
print(f"Drawing rectangle {self.width}x{self.height}")
def render_shapes(shapes: Iterable[Drawable]) -> None:
"""
Render all shapes.
Works with any object that has draw() method.
Type checker validates protocol compliance.
"""
for shape in shapes:
shape.draw()
# Usage - type checker validates
shapes: list[Drawable] = [
Circle(5.0),
Rectangle(10.0, 20.0)
]
render_shapes(shapes)
# EAFP (Easier to Ask Forgiveness than Permission)
def get_value_eafp(data: dict, key: str, default: Any = None) -> Any:
"""
Get value from dict using EAFP.
Pythonic: try first, handle exception if it fails.
"""
try:
return data[key]
except KeyError:
return default
# ❌ BAD: LBYL (Look Before You Leap) - not Pythonic
def get_value_lbyl(data: dict, key: str, default: Any = None) -> Any:
"""Non-Pythonic: check before accessing."""
if key in data:
return data[key]
else:
return default
# EAFP is faster when key usually exists (one dict lookup vs two)
# Duck typing with hasattr
def process_any_iterable(data: Any) -> list:
"""
Process any iterable object.
Uses duck typing: if it's iterable, use it.
"""
try:
return list(data)
except TypeError:
# Not iterable, wrap in list
return [data]
# Protocol for file-like objects
@runtime_checkable
class FileLike(Protocol):
"""Protocol for file-like objects."""
def read(self, size: int = -1) -> str:
"""Read from file."""
...
def write(self, data: str) -> int:
"""Write to file."""
...
def process_file(file: FileLike) -> str:
"""
Process any file-like object.
Works with real files, StringIO, custom implementations.
"""
content = file.read()
processed = content.upper()
file.write(processed)
return processed
# ❌ BAD: Explicit type checking instead of duck typing
def render_shapes_bad(shapes: list) -> None:
for shape in shapes:
if isinstance(shape, (Circle, Rectangle)): # Rigid type checking
shape.draw()
else:
raise TypeError(f"Invalid shape type: {type(shape)}")
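Because `Drawable` above is decorated with `@runtime_checkable`, `isinstance` checks become structural: they test for the method's presence, not for inheritance. A self-contained sketch (restating the protocol):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> None: ...


class Circle:
    def draw(self) -> None:
        print("circle")


class Point:
    pass  # no draw() method


# isinstance checks for a draw() method, not for a base class
print(isinstance(Circle(), Drawable))  # True
print(isinstance(Point(), Drawable))   # False
```

Note that runtime protocol checks only verify that the attribute exists, not its signature; full signature checking is the job of the static type checker.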
Pythonic Idioms
# ✅ GOOD: enumerate() instead of manual indexing
def process_with_index(items: list[str]) -> None:
"""Process items with their indices."""
for index, item in enumerate(items, start=1):
print(f"{index}. {item}")
# ❌ BAD: Manual indexing
def process_with_index_bad(items: list[str]) -> None:
for i in range(len(items)):
print(f"{i + 1}. {items[i]}")
# ✅ GOOD: zip() for parallel iteration
def merge_lists(names: list[str], ages: list[int]) -> list[dict]:
"""Merge two lists into list of dicts."""
return [
{"name": name, "age": age}
for name, age in zip(names, ages)
]
# ✅ GOOD: Tuple unpacking
def get_user_info() -> tuple[str, int, str]:
"""Return user information as tuple."""
return "John", 30, "john@example.com"
name, age, email = get_user_info() # Unpacking
# Swap variables
a, b = b, a
# Extended unpacking
first, *middle, last = [1, 2, 3, 4, 5]
# first=1, middle=[2, 3, 4], last=5
# ✅ GOOD: Default dict values with get()
user_data = {"name": "John"}
age = user_data.get("age", 0) # Default 0 if missing
# ✅ GOOD: String joining
words = ["hello", "world"]
sentence = " ".join(words) # "hello world"
# ❌ BAD: String concatenation in loop
sentence = ""
for word in words:
sentence += word + " "
# ✅ GOOD: any() and all() for boolean checks
numbers = [1, 2, 3, 4, 5]
has_even = any(n % 2 == 0 for n in numbers)
all_positive = all(n > 0 for n in numbers)
# ❌ BAD: Manual boolean accumulation
has_even = False
for n in numbers:
if n % 2 == 0:
has_even = True
break
# ✅ GOOD: Walrus operator (Python 3.8+)
def process_data(data: list[int]) -> int:
"""Use walrus operator to avoid duplicate computation."""
if (result := compute_expensive(data)) > 100:
return result * 2
return result
# ✅ GOOD: reversed() for backward iteration
for item in reversed(items):
process(item)
# ❌ BAD: Manual reverse indexing
for i in range(len(items) - 1, -1, -1):
process(items[i])
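A related idiom to the `dict.get` default shown above is grouping with `collections.defaultdict`, which removes the key-existence check entirely; the word list here is hypothetical sample data:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict creates the empty list on first access, so no key checks needed
by_letter: defaultdict[str, list[str]] = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

The non-idiomatic alternative checks `if key not in d: d[key] = []` before every append, which `defaultdict` makes redundant.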
Anti-Patterns
Overly Complex Comprehensions
# ❌ Avoid: Unreadable nested comprehension
result = [
item.upper()
for sublist in data
for item in sublist
if item
if len(item) > 5
if item.startswith("test")
]
# ✅ Fix: Use regular loop for clarity
result = []
for sublist in data:
for item in sublist:
if item and len(item) > 5 and item.startswith("test"):
result.append(item.upper())
Not Using Context Managers
# ❌ Avoid: Manual resource cleanup
file = open("data.txt")
data = file.read()
file.close()
# ✅ Fix: Use context manager
with open("data.txt") as file:
data = file.read()
Modifying List During Iteration
# ❌ Avoid: Modifying list while iterating
items = [1, 2, 3, 4, 5]
for item in items:
if item % 2 == 0:
items.remove(item) # Bug: removal shifts later elements, so the loop skips items
# ✅ Fix: Create new list with comprehension
items = [item for item in items if item % 2 != 0]
Testing Strategies
Testing Custom Context Managers
import pytest
def test_context_manager_cleanup():
"""Test that context manager cleans up resources."""
with TemporaryDirectory() as tmpdir:
assert tmpdir.exists()
test_file = tmpdir / "test.txt"
test_file.write_text("content")
assert test_file.exists()
# After context exit, directory should be gone
assert not tmpdir.exists()
def test_context_manager_exception_handling():
"""Test cleanup happens even on exception."""
tmpdir_path = None
with pytest.raises(ValueError):
with TemporaryDirectory() as tmpdir:
tmpdir_path = tmpdir
raise ValueError("Test error")
# Cleanup should still happen
assert not tmpdir_path.exists()
Testing Decorators
def test_retry_decorator():
"""Test that decorator retries on failure."""
call_count = 0
@retry(max_attempts=3, delay=0.01)
def failing_function():
nonlocal call_count
call_count += 1
if call_count < 3:
raise ValueError("Not yet")
return "success"
result = failing_function()
assert result == "success"
assert call_count == 3
def test_retry_decorator_max_attempts():
"""Test that decorator respects max attempts."""
@retry(max_attempts=2, delay=0.01)
def always_fails():
raise ValueError("Always fails")
with pytest.raises(ValueError, match="Always fails"):
always_fails()
Testing Generators
def test_batch_iterator():
"""Test batch iterator produces correct batches."""
items = list(range(25))
batches = list(batch_iterator(items, batch_size=10))
assert len(batches) == 3
assert len(batches[0]) == 10
assert len(batches[1]) == 10
assert len(batches[2]) == 5
# Verify content
assert batches[0] == list(range(10))
assert batches[2] == list(range(20, 25))
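The coroutine-style generator from earlier can be tested the same way; this sketch restates `coroutine_accumulator` so it runs standalone:

```python
from typing import Generator


def coroutine_accumulator() -> Generator[int, int, None]:
    """Accumulate values sent into the generator."""
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value


def test_coroutine_accumulator():
    acc = coroutine_accumulator()
    assert next(acc) == 0      # priming yields the initial total
    assert acc.send(10) == 10  # each send returns the running total
    assert acc.send(5) == 15
    acc.close()                # explicit cleanup ends the generator


test_coroutine_accumulator()
print("ok")
```

Forgetting the priming `next()` call is the classic mistake here: `send()` on a just-created generator raises TypeError, which is itself worth a test.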
References
- PEP 8 - Style Guide for Python Code
- PEP 343 - The "with" Statement
- PEP 289 - Generator Expressions
- PEP 318 - Decorators for Functions and Methods
- PEP 544 - Protocols: Structural Subtyping (Static Duck Typing)
- Python Design Patterns
- Effective Python
- Fluent Python
Related Skills
- python-best-practices.md - Type hints for protocols
- async-python.md - Async context managers and generators
- python-performance.md - Generator efficiency
- python-testing.md - Testing Pythonic patterns