DynamoDB Access Patterns: The Part the Documentation Skips
In 2019 Rick Houlihan gave a talk at AWS re:Invent that has quietly become required viewing for anyone taking DynamoDB seriously. The central argument is that relational databases and key-value stores are not points on a spectrum of sophistication – they answer different questions. A relational database is designed for flexible querying against data whose access patterns you do not know in advance. DynamoDB is designed for known, fixed access patterns executed at scale with predictable latency. The schema follows from the patterns; you do not derive the patterns from the schema. If you approach DynamoDB the relational way, reaching for secondary indexes where a relational database would use a join, you pay for it in cost and latency. The talk is worth the two hours (Houlihan is a fast talker!).
The problem is that having understood this, you open boto3 and the discipline has nowhere to go.
boto3 exposes DynamoDB faithfully at the level of the service API. PK and SK values are strings you construct at the call site, at the moment of use. USER#jane#STORY#42 in the write has no formal relationship to USER#jane#STORY#42 in the read. A typo produces a silent miss. The access pattern you designed before writing the schema – the one the Houlihan model depends on – is implicit in your code rather than declared in it. It lives in a diagram or a comment, if it survives at all.
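The failure mode is easy to reproduce without touching AWS at all. A minimal sketch using a plain dict as a stand-in for the table (function and key names here are illustrative, not part of any library):

```python
# Stand-in for a DynamoDB table: keys are the composite PK strings.
table = {}

def save_story(owner, story_id, title):
    # The key pattern exists only here, at this call site.
    table[f"USER#{owner}#STORY#{story_id}"] = {"title": title}

def get_story(owner, story_id):
    # A second, independent copy of the pattern -- with a typo
    # ("STORIES" for "STORY") that nothing checks.
    return table.get(f"USER#{owner}#STORIES#{story_id}")

save_story("jane", "42", "First Post")
print(get_story("jane", "42"))  # None: a silent miss, not an error
```

Nothing fails loudly; the read simply finds nothing, which is exactly the behaviour the real service gives you.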
dynawrap gives the pattern somewhere to live.
How It Works
Each table item is a class that inherits from DBItem alongside a Pydantic BaseModel (or a dataclass).
The PK and SK patterns are declared as class variables using Python’s str.format() placeholder syntax:
```python
from dynawrap import DBItem
from pydantic import BaseModel

class Story(DBItem, BaseModel):
    pk_pattern = "USER#{owner}#STORY#{story_id}"
    sk_pattern = "STORY#{story_id}"

    schema_version: str = ""
    owner: str
    story_id: str
    title: str
    content: str = ""
```
The model is backend-agnostic. It knows nothing about DynamoDB or PostgreSQL; it only knows its own key structure and fields. The backend is constructed separately and passed the model at operation time:
```python
import boto3
from dynawrap.backends.dynamodb import DynamoDBBackend

client = boto3.client("dynamodb")  # always client, never resource
backend = DynamoDBBackend(client)

story = Story(owner="johndoe", story_id="1234", title="Test Story")
backend.save("stories", story)
result = backend.get("stories", Story, owner="johndoe", story_id="1234")
```
backend.get() returns None on a miss rather than raising. Queries return a generator:
```python
for story in backend.query("stories", Story, owner="johndoe"):
    print(story.title)

# prefix match on SK
for story in backend.query("stories", Story, owner="johndoe", story_id="12"):
    print(story.title)
```
dynawrap has no partial update. The pattern is read-modify-write, which mirrors what DynamoDB actually does well: fetch the item, modify it with model_copy(update={...}), save it back. This is a deliberate choice to keep the model clearly separate from the infrastructure rather than building a managed ORM interface that would obscure the underlying semantics.
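The loop itself is short. A minimal sketch of the read-modify-write shape, using a frozen dataclass and an in-memory dict in place of a real backend (with a Pydantic model the copy step would be story.model_copy(update={...}); the store and helper names here are illustrative):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Story:
    owner: str
    story_id: str
    title: str

store = {}  # in-memory stand-in for backend.save / backend.get

def save(item):
    store[(item.owner, item.story_id)] = item

def get(owner, story_id):
    return store.get((owner, story_id))

save(Story(owner="johndoe", story_id="1234", title="Test Story"))

# Read-modify-write: fetch the item, copy it with changes, save it back.
story = get("johndoe", "1234")
updated = replace(story, title="Revised Title")  # dataclass analogue of model_copy
save(updated)
```

The item is never mutated in place; every write is a whole-item put, which is the semantics DynamoDB's PutItem gives you anyway.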
The PostgreSQL Backend
The same model class and the same application code work against a PostgreSQL backend with no modification. Only the backend constructor changes:
```python
import psycopg2
from dynawrap.backends.postgres import PostgresBackend

conn = psycopg2.connect(dsn)
PostgresBackend.create_table(conn, "stories")  # idempotent, safe on startup
backend = PostgresBackend(conn)

# identical from here
backend.save("stories", story)
result = backend.get("stories", Story, owner="johndoe", story_id="1234")
```
PostgreSQL stores items in a fixed-schema table: pk, sk, schema_version, and data (JSONB). All model fields live in data. No migrations are required as models evolve, because the table structure is the same regardless of what the model contains.
This has two practical consequences. The first is local development: running DynamoDB locally requires either Docker with the DynamoDB Local image or AWS credentials pointed at a real table. PostgreSQL has none of those requirements. The backend switch is a one-line configuration change; the application code does not move.
The second is that PostgreSQL’s query capabilities are not surrendered. JSONB operators, aggregation, analytical queries, joins against other tables – these are available against the same data that DynamoDB serves in production. A system that writes via the dynawrap interface can be queried analytically via PostgreSQL directly, without an ETL step. The access pattern discipline DynamoDB requires does not prevent OLAP-style usage of the underlying data; it just moves it to the right tool.
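To make that concrete without a running PostgreSQL instance, here is the same idea sketched with the stdlib sqlite3 module and its JSON functions, against the pk/sk/data layout described above. Against the real backend the query would use JSONB operators such as data->>'owner' rather than json_extract; the table contents are invented for illustration:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (pk TEXT, sk TEXT, data TEXT)")

# Items as the fixed-schema backend would store them: keys plus a JSON blob.
items = [
    ("USER#jane#STORY#1", "STORY#1", {"owner": "jane", "title": "A"}),
    ("USER#jane#STORY#2", "STORY#2", {"owner": "jane", "title": "B"}),
    ("USER#bob#STORY#1", "STORY#1", {"owner": "bob", "title": "C"}),
]
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [(pk, sk, json.dumps(d)) for pk, sk, d in items],
)

# An aggregate over a model field -- the kind of query DynamoDB cannot
# express directly.  In PostgreSQL: SELECT data->>'owner', count(*) ...
rows = conn.execute(
    "SELECT json_extract(data, '$.owner') AS owner, COUNT(*) "
    "FROM stories GROUP BY owner ORDER BY owner"
).fetchall()
print(rows)  # [('bob', 1), ('jane', 2)]
```

The application never stops writing through the key-value interface; the relational query runs against the same rows on the side.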
DynamoDB Streams
DynamoDBBackend.from_stream_record() constructs a typed model instance from a DynamoDB stream event record, deserialising the wire format and validating against the class's PK/SK pattern. It raises ValueError if the record does not match the class, which makes it safe to call on mixed-type streams without prior branching:
```python
def lambda_handler(event, context):
    client = boto3.client("dynamodb")
    backend = DynamoDBBackend(client)
    for record in event["Records"]:
        try:
            obj = backend.from_stream_record(record, Story)
            if record["eventName"] == "INSERT":
                notify_subscribers(obj)
        except ValueError:
            pass  # record belongs to a different model class
```
What Changes in Practice
The class definition becomes the authoritative record of the table’s access patterns. A new engineer reads the class and knows the key structure, the field requirements, and the placeholder values – without tracing through call sites. Pattern errors surface at the first save attempt with a missing field rather than as silent misses during query. Schema versioning is automatic: dynawrap computes a hash of the pattern and field names and stores it on every item, which simplifies identifying items written by an older model version when access patterns evolve.
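The version hash is described as a hash of the pattern and field names. As a sketch of how such a fingerprint could be computed (illustrative only – this is not dynawrap's actual algorithm, and the helper name is invented):

```python
import hashlib

def schema_fingerprint(pk_pattern, sk_pattern, field_names):
    # Any change to the key patterns or the field set yields a new
    # version string; sorting makes the result order-independent.
    payload = "|".join([pk_pattern, sk_pattern, *sorted(field_names)])
    return hashlib.sha256(payload.encode()).hexdigest()[:8]

v1 = schema_fingerprint("USER#{owner}#STORY#{story_id}", "STORY#{story_id}",
                        ["owner", "story_id", "title"])
v2 = schema_fingerprint("USER#{owner}#STORY#{story_id}", "STORY#{story_id}",
                        ["owner", "story_id", "title", "content"])
assert v1 != v2  # adding a field changes the version
```

Stamping that value on every item is what lets a later query separate items written under the old model shape from items written under the new one.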
The PostgreSQL backend removes the last practical reason to avoid the pattern in early development. A project can begin against a local PostgreSQL database, move to DynamoDB when AWS infrastructure is in place, and retain full analytical access to the data via PostgreSQL at any point in between.
(The source is at github.com/bayinfosys/aws-dynamodb-wrapper and the package at https://pypi.org/project/dynawrap. If DynamoDB access pattern design is a live problem in a project you are building, get in touch.)