Read Time Upcasting - Never Migrate your Deebs Again
Aaron Sullivan
May 14 2025

Read time upcasting is the practice of transforming legacy data from the database at runtime. This is an expansion of upcasting from the event sourcing pattern, applied to non-event-sourced, traditional databases.
Why?
Short answer: Data safety, application downtime, and coordinated deploys.
Migrations come with headaches:
Data Safety
Traditional batch migrations operate on large sets of data in one go, which creates a risk of partial migration if the process fails midway. This can lead to data inconsistency, leaving some records migrated and others not. Recovering from such a state often requires complex rollback or manual fix-up, increasing the chance of that one manual keystroke that brings down prod, and the late Friday night “pizza party” that ensues.
Application Downtime
Batch migrations frequently require taking the application offline to avoid data corruption or inconsistent reads or writes while migrations run. This downtime can impact users and business operations, sometimes for extended periods during complex migrations.
Coupled Migrations with Deploys
This is why migrations come with a down() and an up() - because sometimes deploys go wrong. Whoops, shipped a bug, and we need to roll back the deploy. Well, that means you need to un-migrate the database, which could cause data loss, and definitely causes headaches - possibly with nausea as well.
What’s it look like?
Traditional Batch Migration
Imagine a banking system needing to add a new boolean field is_active to an Account model to track whether accounts are active or delinquent.
# App code
from pydantic import BaseModel

# Originally, our account looks like this:
class Account(BaseModel):
    account_id: str
    owner_name: str
    balance: float

# Some requirements change, and we need to mark delinquent accounts:
class Account(BaseModel):
    account_id: str
    owner_name: str
    balance: float
    is_active: bool  # <-- new guy

# Assuming our database looks something like this
accounts = [
    {"account_id": "abc123", "owner_name": "John Doe", "balance": 1000.0},
    {"account_id": "xyz789", "owner_name": "Jane Smith", "balance": 2500.0},
]

# We'd need some kind of migration that looks like this
def up():
    for acc_data in db.get_accounts():
        acc_data["is_active"] = True
        db.save_account(acc_data)

def down():
    for acc_data in db.get_accounts():
        del acc_data["is_active"]
        db.save_account(acc_data)

# Now we have to coordinate running this migration with the deploy of the new app code
Read Time Upcasting
Instead of migrating all accounts at once, upcasting transforms accounts on-the-fly at read time.
from pydantic import BaseModel
from typing import Literal

# Define both old and new schemas
class AccountV1(BaseModel):
    version: Literal[1] = 1
    account_id: str
    owner_name: str
    balance: float

class AccountV2(BaseModel):
    version: Literal[2] = 2
    account_id: str
    owner_name: str
    balance: float
    is_active: bool

# Define a transition from old to new
def account_v1_to_v2(account: AccountV1) -> AccountV2:
    acc = account.model_dump()
    acc["version"] = 2
    acc["is_active"] = True  # existing accounts default to active
    return AccountV2(**acc)

# Now register this transition with a library which handles identifying old
# domain models and turning them into the new ones. This is inspired by the
# event sourcing upcasting pattern, and pypi's `eventsourcing` library does this.
# In pseudo code:
# 1) Check what version is in the db
# 2) Find the path from that version to the newest within the registered transitions
# 3) Put the data through that path
# 4) Never migrate again
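A hand-rolled version of that registry could look something like the sketch below. TRANSITIONS, MODELS, LATEST_VERSION, and load_account are illustrative names for this post, not the `eventsourcing` library's API:

from typing import Callable, Union

AnyAccount = Union[AccountV1, AccountV2]

# Registered transitions: stored version -> function that lifts it one version forward
TRANSITIONS: dict[int, Callable[[AnyAccount], AnyAccount]] = {
    1: account_v1_to_v2,
}
MODELS = {1: AccountV1, 2: AccountV2}
LATEST_VERSION = 2

def load_account(raw: dict) -> AccountV2:
    # 1) Check what version is in the db (rows written before versioning default to 1)
    version = raw.get("version", 1)
    account = MODELS[version](**raw)
    # 2 & 3) Walk the registered transitions until we reach the newest version
    while version < LATEST_VERSION:
        account = TRANSITIONS[version](account)
        version = account.version
    # 4) Never migrate again
    return account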
Now the old data in your database gets loaded into the new form. With this, you have no migrations, no coordinated deploys, and no downtime. The tradeoff is that you have to keep all versions of Account (although you never touch the old ones again, they’re frozen in time), and the transitions between them.
Common Pitfalls & How to Avoid Them
Working with read-time upcasting can be sweet, but watch out for these common traps:
- Skipping version and transition discipline: Without strict version tracking and well-registered transitions, your data migration will be a house of cards. Maintain discipline.
- Over-engineering transitions: Don’t make your transition functions do everything at once. Keep changes small and incremental.
- Ignoring performance impact: Every migration on read adds latency. Use caching wisely and measure your own tolerance.
- Neglecting testing: This can’t be stressed enough. Untested migrations are an invitation for production surprise parties.
- Forgetting rollback plans: Even though upcasting avoids batch migration rollbacks, you might still need to handle corrupt or malformed data gracefully.
- Leaving legacy code to rot: Keep old versions around only as long as needed. Clean up after yourself.
Keep these pitfalls in mind and you’ll be able to stick to your pizza-free diet.
Caching
Caching is the secret sauce that makes read-time upcasting viable at scale. Since the primary drawback of transforming data on every read is the runtime cost, caching helps flatten that curve.
Why Cache?
Every upcasted transformation on read is like a mini computation migraine. Without caching, your app repeats the same upgrade dance for that same record every time it’s accessed.
What to Cache
- In-memory Objects: Cache the fully deserialized and migrated objects inside your app memory when practical.
- Serialized Data: Cache the migrated payloads in fast-access stores like Redis or Memcached to avoid repeated DB reads and transformations.
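As a rough sketch, the serialized flavor might look like this with Redis, assuming the redis-py client and a hypothetical db.get_account helper; load_account and LATEST_VERSION come from the registry sketch above:

import json
import redis

r = redis.Redis()
CACHE_TTL_SECONDS = 300  # tune to how volatile your account data is

def get_account(account_id: str) -> AccountV2:
    # Keying by schema version means a new version naturally misses old cache entries
    cache_key = f"account:v{LATEST_VERSION}:{account_id}"
    cached = r.get(cache_key)
    if cached is not None:
        # Cache hit: the payload was cached already upcasted, just deserialize it
        return AccountV2(**json.loads(cached))
    # Cache miss: read the raw row, run the upcasting chain, then cache the result
    account = load_account(db.get_account(account_id))
    r.set(cache_key, account.model_dump_json(), ex=CACHE_TTL_SECONDS)
    return account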
Cache Invalidation
A strong cache is only as good as its invalidation strategy:
- Evict or refresh cache entries when underlying data changes
- Use TTL (time to live) judiciously based on data volatility
- For writes, update the cache immediately after the database save to keep them in sync
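On the write path, a simple write-through keeps the cache and database in step (same hypothetical db helper and cache key scheme as above); evicting with r.delete(cache_key) instead of refreshing works too:

def save_account(account: AccountV2) -> None:
    # Persist first, then refresh the cache entry so readers never see a stale copy
    db.save_account(account.model_dump())
    cache_key = f"account:v{LATEST_VERSION}:{account.account_id}"
    r.set(cache_key, account.model_dump_json(), ex=CACHE_TTL_SECONDS)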
Pitfalls & Gotchas
- Stale caches can serve old data. Always balance between fresh and fast.
- Cache complexity can hide migration issues, so monitor cache hit/miss ratios closely.
Alternatives
Let’s talk about other ways to do this.
Traditional style
As outlined above, you can define migrations and then run them all at once against your whole database. This might be fine for a small app, but for a large, legacy application it can be difficult.
Pros:
- Simple and well understood pattern
- Good tooling in many frameworks and databases
- Schema is consistent after migration completes
Cons:
- Risk of partial migration failures and inconsistent data
- Downtime or complex coordination needed during migration
- Rollbacks and downgrades are risky and complex
Incremental Migration
Incremental migration, aka Lazy Migration, JIT Migration, or On-demand Migration, involves retrieving objects from disk and transforming them on-the-fly to the newest version, the same way as the read time upcasting we’ve been talking about. However, in this pattern, after serving the data to the user, the updated version gets written back to disk. This cycle ensures that over time, all data converges to the latest format without the need for an offline migration.
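Sketched against the earlier examples, the only difference from pure read-time upcasting is the write-back after a successful upcast (db.get_account is still a hypothetical helper):

def get_account_lazy(account_id: str) -> AccountV2:
    raw = db.get_account(account_id)
    account = load_account(raw)  # reuse the upcasting chain from earlier
    if raw.get("version", 1) < LATEST_VERSION:
        # Write the upgraded record back so the next read skips the upcast entirely
        db.save_account(account.model_dump())
    return account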
Pros:
- No downtime, migrates gradually during normal use
- Less risk of migration failure causing data loss or downtime
- Data converges to latest format over time transparently
Cons:
- More complex code paths to handle multiple versions
- Slight runtime performance cost during migration on access
- Cache and consistency management adds complexity
Shadow Copy Migration (Dual Write)
Shadow Copy Migration involves writing incoming writes to both the old and new schema in parallel, allowing you to backfill data gradually and switch reads over only when ready.
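A sketch of the dual write path, assuming separate hypothetical helpers for the old and new schemas:

def save_account_dual(account: AccountV2) -> None:
    # The new schema is the source of truth going forward
    db.save_account_v2(account.model_dump())
    # Keep the old schema populated too, so reads (and rollback) can stay on V1
    db.save_account_v1(account.model_dump(exclude={"version", "is_active"}))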
Pros:
- Allows gradual migration without downtime
- Enables rollback to old schema by continuing to read old data
- Decouples schema change from deployment
Cons:
- Complex write path logic with dual writes
- Increased write latency and storage usage
- Complex synchronization between old and new data formats
Blue-Green Deployment with Migration
Blue-Green deployments create a parallel environment where the new schema and code run side-by-side, allowing you to switch traffic gradually once migration is complete.
Pros:
- No downtime during cutover
- Easy rollback by switching back to old environment
- Enables testing in production environment
Cons:
- Requires duplicate infrastructure costs
- Operational complexity managing parallel environments
- Migration still needs to complete before cutover
Schema Versioning with Feature Flags
Deploy code supporting both old and new schemas behind feature flags that can be toggled to control rollout.
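A minimal sketch of a flag-gated read path (flags.is_enabled stands in for whatever feature flag system you use):

from typing import Union

def read_account(raw: dict) -> Union[AccountV1, AccountV2]:
    # Toggling the flag off falls back to the old schema without a redeploy
    if flags.is_enabled("accounts_v2_schema"):
        return load_account(raw)
    return AccountV1(**{k: v for k, v in raw.items() if k in AccountV1.model_fields})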
Pros:
- Granular control over schema change rollout
- Supports rollback by toggling flags
- Enables A/B testing and gradual feature rollout
Cons:
- Increased code complexity to support multiple versions
- Potential technical debt if old code paths linger
- Feature flag management overhead
Event Versioning (Event Sourcing)
Explicitly version your events and apply transformations (upcasting) when projecting to newer formats or views.
Pros:
- Immutable event store makes migrations safe and transparent
- Enables flexible projections and multiple views
- Clear version history and audit trail
Cons:
- Complexity of managing event versioning and upcasters
- Steeper learning curve for event sourced systems
- Less suited for all types of databases or apps
Read more about event sourcing.