Event Sourcing from the Ground Up, Part 1: The Log Is the Truth

A four-part series on event sourcing. Part 1 builds the mental model; Part 2 implements an event-sourced system in Kotlin from scratch; Part 3 adds CQRS and projections; Part 4 covers versioning, GDPR, the Kafka trap, and the library landscape.

The question your database can’t answer

Open the accounts table and look at a row: balance = 4200. How did it get there? The row doesn’t know. Every UPDATE overwrote the previous truth, and the history is gone, or smeared across an audit table maintained by triggers and hope.

Event sourcing keeps the changes themselves, as an append-only sequence of events, and derives current state by replaying them. Martin Fowler’s definition is the canonical one: event sourcing “ensures that all changes to application state are stored as a sequence of events,” which can be queried, used to reconstruct past states, and replayed to rebuild the system (Fowler, Event Sourcing).

[Figure: mutable current state overwrites history; an event log derives state from it]

The accountant’s ledger is the standard analogy, and it’s a good one: accountants don’t erase. A mistake isn’t fixed by editing an old line; it’s fixed by appending a correcting entry. The ledger is the truth, and the account balance is just a summary you compute from it. In code:

state = events.fold(initialState, ::evolve)

Everything else in this series (aggregates, projections, snapshots, upcasters) is engineering around making that fold fast, safe, and evolvable.
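To make the fold concrete, here is a minimal sketch for a bank account. The event types (MoneyDeposited, MoneyWithdrawn), the Account state, and the replay helper are illustrative names chosen for this example, not the code Part 2 will build.

```kotlin
// Hypothetical event types for a bank account; the names are illustrative only.
sealed interface AccountEvent
data class MoneyDeposited(val amount: Long) : AccountEvent
data class MoneyWithdrawn(val amount: Long) : AccountEvent

// State is only a summary of the events seen so far.
data class Account(val balance: Long = 0)

// evolve applies one event to the previous state and returns the next state.
fun evolve(state: Account, event: AccountEvent): Account = when (event) {
    is MoneyDeposited -> state.copy(balance = state.balance + event.amount)
    is MoneyWithdrawn -> state.copy(balance = state.balance - event.amount)
}

// Current state is a left fold over the full history, starting from an empty account.
fun replay(events: List<AccountEvent>): Account = events.fold(Account(), ::evolve)

fun main() {
    val history = listOf(MoneyDeposited(5000), MoneyWithdrawn(1000), MoneyDeposited(200))
    println(replay(history)) // Account(balance=4200)
}
```

Unlike the row in the accounts table, the history list can explain how the balance reached 4200.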

Events, not commands, not rows

Terminology matters here because three similar-sounding things do very different jobs:

A command is a request to do something, named in the imperative: WithdrawMoney. It can be rejected: that’s the point of it.
An event is a fact that already happened, named in the past tense: MoneyWithdrawn. It cannot be rejected, because it isn’t a request; it’s history. Events are immutable.
State is a left-fold of events; a convenience, a cache, never the truth.

The flow is always: a command arrives, business logic examines current state and either rejects the command or emits one or more events, the events are appended to the log, and state moves forward. The log of events is the system of record; everything else is derived (Microsoft Azure Architecture Center, Event Sourcing pattern).
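The flow can be sketched in a few lines. This is a simplified illustration, assuming an in-memory list as the log; the Decision, Accepted, and Rejected types and the decide and handle functions are hypothetical names for this example.

```kotlin
// Commands are requests (imperative); events are facts (past tense).
sealed interface Command
data class DepositMoney(val amount: Long) : Command
data class WithdrawMoney(val amount: Long) : Command

sealed interface Event
data class MoneyDeposited(val amount: Long) : Event
data class MoneyWithdrawn(val amount: Long) : Event

data class Account(val balance: Long = 0)

// Hypothetical result type: a command is either accepted (with events) or rejected.
sealed interface Decision
data class Accepted(val events: List<Event>) : Decision
data class Rejected(val reason: String) : Decision

fun evolve(state: Account, event: Event): Account = when (event) {
    is MoneyDeposited -> state.copy(balance = state.balance + event.amount)
    is MoneyWithdrawn -> state.copy(balance = state.balance - event.amount)
}

// Business logic: examine current state, then reject the command or emit events.
fun decide(state: Account, command: Command): Decision = when (command) {
    is DepositMoney ->
        if (command.amount <= 0) Rejected("amount must be positive")
        else Accepted(listOf(MoneyDeposited(command.amount)))
    is WithdrawMoney -> when {
        command.amount <= 0 -> Rejected("amount must be positive")
        command.amount > state.balance -> Rejected("insufficient funds")
        else -> Accepted(listOf(MoneyWithdrawn(command.amount)))
    }
}

// One pass through the flow: rebuild state from the log, decide, append on success.
fun handle(log: MutableList<Event>, command: Command): Decision {
    val state = log.fold(Account(), ::evolve)
    val decision = decide(state, command)
    if (decision is Accepted) log.addAll(decision.events)
    return decision
}

fun main() {
    val log = mutableListOf<Event>()
    println(handle(log, DepositMoney(100)))  // accepted: one MoneyDeposited appended
    println(handle(log, WithdrawMoney(250))) // rejected: nothing appended
    println(log.fold(Account(), ::evolve))   // Account(balance=100)
}
```

Note that a rejected command leaves no trace in the log: only facts are recorded.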

The event store

Events live in an event store: an append-only, immutable, ordered log organized into streams, typically one stream per entity (account-42, order-1337). A real event store needs exactly three capabilities:

Append events to a stream, atomically, with an optimistic concurrency check (“append only if the stream is still at version N”; we’ll build this in Part 2).
Read a stream in order, to rebuild one entity’s state.
Read everything in global order, to build the derived read models of Part 3.

Notice what’s not on the list: queries by field, joins, secondary indexes. The event store is deliberately dumb. All the query-side richness lives in projections built from the log; that separation is Part 3.
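The three capabilities fit in a small in-memory sketch. Everything here is an assumption made for illustration: the StoredEvent shape, the VersionConflict exception, String payloads, and a JVM target (for @Synchronized). It is not the store built in Part 2, and a real store would persist to disk.

```kotlin
// A stored event: the payload plus its position in its stream and in the global log.
data class StoredEvent(
    val streamId: String,
    val version: Long,        // 1-based position within the stream
    val globalPosition: Long, // 1-based position across all streams
    val payload: String       // serialized event; a plain String keeps the sketch small
)

class VersionConflict(streamId: String, expected: Long, actual: Long) :
    RuntimeException("stream $streamId: expected version $expected but was $actual")

class InMemoryEventStore {
    private val log = mutableListOf<StoredEvent>()

    // Capability 1: atomic append, only if the stream is still at expectedVersion.
    @Synchronized
    fun append(streamId: String, expectedVersion: Long, payloads: List<String>) {
        val actual = log.count { it.streamId == streamId }.toLong()
        if (actual != expectedVersion) throw VersionConflict(streamId, expectedVersion, actual)
        payloads.forEachIndexed { i, payload ->
            log += StoredEvent(streamId, actual + i + 1, log.size + 1L, payload)
        }
    }

    // Capability 2: read one stream in order, to rebuild one entity's state.
    @Synchronized
    fun readStream(streamId: String): List<StoredEvent> =
        log.filter { it.streamId == streamId }

    // Capability 3: read everything in global order, after a given position.
    @Synchronized
    fun readAll(afterPosition: Long = 0): List<StoredEvent> =
        log.filter { it.globalPosition > afterPosition }
}

fun main() {
    val store = InMemoryEventStore()
    store.append("account-42", expectedVersion = 0, payloads = listOf("MoneyDeposited(100)"))
    store.append("order-1337", expectedVersion = 0, payloads = listOf("OrderPlaced"))
    try {
        // A writer that still believes the stream is empty loses the race.
        store.append("account-42", expectedVersion = 0, payloads = listOf("MoneyWithdrawn(30)"))
    } catch (e: VersionConflict) {
        println(e.message)
    }
    println(store.readStream("account-42").map { it.version })
    println(store.readAll().map { it.globalPosition })
}
```

The API has no query-by-field method at all, which is the "deliberately dumb" part.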

What you actually buy

The benefits worth paying for, in rough order of how often they matter in practice:

A perfect audit trail, for free. Not a best-effort audit table; the audit log is the database. In regulated domains (payments, healthcare, anything an auditor visits) this alone justifies the pattern. There are times when “we don’t just want to see where we are, we also want to know how we got there” (Fowler).

Time travel and retroactive queries. Replay events up to any point and you have the state as of then. “What did this customer’s cart look like the moment they hit checkout?” stops being archaeology.
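As a sketch of the cart question, assume each event carries a timestamp and the stream is already in chronological order. The CartEvent types and the cartAsOf helper are hypothetical names for this example.

```kotlin
import java.time.Instant

// Hypothetical cart events, each stamped with the time it was recorded.
sealed interface CartEvent { val at: Instant }
data class ItemAdded(val sku: String, override val at: Instant) : CartEvent
data class ItemRemoved(val sku: String, override val at: Instant) : CartEvent

fun evolve(cart: Set<String>, event: CartEvent): Set<String> = when (event) {
    is ItemAdded -> cart + event.sku
    is ItemRemoved -> cart - event.sku
}

// State as of `moment`: fold only the events recorded at or before it.
// Assumes `events` is ordered by time, as a stream read from the store would be.
fun cartAsOf(events: List<CartEvent>, moment: Instant): Set<String> =
    events.takeWhile { !it.at.isAfter(moment) }.fold(emptySet<String>(), ::evolve)

fun main() {
    val t0 = Instant.parse("2024-01-01T10:00:00Z")
    val events = listOf(
        ItemAdded("book", t0),
        ItemAdded("lamp", t0.plusSeconds(60)),
        ItemRemoved("book", t0.plusSeconds(120))
    )
    println(cartAsOf(events, t0.plusSeconds(90)))  // [book, lamp]
    println(cartAsOf(events, t0.plusSeconds(300))) // [lamp]
}
```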

New read models from old data. Because the log keeps everything, a question you didn’t anticipate at design time can often be answered by writing a new projection and replaying history through it. Analytics teams love this; it’s like having gone back in time and added the instrumentation you wished you had.
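A new read model is just another fold over the same history. The sketch below answers a question nobody planned for ("what is the largest withdrawal per account?"); the Recorded shape and the function names are assumptions made for this example.

```kotlin
// A simplified view of an entry in the global log (hypothetical shape).
data class Recorded(val accountId: String, val type: String, val amount: Long)

// One projection step: fold a single event into the read model.
fun largestWithdrawal(view: Map<String, Long>, e: Recorded): Map<String, Long> =
    if (e.type != "MoneyWithdrawn") view
    else view + (e.accountId to maxOf(view[e.accountId] ?: 0L, e.amount))

// Building the read model means replaying all history through the step function.
fun project(log: List<Recorded>): Map<String, Long> =
    log.fold(emptyMap<String, Long>(), ::largestWithdrawal)

fun main() {
    val log = listOf(
        Recorded("account-42", "MoneyDeposited", 500),
        Recorded("account-42", "MoneyWithdrawn", 70),
        Recorded("account-7", "MoneyWithdrawn", 10),
        Recorded("account-42", "MoneyWithdrawn", 30)
    )
    println(project(log)) // {account-42=70, account-7=10}
}
```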

Debugging with a flight recorder. Copy the production event stream into a test environment and replay it through new code. The bug that “only happens in prod” now happens on your laptop, deterministically.

Honest concurrency. Conflicts surface as explicit version conflicts at append time rather than silent lost updates from racing UPDATEs.

What it costs

The costs compound over a system’s life:

A steeper mental model. Everyone touching the system has to think in commands, events, and folds. Teams running this in production say the learning curve and schema evolution are the worst of it (Overeem et al.).
Events are forever. Whatever event schema you ship today is still in your store years from now. Don’t ship a sloppy one (Part 4).
Eventual consistency on the read side. Read models trail the log, usually by milliseconds but by longer when a projection falls behind or is being rebuilt. The UI has to deal with that (Part 3).
Deleting data is hard by design, which becomes a problem the moment GDPR asks you to erase a user (Part 4).

Decision rule: event source the parts where history is the business (money, orders, inventory, anything audited or contested). Leave the rest CRUD. Event sourcing the whole system is the most common way to regret it.

Isn’t this just CRDTs again?

The resemblance to operation-based CRDTs is real: an append-only log of operations, state derived by applying them, replicas converging by exchanging updates. The patterns are siblings; an operation-based CRDT is essentially a tiny event-sourced object whose events are broadcast between replicas over reliable causal delivery (Sypytkowski, Pure operation-based CRDTs). Akka and Apache Pekko ship this combination literally, as Replicated Event Sourcing: each replica keeps its own event journal, and the op-based CRDT rules make the replicated entity converge (Pekko docs, Replicated Event Sourcing).

The difference is who orders the log, and what that buys you:

	CRDTs	Event sourcing
Log order	Per-replica; merged commutatively	Single, totally ordered stream (per entity)
Writes during partition	Always accepted, everywhere	Only where the stream’s writer lives
Invariants (“balance ≥ 0”)	Cannot be enforced; convergence only	Enforced; commands are validated against state before append
Conflict story	Resolved by the data type’s merge	Prevented by optimistic concurrency

CRDTs cannot enforce a constraint like “this bank balance must never go negative,” because that requires coordination, and CRDTs exist to avoid coordination. Event sourcing sits on the other side of that trade: the totally ordered stream is the coordination point, which is exactly why the bank account (impossible to build safely as a CRDT) will be our running example for this whole series. Same log-shaped idea, opposite side of the CAP trade-off: CRDTs keep accepting writes during a partition, while an event stream gives up that availability to keep its invariants. Part 4 returns to the place where the two patterns genuinely meet: replicated event journals.
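The invariant problem is easy to reproduce in a toy model. The sketch below is a deliberately simplified illustration, not a real CRDT library or event store: CounterReplica stands in for an op-based counter that checks only its local view, and AccountStream stands in for a single ordered stream with a version check. Both class names are hypothetical.

```kotlin
// Toy op-based counter replica: accepts writes locally, then exchanges operations.
class CounterReplica(initial: Long) {
    var balance = initial
        private set
    val localOps = mutableListOf<Long>() // signed amounts applied on this replica

    // Accepted whenever the local view allows it; no coordination with other replicas.
    fun withdraw(amount: Long): Boolean {
        if (amount > balance) return false
        balance -= amount
        localOps += -amount
        return true
    }

    // Apply operations received from another replica (addition commutes).
    fun receive(ops: List<Long>) { ops.forEach { balance += it } }
}

// Toy event-sourced stream: one ordered log, appends guarded by a version check.
class AccountStream(initial: Long) {
    private val events = mutableListOf(initial) // signed amounts; first is the opening deposit
    val version: Int get() = events.size
    val balance: Long get() = events.sum()

    fun withdraw(amount: Long, expectedVersion: Int): Boolean {
        if (expectedVersion != version) return false // someone else appended first
        if (amount > balance) return false           // invariant checked against current state
        events += -amount
        return true
    }
}

fun main() {
    // Two replicas each see 100 and each accept a withdrawal of 60.
    val a = CounterReplica(100)
    val b = CounterReplica(100)
    a.withdraw(60)
    b.withdraw(60)
    a.receive(b.localOps)
    b.receive(a.localOps)
    println("replicas converge to ${a.balance} and ${b.balance}") // -20 and -20

    // Two writers race on one stream; both read version 1.
    val stream = AccountStream(100)
    val seen = stream.version
    println(stream.withdraw(60, seen))           // true
    println(stream.withdraw(60, seen))           // false: stale version
    println(stream.withdraw(60, stream.version)) // false: insufficient funds on retry
    println(stream.balance)                      // 40
}
```

Both replicas converge, which is all the CRDT promises; they converge on a negative balance. The stream rejects the second writer twice: first for the stale version, then, on retry against fresh state, for insufficient funds.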

Where we’re headed

The mental model to carry forward: events are facts, the log is the truth, state is a fold. In Part 2 we build it for real in Kotlin (an event store with optimistic concurrency, a bank-account decider, replay, and snapshots) in a few hundred lines with no framework. Part 3 splits reads from writes with CQRS and projections. Part 4 is the part most introductions skip: what happens to an event-sourced system after year one.