Event Sourcing from the Ground Up, Part 1: The Log Is the Truth
The intuition behind event sourcing: events vs commands vs state, the event store contract, the benefits and the costs that compound, and a precise comparison with CRDTs.
A four-part series on event sourcing. Part 1 builds the mental model; Part 2 implements an event-sourced system in Kotlin from scratch; Part 3 adds CQRS and projections; Part 4 covers versioning, GDPR, the Kafka trap, and the library landscape.
The question your database can’t answer
Open any CRUD application and look at a row: balance = 4200. Now answer a simple question — how did it get there? Was it one deposit? A deposit, a withdrawal, and a refund? Did a support agent manually correct it last Tuesday? The row doesn’t know. Every UPDATE overwrote the previous truth, and the history is gone — or, if you’re disciplined, smeared across an audit table that nobody trusts because it’s maintained by triggers and hope.
Event sourcing flips this around. Instead of storing the current state and discarding how you got there, you store the changes themselves — as an append-only sequence of events — and derive the current state by replaying them. Martin Fowler’s definition is the canonical one: event sourcing “ensures that all changes to application state are stored as a sequence of events,” which can be queried, used to reconstruct past states, and replayed to rebuild the system (Fowler, Event Sourcing).
The accountant’s ledger is the standard analogy, and it’s a good one: accountants don’t erase. A mistake isn’t fixed by editing an old line; it’s fixed by appending a correcting entry. The ledger is the truth, and the account balance is just a summary you compute from it. Event sourcing is that idea applied to software state:
state = events.fold(initialState, ::evolve)
That one line is the whole pattern. Everything else in this series — aggregates, projections, snapshots, upcasters — is engineering around making that fold fast, safe, and evolvable.
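To make the fold concrete, here is a minimal sketch for a bank account. The event types, the Account class, and the replay helper are names I am assuming for illustration; they are not the code Part 2 builds.

```kotlin
// Hypothetical event types for a bank account; the names are illustrative only.
sealed interface AccountEvent
data class MoneyDeposited(val amount: Long) : AccountEvent
data class MoneyWithdrawn(val amount: Long) : AccountEvent

// State is nothing more than what the fold has accumulated so far.
data class Account(val balance: Long = 0)

// evolve applies exactly one event to the previous state. It never rejects:
// an event is a fact that has already happened.
fun evolve(state: Account, event: AccountEvent): Account = when (event) {
    is MoneyDeposited -> state.copy(balance = state.balance + event.amount)
    is MoneyWithdrawn -> state.copy(balance = state.balance - event.amount)
}

// Replaying the whole stream from the initial state yields the current state.
fun replay(events: List<AccountEvent>): Account =
    events.fold(Account(), ::evolve)
```

With this sketch, replay(listOf(MoneyDeposited(5000), MoneyWithdrawn(1000), MoneyDeposited(200))) returns Account(balance = 4200), and unlike the CRUD row, the list says how the balance got there.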
Events, not commands, not rows
Terminology matters here because three similar-sounding things do very different jobs:
- A command is a request to do something, named in the imperative: WithdrawMoney. It can be rejected; that’s the point of it.
- An event is a fact that already happened, named in the past tense: MoneyWithdrawn. It cannot be rejected, because it isn’t a request; it’s history. Events are immutable.
- State is a left fold of events: a convenience, a cache, never the truth.
The flow is always: a command arrives, business logic examines current state and either rejects the command or emits one or more events, the events are appended to the log, and state moves forward. The log of events is the system of record; everything else is derived (Microsoft Azure Architecture Center, Event Sourcing pattern).
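The flow above can be sketched in a few lines. This is one possible shape under assumptions of mine: the Decision type, the decide and handle functions, and the in-memory list standing in for the log are all hypothetical names, not a prescribed API.

```kotlin
// Commands are requests (imperative names); events are facts (past tense).
sealed interface Command
data class DepositMoney(val amount: Long) : Command
data class WithdrawMoney(val amount: Long) : Command

sealed interface Event
data class MoneyDeposited(val amount: Long) : Event
data class MoneyWithdrawn(val amount: Long) : Event

data class Account(val balance: Long = 0)

// The outcome of examining a command: either new events, or a rejection.
sealed interface Decision
data class Accepted(val events: List<Event>) : Decision
data class Rejected(val reason: String) : Decision

// Business logic: look at current state, then reject or emit events.
fun decide(state: Account, command: Command): Decision = when (command) {
    is DepositMoney ->
        if (command.amount > 0) Accepted(listOf(MoneyDeposited(command.amount)))
        else Rejected("deposit must be positive")
    is WithdrawMoney ->
        if (command.amount in 1..state.balance) Accepted(listOf(MoneyWithdrawn(command.amount)))
        else Rejected("insufficient funds")
}

fun evolve(state: Account, event: Event): Account = when (event) {
    is MoneyDeposited -> state.copy(balance = state.balance + event.amount)
    is MoneyWithdrawn -> state.copy(balance = state.balance - event.amount)
}

// One pass through the flow: rebuild state from the log, decide,
// and append the new events only if the command was accepted.
fun handle(log: MutableList<Event>, command: Command): Decision {
    val current = log.fold(Account(), ::evolve)
    val decision = decide(current, command)
    if (decision is Accepted) log.addAll(decision.events)
    return decision
}
```

Note the asymmetry: decide can say no, evolve cannot. A rejected command leaves no trace in the log, because nothing happened.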
The event store
Events live in an event store: an append-only, immutable, ordered log, organized into streams — typically one stream per entity (account-42, order-1337). A real event store needs exactly three capabilities:
- Append events to a stream, atomically, with an optimistic concurrency check (“append only if the stream is still at version N” — we’ll build this in Part 2).
- Read a stream in order, to rebuild one entity’s state.
- Read everything in global order, to build the derived read models of Part 3.
Notice what’s not on the list: queries by field, joins, secondary indexes. The event store is deliberately dumb. All the query-side richness lives in projections built from the log — that separation is Part 3.
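As a rough sketch of that three-method contract, here is an interface with a toy in-memory implementation. Every name here (EventStore, StoredEvent, ConcurrencyConflict, the string payload) is an assumption for illustration, and the JVM-only @Synchronized annotation stands in for the atomicity a real store would get from its storage engine.

```kotlin
// A stored event carries its position in its own stream and in the global log.
data class StoredEvent(
    val streamId: String,
    val version: Long,        // 1-based position within the stream
    val globalPosition: Long, // 1-based position across all streams
    val payload: String,      // serialized event body; the format is out of scope here
)

class ConcurrencyConflict(streamId: String, expected: Long, actual: Long) :
    RuntimeException("stream $streamId is at version $actual, expected $expected")

interface EventStore {
    // Appends only if the stream is still at expectedVersion; returns the new version.
    fun append(streamId: String, expectedVersion: Long, payloads: List<String>): Long
    // Reads one stream in order, to rebuild one entity's state.
    fun readStream(streamId: String): List<StoredEvent>
    // Reads everything after fromPosition in global order, to feed read models.
    fun readAll(fromPosition: Long = 0): List<StoredEvent>
}

// Toy implementation: one list is the global log, streams are filtered views of it.
class InMemoryEventStore : EventStore {
    private val log = mutableListOf<StoredEvent>()

    @Synchronized
    override fun append(streamId: String, expectedVersion: Long, payloads: List<String>): Long {
        val actual = log.count { it.streamId == streamId }.toLong()
        if (actual != expectedVersion) throw ConcurrencyConflict(streamId, expectedVersion, actual)
        payloads.forEachIndexed { i, payload ->
            log += StoredEvent(streamId, actual + i + 1, log.size + 1L, payload)
        }
        return actual + payloads.size
    }

    @Synchronized
    override fun readStream(streamId: String) = log.filter { it.streamId == streamId }

    @Synchronized
    override fun readAll(fromPosition: Long) = log.filter { it.globalPosition > fromPosition }
}
```

Two writers that both read account-42 at version 1 and both try to append with expectedVersion = 1 cannot both succeed: the second gets a ConcurrencyConflict and has to reload and decide again.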
What you actually buy
The benefits worth paying for, in rough order of how often they matter in practice:
A perfect audit trail, for free. Not a best-effort audit table — the audit log is the database. In regulated domains (payments, healthcare, anything an auditor visits) this alone justifies the pattern. There are times when “we don’t just want to see where we are, we also want to know how we got there” (Fowler).
Time travel and retroactive queries. Replay events up to any point and you have the state as of then. “What did this customer’s cart look like the moment they hit checkout?” stops being archaeology.
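The cart question can be sketched as a fold over a prefix of the stream. This assumes each event records when it happened and that the stream is already in time order; CartEvent, ItemAdded, ItemRemoved, and cartAsOf are hypothetical names.

```kotlin
import java.time.Instant

// Hypothetical cart events, each stamped with the moment it was recorded.
sealed interface CartEvent { val at: Instant }
data class ItemAdded(val sku: String, override val at: Instant) : CartEvent
data class ItemRemoved(val sku: String, override val at: Instant) : CartEvent

// The cart state is just the list of SKUs currently in it.
fun evolve(cart: List<String>, event: CartEvent): List<String> = when (event) {
    is ItemAdded -> cart + event.sku
    is ItemRemoved -> cart - event.sku // removes one occurrence of the SKU
}

// State as of a moment: replay only the events recorded at or before it.
// Assumes the stream is ordered by time, so a prefix is enough.
fun cartAsOf(events: List<CartEvent>, moment: Instant): List<String> =
    events.takeWhile { !it.at.isAfter(moment) }
        .fold(emptyList<String>(), ::evolve)
```

The same evolve function serves both the current state and every past state; only the cut-off changes.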
New read models from old data. Because the log keeps everything, a question you didn’t anticipate at design time can often be answered by writing a new projection and replaying history through it. Analytics teams love this; it’s like having gone back in time and added the instrumentation you wished you had.
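As a small illustration, suppose someone asks, long after launch, how much has been withdrawn per account. A sketch of a projection answering that from existing history might look like this; the event shapes and the withdrawnPerAccount function are assumptions of mine.

```kotlin
// Hypothetical account events that have been accumulating in the log for years.
sealed interface AccountEvent { val accountId: String }
data class MoneyDeposited(override val accountId: String, val amount: Long) : AccountEvent
data class MoneyWithdrawn(override val accountId: String, val amount: Long) : AccountEvent

// A read model nobody planned for at design time: total withdrawn per account.
// It is built purely by replaying history; no write-side code changes.
fun withdrawnPerAccount(history: List<AccountEvent>): Map<String, Long> =
    history.filterIsInstance<MoneyWithdrawn>()
        .groupingBy { it.accountId }
        .fold(0L) { total, event -> total + event.amount }
```

A CRUD table holding only current balances could not answer this at all; here it is one more fold over facts that were already being kept.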
Debugging with a flight recorder. Copy the production event stream into a test environment and replay it through new code. The bug that “only happens in prod” now happens on your laptop, deterministically.
Honest concurrency. Conflicts surface as explicit version conflicts at append time rather than silent lost updates from racing UPDATEs.
What it costs
Event sourcing is sometimes sold as a free lunch. It is not, and the costs compound over a system’s life:
- A steeper mental model. Every developer who touches the system must think in commands, events, and folds. An empirical study of industrial event-sourced systems found the steep learning curve and event system evolution (schema change) to be the top reported challenges (Overeem et al., An empirical characterization of event sourced systems and their schema evolution).
- Events are forever. A sloppy event schema is a sloppy event schema in perpetuity. Versioning discipline is not optional (Part 4).
- Eventual consistency on the read side. Projections lag the log, and your UX has to live with that (Part 3).
- Deleting data is hard by design, which collides head-on with GDPR’s right to erasure (Part 4).
The honest decision rule: event source the parts of your domain where history is the business (money movement, orders, inventory, anything contested or audited) and let the boring reference data stay CRUD. Event sourcing an entire system uniformly is the most common way to regret it. Dudycz makes a related point in Event streaming is not event sourcing!: the pattern is about durable state, not about moving messages around.
Isn’t this just CRDTs again?
If you’ve read my CRDT series, this should be ringing bells: an append-only log of operations, state derived by applying them, replicas converging by exchanging updates. The patterns are siblings, and the family resemblance is precise — an operation-based CRDT is essentially a tiny event-sourced object whose events are broadcast between replicas over reliable causal delivery (Sypytkowski, Pure operation-based CRDTs). Akka and Apache Pekko ship this combination literally, as Replicated Event Sourcing: each replica keeps its own event journal, and the op-based CRDT rules make the replicated entity converge (Pekko docs, Replicated Event Sourcing).
The difference is who orders the log, and what that buys you:
| | CRDTs | Event sourcing |
|---|---|---|
| Log order | Per-replica; merged commutatively | Single, totally ordered stream (per entity) |
| Writes during partition | Always accepted, everywhere | Only where the stream’s writer lives |
| Invariants (“balance ≥ 0”) | Cannot be enforced — convergence only | Enforced — commands are validated against state before events are appended |
| Conflict story | Resolved by the data type’s merge | Prevented by optimistic concurrency |
I made a point in the CRDT foundations post that CRDTs cannot enforce a constraint like “this bank balance must never go negative,” because that requires coordination, and CRDTs exist to avoid coordination. Event sourcing sits on the other side of that trade: the totally ordered stream is the coordination point, which is exactly why the bank account — impossible to build safely as a CRDT — will be our running example for this whole series. Same log-shaped idea, opposite corner of the CAP triangle. I won’t re-cover convergence, semilattices, or merge functions here; that ground is in the CRDT series, and where the two patterns genuinely meet (replicated event journals), Part 4 points back to it.
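A toy contrast makes the trade visible. The sketch below is deliberately simplified and every name in it is hypothetical: the CRDT side is a set of uniquely identified deltas merged by union, and the event-sourced side is a single stream that accepts an append only at the expected version.

```kotlin
// CRDT side: replica state is a set of uniquely identified deltas. Merge is
// set union, which is commutative, associative, and idempotent.
data class Op(val id: String, val delta: Long)

fun balance(ops: Set<Op>): Long = ops.sumOf { it.delta }
fun merge(a: Set<Op>, b: Set<Op>): Set<Op> = a + b

// Each replica can only check the invariant against what it sees locally.
fun withdrawLocally(ops: Set<Op>, id: String, amount: Long): Set<Op> =
    if (balance(ops) >= amount) ops + Op(id, -amount) else ops

// Event-sourced side: one totally ordered stream is the coordination point.
class Stream {
    private val deltas = mutableListOf<Long>()
    val version: Int get() = deltas.size
    val balance: Long get() = deltas.sum()

    // The append succeeds only if the caller decided against the latest version.
    fun append(expectedVersion: Int, delta: Long): Boolean {
        if (expectedVersion != version) return false
        deltas += delta
        return true
    }
}
```

Start both sides with a deposit of 100 and attempt two concurrent withdrawals of 80. On the CRDT side each replica sees a balance of 100, accepts its own withdrawal, and the merged balance is -60: the replicas converge, but on a state that violates the invariant. On the stream, both writers read version 1, the first append succeeds, and the second is refused; that writer has to reload, sees a balance of 20, and the withdrawal is rejected.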
Where we’re headed
The mental model to carry forward: events are facts, the log is the truth, state is a fold. In Part 2 we build it for real in Kotlin — an event store with optimistic concurrency, a bank-account decider, replay, and snapshots — in a few hundred lines with no framework. Part 3 splits reads from writes with CQRS and projections. Part 4 is the part most introductions skip: what happens to an event-sourced system after year one.