Testing Guidelines

Core Principle: Test Outcomes, Not Outputs

Every test should answer: "Did the right thing happen?" — not "Did the code produce this exact string?"

An outcome is a user-visible or system-observable result: a document syncs to the server, a conflict is detected, a subscription delivers updated data. An output is a specific implementation detail: the exact title string, a particular version number, the format of an ID.

Tests that assert outputs break when implementation details change, even if the system is still working correctly. Tests that assert outcomes survive refactors and catch real bugs.

Guidelines

1. Assert relationships, not literals

Bad — asserts a hardcoded value the handler happens to produce:

const id = await client.mutate('todos.create', { title: 'Buy milk' })
const doc = await syncEngine.getDoc('todos', id)
expect(doc?.title).toBe('Buy milk')

Good — asserts the handler actually ran and the result synced:

const id = await client.mutate('todos.create', { title: 'Buy milk' })
const localDoc = await syncEngine.getDoc('todos', id)
// Handler transforms title — verify it ran
expect(localDoc?.title).toBe('Buy milk_')

// After reconnect, verify server matches local state
const serverDoc = await getServerDoc(id)
expect(serverDoc?.title).toBe(localDoc?.title)

The second test catches a real bug: if the queued mutation sends the original args instead of the handler's output, the server ends up with a different title than the local DB. The first test only inspects local state, so it cannot detect that bug.
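A minimal, self-contained sketch of why comparing the two states catches this. Everything here is a hypothetical in-memory stand-in (`createTodo`, `runOfflineCreate`, `statesMatch`); it does not use the real client, sync engine, or `getServerDoc`.

```typescript
// Hypothetical in-memory model; not the real client or sync engine.
type TodoArgs = { title: string }
type TodoDoc = { title: string }

// Stand-in handler: appends "_" so its execution is observable.
function createTodo(args: TodoArgs): TodoDoc {
    return { title: args.title + '_' }
}

// Simulates an offline mutation that is later replayed to the server.
// replayHandlerOutput = false models the bug: the queue sends the raw args.
function runOfflineCreate(
    args: TodoArgs,
    replayHandlerOutput: boolean,
): { local: TodoDoc; server: TodoDoc } {
    const local = createTodo(args)
    const server = replayHandlerOutput ? { ...local } : { title: args.title }
    return { local, server }
}

// Relationship assertion: compare the two states instead of hardcoding a title.
function statesMatch(result: { local: TodoDoc; server: TodoDoc }): boolean {
    return result.local.title === result.server.title
}

const healthy = runOfflineCreate({ title: 'Buy milk' }, true)
const buggy = runOfflineCreate({ title: 'Buy milk' }, false)
```

In this model `statesMatch(healthy)` is true and `statesMatch(buggy)` is false, while the local document is identical in both runs, so an assertion on local state alone cannot tell them apart.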

2. Each test should be able to catch at least one real bug

Before writing a test, ask: "What bug would make this fail?" If you can't name one, the test is likely not pulling its weight.

Tests we removed for violating this:

"can connect to server" — expect(client.getConnectionState()).toBe("connected"). If the server doesn't start, every other test fails too. This catches nothing on its own.
"can unsubscribe" — body was expect(true).toBe(true). Literally asserts nothing.
"can insert via sync write" — duplicated by every test that inserts a document as setup.

3. Test the real path, not a simulation

Bad — tests raw CRUD, which bypasses the function handler pipeline:

await client.mutate('todos.insert', {
    _id: id,
    title: 'Test',
    completed: 0,
    priority: 1,
})

Good — tests through the handler, which exercises codegen → server → boa → result:

const id = await client.mutate('todos.create', {
    title: 'Test',
    priority: 1,
})

Three shipped bugs (truncated handler bodies, missing console global, broken await) went undetected because all e2e tests used raw CRUD. The function handler pipeline was never exercised.

4. Don't duplicate coverage across test files

If version.test.ts already tests "insert creates document with _version = 1", don't re-test it in offline.test.ts. Each file should own a specific concern:

File	Owns
`codegen/dotenv.test.ts`	`.env` file parsing and read/write
`codegen/env.test.ts`	`valet env` CLI command
`codegen/generator.test.ts`	Code generation output (api, handlers, schema, react exports)
`codegen/init.test.ts`	`valet init` scaffolding
`codegen/parser.test.ts`	Schema and function file parsing
`core/logger.test.ts`	Structured logger levels, formatting, tags
`core/protocol.test.ts`	Wire format serialization, type guards, message constructors
`core/subscription.test.ts`	Subscription state machine, document cache, change events
`expo/sqlite-mutation-log-storage.test.ts`	SQLite-backed MutationLogStorage for Expo
`local/context.test.ts`	ReplayableEnv, deterministic globals, ID extraction
`local/database.test.ts`	LocalDatabase with wa-sqlite backend (contract tests)
`local/database.better-sqlite.test.ts`	LocalDatabase with bun:sqlite backend (contract tests)
`local/idb-storage.test.ts`	IndexedDB-backed MutationLogStorage
`local/mutation-log.test.ts`	MutationLog lifecycle: log, getPending, confirm, fail
`local/sync.test.ts`	SyncEngine: delta application, schema migration, backfills
`qb/query.test.ts`	QueryBuilder, SyncFilterBuilder, subquery builder
`react/client.test.ts`	ValetClient: connection, auth, subscriptions, mutations, replay
`react/factory.test.tsx`	createValetProvider factory wiring (MutationLog integration)
`react/hooks.test.tsx`	React hooks logic (useQuery, useMutation, etc.)
`server/db.test.ts`	GenericDatabaseReader/Writer, filter builder, query builder
`server/functions.test.ts`	defineQuery, defineMutation, execution modes, type helpers
`server/schema.test.ts`	defineTable, defineSchema, sync config, indexes, backfills
`server/validators.test.ts`	Validator types, validation, inference, error messages

If a test doesn't clearly belong to the file's concern, it's probably a duplicate.

5. Setup is not a test

If the first 80% of a test is creating documents and subscribing — and every other test does the same thing — that's setup, not testing. Use beforeEach or helper functions for shared setup.

Similarly, don't write separate tests for "can connect" and "can subscribe" when every real test already connects and subscribes as setup. The real tests implicitly verify those work.
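One way to pull shared setup out of test bodies is a fixture helper. The sketch below is framework-agnostic and uses a hypothetical in-memory `Fixture`; in the real suite the helper would presumably connect the client and subscribe, and would typically be called from beforeEach.

```typescript
// Hypothetical fixture standing in for "connected client + subscription + seeded docs".
type Todo = { title: string; completed: number }
type Fixture = { docs: Map<string, Todo>; nextId: number }

// Inserts a todo and returns its generated id.
function insertTodo(fixture: Fixture, title: string): string {
    const id = `todo-${fixture.nextId++}`
    fixture.docs.set(id, { title, completed: 0 })
    return id
}

// Shared setup: every test starts from the same two seeded documents.
function createFixture(): Fixture {
    const fixture: Fixture = { docs: new Map(), nextId: 1 }
    insertTodo(fixture, 'Seed A')
    insertTodo(fixture, 'Seed B')
    return fixture
}

// The test body now contains only the behavior it owns.
function completingOneTodoLeavesOthersOpen(): boolean {
    const fixture = createFixture()
    const id = insertTodo(fixture, 'Target')
    const target = fixture.docs.get(id)
    if (target) target.completed = 1
    const open = Array.from(fixture.docs.values()).filter((doc) => doc.completed === 0)
    return open.length === 2
}
```

Each call to `createFixture()` returns a fresh object, so tests built on it do not leak state into each other.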

6. Make the handler do something observable

When testing handler execution, the handler should transform its input in some way so you can verify it actually ran:

// Handler appends "_" to title — proves handler executed, not just raw CRUD
handler: async (ctx, args) => {
    return ctx.db.insert('todos', {
        title: args.title + '_',
        completed: 0,
    })
}

If the handler is a passthrough, you can't distinguish "handler ran" from "raw args were inserted directly."
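The difference can be seen in a small sketch. The functions below are hypothetical stand-ins (plain functions, no `ctx.db`) for a raw insert, a passthrough handler, and a transforming handler.

```typescript
type Args = { title: string }
type Doc = { title: string; completed: number }

// Raw CRUD path: stores the args as given.
const rawInsert = (args: Args): Doc => ({ title: args.title, completed: 0 })

// Passthrough handler: produces exactly what raw CRUD would.
const passthroughHandler = (args: Args): Doc => ({ title: args.title, completed: 0 })

// Transforming handler: appends "_" as an observable marker.
const transformingHandler = (args: Args): Doc => ({ title: args.title + '_', completed: 0 })

// True when the stored document proves the handler ran rather than a raw insert.
function provesHandlerRan(handler: (args: Args) => Doc, args: Args): boolean {
    return handler(args).title !== rawInsert(args).title
}

const sampleArgs: Args = { title: 'Buy milk' }
const passthroughIsObservable = provesHandlerRan(passthroughHandler, sampleArgs) // false
const transformIsObservable = provesHandlerRan(transformingHandler, sampleArgs) // true
```

Only the transforming handler leaves evidence in the stored document; with the passthrough, a pipeline that silently skipped the handler would produce the same result.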

7. Write the test before the code when chasing a bug

When a bug is found, the first step is writing a test that fails because of the bug. This ensures:

The bug is real (not a misunderstanding)
The fix actually works (test goes green)
The bug can't regress (test stays in the suite)

Example: the offline handler sync bug was caught because the test asserted serverDoc?.title === localDoc?.title — comparing the actual states rather than hardcoding expected values.
