Melange v0.8.2

April 30, 2026 · pthm

Melange v0.8.2 is a performance release. Permission checks that carry contextual tuples are now 12–20× faster, the recursive list_*_sub codegen takes its cheap fast path on a wider class of parent relations, and the release pipeline is now fully automated end-to-end with release-please and GitHub Actions.

No breaking changes from v0.8.1. Upgrade and run melange migrate to pick up the regenerated SQL functions for the list_*_sub improvements. Contextual-tuples performance gains are entirely client-side — they take effect as soon as you upgrade the runtime, no migration needed.

Performance

Contextual Tuples: Single-Round-Trip Plumbing

Single-call operations carrying contextual tuples used to be 12–20× slower than the same operation without them. The cause was the per-call setup that prepared the temporary view PostgreSQL uses to shadow melange_tuples:

  1. One pg_class / pg_namespace lookup to find the base schema
  2. One CREATE TEMP TABLE
  3. N × INSERT for the supplied tuples
  4. One CREATE TEMP VIEW joining the temp table to the base view
  5. One DROP TABLE cleanup at the end

That is 3 + N round trips before the actual permission query. For a typical contextual check with a handful of tuples, the plumbing dominated the total cost.
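The round-trip arithmetic can be made concrete with a small sketch. The statement texts and function name below are illustrative placeholders, not Melange's actual internals:

```go
package main

import "fmt"

// oldPlumbingStatements lists the SQL statements the pre-v0.8.2 path issued
// around the actual permission query, for n contextual tuples. The first
// 3 + n statements run before the query; the DROP runs after it.
func oldPlumbingStatements(n int) []string {
	stmts := []string{
		"SELECT ... FROM pg_class JOIN pg_namespace ...", // base-schema lookup
		"CREATE TEMP TABLE ctx_tuples (...)",
	}
	for i := 0; i < n; i++ {
		stmts = append(stmts, "INSERT INTO ctx_tuples VALUES ($1, $2, $3)")
	}
	return append(stmts,
		"CREATE TEMP VIEW melange_tuples AS SELECT ... UNION ALL SELECT ... FROM ctx_tuples",
		"DROP TABLE ctx_tuples", // cleanup after the permission query
	)
}

func main() {
	// A typical contextual check with a handful of tuples:
	fmt.Println(len(oldPlumbingStatements(4))) // 3 + 4 before the query, plus the DROP
}
```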

Two combined optimisations bring that to a single round trip:

  1. Inline contextual tuples as a VALUES literal. The temp view body now embeds the tuples directly: CREATE TEMP VIEW melange_tuples AS SELECT ... FROM base_view UNION ALL VALUES (...). The temp table and the per-row INSERTs disappear entirely.
  2. Memoise the base-schema lookup on the Checker. The schema doesn’t move during a Checker’s lifetime, so a mutex-guarded cache amortises it to a single query at startup instead of repeating it on every contextual call.

Measured on the OpenFGA contextual-tuples benchmark suite (Apple M2 Pro, Postgres 18-alpine via testcontainers):

| Operation (with contextual tuples) | Before | After   |
|------------------------------------|--------|---------|
| Check                              | 2.0 ms | ~1.3 ms |
| ListObjects                        | 3.9 ms | ~1.5 ms |
| ListUsers                          | 3.9 ms | ~1.5 ms |

The new shape ships as a reusable benchmark under test/openfgatests/benchmarks/contextual_tuples_plumbing_test.go so future plumbing changes can be measured against the same five candidate strategies (baseline / cached-schema / multi-row insert / array unnest / inline values) side-by-side.

Wider Fast Path for Recursive list_*_sub

The recursive list_subjects codegen had a binary fast/slow split for tuple-to-userset (TTU) parents. The cheap parent_closure CTE — which scans ancestors directly — was used only when the parent’s target relation was “simply resolvable” (no Userset, Recursive, Wildcard, Intersection, or Exclusion features). Anything else fell back to subject_pool + a per-row check_permission_internal, which is O(universe × check) and dominates the per-leg cost on deep models.

This release replaces the binary isParentRelationComplex guard with a finer classifier (classifyParentRelation + parentTargetNeedsSubjectPool). The closure path now also covers parent target relations whose nested TTUs are self-referential and use the same linking relation as the current walk — a common shape in role-hierarchy models — so they no longer get pushed to the slow path unnecessarily.
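The shape of the finer classifier can be sketched as follows. The `Relation` struct and decision order here are illustrative simplifications, not Melange's actual model representation:

```go
package main

import "fmt"

// Relation is a simplified view of a parent target relation's features.
type Relation struct {
	HasUserset, HasWildcard, HasIntersection, HasExclusion bool
	NestedTTULink   string // linking relation of a nested TTU, "" if none
	SelfReferential bool   // nested TTU points back at the same type
}

// parentTargetNeedsSubjectPool sketches the classifier: simply resolvable
// relations take the cheap parent_closure path, and — new in v0.8.2 — so do
// self-referential nested TTUs that reuse the current walk's linking
// relation (the role-hierarchy shape). Everything else still falls back to
// subject_pool + per-row check_permission_internal.
func parentTargetNeedsSubjectPool(r Relation, currentLink string) bool {
	if r.HasUserset || r.HasWildcard || r.HasIntersection || r.HasExclusion {
		return true // genuinely complex: slow path
	}
	if r.NestedTTULink == "" {
		return false // simply resolvable: closure path, as before
	}
	if r.SelfReferential && r.NestedTTULink == currentLink {
		return false // newly covered by the closure path
	}
	return true
}

func main() {
	role := Relation{NestedTTULink: "parent", SelfReferential: true}
	fmt.Println(parentTargetNeedsSubjectPool(role, "parent")) // false: closure path
}
```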

Multi-sample benchstat (-count=5 -benchtime=1s) on list_objects_expands_wildcard_tuple ListUsers:

  • Stages 1–2 (purely-recursive parents): −9.7% / −8.6% (p = 0.008)
  • Stage 3 ListUsers/3: −9.2% (p = 0.032)

A new test/openfgatests/testdata/intersection_recursive.yaml fixture acts as a regression net for the harder shapes the new classifier still routes to subject_pool.

Tooling

explaintest Walks Every Stage

Multi-stage OpenFGA tests previously emitted "Warning: test has N stages, processing only first stage" and silently dropped stages 2..N. This was a real blind spot — the slow assertions on the wildcard-expansion benchmark all live in stage 3, so EXPLAIN ANALYZE for them was simply unreachable through the existing tooling.

runTest now creates the store once (mirroring the openfgatests/runner.go lifecycle) and loops over every stage, writing each stage’s model + tuples and running its assertions. Output is tagged with the stage index, and a new --stage flag filters to a single stage when you want to focus.
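The stage loop and `--stage` filter can be sketched like this; the `Stage` type and function names are illustrative, not explaintest's actual code:

```go
package main

import "fmt"

// Stage is a simplified multi-stage OpenFGA test stage.
type Stage struct {
	Model      string
	Tuples     []string
	Assertions []string
}

// runAllStages sketches the new behaviour: every stage's model and tuples
// are written and its assertions run, with output tagged by stage index.
// stageFilter < 0 means "run all stages"; otherwise only the matching
// index runs (the --stage flag).
func runAllStages(stages []Stage, stageFilter int) []string {
	var ran []string
	for i, s := range stages {
		if stageFilter >= 0 && i != stageFilter {
			continue
		}
		// Real code would write s.Model and s.Tuples to the store here,
		// then EXPLAIN ANALYZE each assertion.
		for range s.Assertions {
			ran = append(ran, fmt.Sprintf("stage %d", i))
		}
	}
	return ran
}

func main() {
	stages := []Stage{
		{Assertions: []string{"check a"}},
		{Assertions: []string{"check b", "check c"}},
	}
	fmt.Println(runAllStages(stages, -1)) // all stages run, tagged by index
}
```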

Release Engineering

The release pipeline is now fully automated. Pushing a feat: or fix: commit to main opens a release-please PR; merging that PR fires a single GitHub Actions workflow that handles the full multi-module Go tag dance, signing & notarising the darwin binaries, building deb / rpm packages, pushing the multi-arch container image to ghcr.io, updating the Homebrew tap, and publishing the TypeScript client to npm.

Practical fallout for users:

  • Container image. This is the first release that publishes ghcr.io/pthm/melange with multi-arch (linux/amd64 + linux/arm64) tags :v0.8.2 and :latest.
  • Linux packages. Released artifacts now include .deb and .rpm packages alongside the existing tarballs.
  • Toolchain unification. Local dev and CI now share mise.toml as the single source of truth for go, goreleaser, node, pnpm, and quill versions.

Migration Notes

From v0.8.1

No breaking changes. Upgrade and run migrations to pick up the regenerated SQL functions:

melange migrate

If you use melange generate migration, regenerate your migration files to pick up the wider fast path:

melange generate migration \
  --schema melange/schema.fga \
  --output db/migrations \
  --git-ref main

The contextual-tuples performance work is entirely client-side and takes effect as soon as the new runtime is in place — no SQL changes needed.

Try It Out

# Install / upgrade CLI
brew install pthm/melange/melange

# Or pull the container image
docker pull ghcr.io/pthm/melange:v0.8.2

# Or install the .deb / .rpm package from the GitHub release

# Apply migrations
melange migrate

# Go runtime
go get github.com/pthm/melange/melange@v0.8.2

# TypeScript runtime
npm install @pthm/melange

Feedback

We welcome feedback and bug reports. Please open an issue with questions or feature requests.