Performance

Melange is designed for high-performance permission checking at scale. This page explains the performance characteristics you can expect and how to optimize your setup.

Performance Overview

Melange achieves sub-millisecond permission checks through:

  1. Specialized SQL functions: Purpose-built functions generated for each relation
  2. Precomputed closures: Role hierarchies resolved at migration time, not runtime
  3. Index utilization: Generated SQL optimized for efficient index scans
  4. In-database execution: No network round-trips for authorization

Pattern Complexity Benchmarks

All benchmarks run on Apple M2 Pro with PostgreSQL 16. These results show performance across different authorization patterns from the OpenFGA compatibility test suite.

Performance by Pattern Type

PatternCheckListObjectsListUsers
Direct assignment ([user])~155 µs~165 µs~155 µs
Computed userset (viewer: owner)~160 µs~165 µs~160 µs
Union ([user] or owner)~155 µs~165 µs~155 µs
Tuple-to-userset (viewer from parent)~170 µs~180 µs~160 µs
Exclusion (writer but not blocked)~170 µs~180 µs~170 µs
Intersection (writer and editor)~170 µs~205 µs~250 µs
Wildcards ([user:*])~155 µs~160 µs~155 µs
Userset references ([group#member])~170 µs~320 µs~180 µs

Complex Pattern Performance

More complex patterns combining multiple features:

PatternCheckListObjectsListUsers
TTU + computed userset~175 µs~190 µs~170 µs
TTU + TTU (nested)~200 µs~190 µs~190 µs
Three-prong relation~230 µs~475 µs~330 µs
Computed user indirect ref~190 µs~270 µs~230 µs
Nested TTU with intersection~210 µs~240 µs~265 µs
Nested TTU with exclusion~200 µs~210 µs~275 µs

Expensive Operations

Some patterns have significantly higher latency:

PatternCheckListObjectsListUsersNotes
Contextual tuples-~3.2 ms~3.1 msTemporary table overhead
Wildcard expansion (worst case)~850 µs~9.2 ms~21 msFull enumeration required
Deep intersection chains~175 µs~450 µs~500 µsMultiple path evaluation

Validation Performance

Invalid requests are rejected immediately without database queries:

ValidationLatency
Invalid relation~25 ns
Invalid type~30 ns
Invalid user format~30 ns
Invalid contextual tuple~25 ns

Scale Benchmarks

These benchmarks test performance at different tuple volumes using a GitHub-like authorization model.

Test Schema

The scale benchmarks use a model with organizations, repositories, and pull requests:

type organization
  relations
    define owner: [user]
    define admin: [user] or owner
    define member: [user] or admin
    define can_read: member

type repository
  relations
    define org: [organization]
    define reader: [user] or can_read from org
    define writer: [user] or reader
    define can_read: reader
    define can_write: writer
    define can_review: can_read but not author

type pull_request
  relations
    define repo: [repository]
    define author: [user]
    define can_read: can_read from repo
    define can_review: can_read from repo but not author

Test Data Configuration

ScaleUsersOrgsRepos/OrgMembers/OrgPRs/RepoTotal ReposTotal PRs~Tuples
1K1005102010505001,150
10K5001050502050010,00021,000
100K2,00020100100502,000100,000204,000
1M10,0005020020010010,0001,000,0002,020,000

Tuples include: org memberships, repo→org relationships, PR→repo relationships, and PR→author relationships.

Check Operations

Check operations execute specialized SQL functions and scale with O(1) constant time regardless of tuple count:

OperationDescription1K10K100K1MScaling
Direct Membershipusercan_readorganization (direct tuple lookup)~246 µs~196 µs~210 µs~202 µsO(1)
Inherited Permissionusercan_readrepository (via org membership)~319 µs~318 µs~309 µs~307 µsO(1)
Exclusion Patternusercan_reviewpull_request (reader but not author)~391 µs~388 µs~393 µs~392 µsO(1)
Denied PermissionNon-member user checking org access (expected: denied)~193 µs~208 µs~235 µs~210 µsO(1)

Key insight: All check operations maintain constant time regardless of dataset size. This is achieved through specialized SQL code generation that eliminates runtime schema interpretation.

List Operations

List operations find all objects a subject can access (or all subjects with access to an object). Performance varies by relation complexity:

ListObjects (find objects a user can access):

OperationDescription1K10K100K1MScaling
List Accessible ReposAll repositories user can read (via org membership)~2.9 ms~26 ms~102 ms~531 msO(n)
List Accessible OrgsAll organizations user is member of (direct)~184 µs~200 µs~195 µs~194 µsO(1)
List Accessible PRsAll pull requests user can read (via repo→org)~3.2 ms~30 ms~113 ms~605 msO(n)

ListSubjects (find users with access):

OperationDescription1K10K100K1MScaling
List Org MembersAll users who can read an organization~178 µs~203 µs~255 µs~330 µsO(log n)
List Repo ReadersAll users who can read a repository (via org)~6.7 ms~27 ms~117 ms~682 msO(n)
List Repo WritersAll users who can write to a repository (direct)~176 µs~184 µs~184 µs~184 µsO(1)

Why the variation? Operations on direct relations (org membership, repo writers) achieve O(1) scaling. Operations that traverse parent relationships (repo readers via org, PRs via repo→org) scale linearly with the number of potential matches because they must join across relationship hierarchies.

Parallel Operations

Permission checks are fully parallelizable since each check is independent:

OperationTime per OpAllocations
Parallel Direct Check~55 µs32 allocs/op
Parallel Inherited Check~70 µs33 allocs/op

At 12 parallel workers, throughput increases ~4x compared to sequential checks.

Caching

The optional in-memory cache provides dramatic speedups for repeated checks:

ScenarioLatencySpeedup
Cold cache (database query)~336 µsbaseline
Warm cache (memory lookup)~79 ns~4,250x faster

Enable caching in your application:

cache := melange.NewCache(
    melange.WithTTL(time.Minute),
    melange.WithMaxSize(10000),
)
checker := melange.NewChecker(db, melange.WithCache(cache))

When to use caching:

  • High read-to-write ratio (permissions checked frequently, changed rarely)
  • Acceptable staleness window (typically 30s-5min depending on application)
  • Memory budget available for cache storage

When to skip caching:

  • Permissions change frequently
  • Strict real-time consistency required
  • Memory-constrained environments

Optimizing Performance

1. Design Efficient Schemas

Schema design has the biggest impact on query performance. The benchmarks show clear patterns:

Performance tiers by pattern:

TierPatternsTypical Check Latency
FastDirect, computed userset, union, wildcards150-165 µs
MediumTTU, exclusion, userset references165-180 µs
SlowIntersection, nested TTU, complex multi-path180-250 µs
Very slowDeep intersection chains, contextual tuples300+ µs

Minimize intersections: Intersection is the most expensive set operation, especially for ListObjects and ListUsers:

# Slower: Intersection requires evaluating multiple paths
type document
  relations
    define viewer: writer and editor

# Faster: Union is much cheaper
type document
  relations
    define viewer: writer or editor

Prefer shallow hierarchies: Deep parent traversals increase query complexity:

# Better: Direct assignment with union
type document
  relations
    define viewer: [user, team#member]

# Slower: Deep parent chain
type document
  relations
    define viewer: viewer from folder
type folder
  relations
    define viewer: viewer from workspace

Use direct relations where possible: define viewer: [user] (~155 µs) is faster than nested computed usersets (~190 µs).

Be cautious with wildcards in ListObjects/ListUsers: While Check with wildcards is fast (~155 µs), wildcard expansion for list operations can be expensive (up to 9-21 ms in worst cases).

2. Avoid Runtime Contextual Tuples

Contextual tuples add ~3ms overhead per operation due to temporary table setup. Use stored tuples when possible:

// Slow: ~3.2ms with contextual tuples
checker.Check(ctx, user, "viewer", doc,
    melange.WithContextualTuples(temporaryPermissions))

// Fast: ~160µs with stored tuples
checker.Check(ctx, user, "viewer", doc)

If you need contextual tuples frequently, consider batching multiple checks together.

3. Index Your Tuples View

Proper indexing of your melange_tuples view source tables is critical:

-- Object-based lookups (most common pattern)
CREATE INDEX idx_org_members_lookup
    ON organization_members (organization_id, role, user_id);

-- Expression indexes for text ID conversion
CREATE INDEX idx_org_members_text
    ON organization_members ((organization_id::text), (user_id::text));

Without expression indexes, PostgreSQL must perform sequential scans when your view converts integer IDs to text.

See the Tuples View documentation for complete indexing guidance.

4. Use Appropriate Scaling Strategies

ScaleRecommended Approach
< 10K tuplesRegular view + source table indexes
10K-100K tuplesRegular view + expression indexes + caching
100K-1M tuplesRegular view + expression indexes + caching, or materialized view
> 1M tuplesDedicated table with trigger sync + caching

See Tuples View Scaling Strategies for implementation details.

5. Batch List Operations

For list operations at scale, consider pagination or streaming:

// Stream results instead of loading all at once
objects, err := checker.ListObjects(ctx, user, "viewer", "document",
    melange.WithLimit(100),
    melange.WithCursor(lastCursor),
)

6. Monitor Query Performance

Use EXPLAIN ANALYZE to identify slow queries:

EXPLAIN ANALYZE
SELECT check_permission('user', '123', 'viewer', 'document', '456');

Look for:

  • Sequential scans: Add indexes to eliminate these
  • High row estimates: May indicate missing statistics (ANALYZE your tables)
  • Nested loops with high iterations: Consider schema simplification

Performance Comparison

How does Melange compare to alternatives?

ApproachCheck LatencyConsistencyScaling
Melange (in-database)150-250 µsTransactionalO(1) for checks
OpenFGA (external service)~1-5 msEventualO(1) for checks
Application-level RBAC~10-100 µsVariesO(roles)
Direct SQL queries~50-200 µsTransactionalO(query complexity)

Melange provides the consistency guarantees of transactional SQL with performance comparable to simpler RBAC systems, while supporting the full expressiveness of Zanzibar-style authorization.

Summary

Typical performance expectations:

  • Check operations: 150-250 µs for most patterns
  • ListObjects: 165-300 µs for simple patterns, 300-500 µs for complex patterns
  • ListUsers: 155-330 µs for most patterns
  • Validation errors: ~25-30 ns (instant rejection, no database hit)
  • Cache hits: ~79 ns (~2000x faster than database)

Patterns to avoid for latency-sensitive paths:

  • Deep intersection chains (use union where possible)
  • Contextual tuples (use stored tuples)
  • Wildcard expansion in ListObjects/ListUsers
  • Deeply nested TTU chains

Further Reading