
Authorization, Past the If-Statement

Almost every 'user saw data that wasn't theirs' incident is an authorization bug, not an authentication bug. The fix is to stop scattering permission checks through your handlers and make the answer to 'is this allowed?' come from one place, every time.

The authentication deep dive ended on a clean line: authentication answers "who are you?", authorization answers "what are you allowed to do?". Authentication gets most of the attention because that's where the cryptography is, but authorization is where the breaches are. "User A could see User B's invoices" is not a login bug: A logged in perfectly well. It's an authorization bug, and this class is among the most common serious vulnerabilities in real applications (broken access control ranks first in the OWASP Top 10 2021), because it hides in plain sight as ordinary-looking business logic.

The simplest version, and where it rots

You start with one check, and it's fine:

if (user.role !== "admin") return res.status(403).end();

Then the rules grow. Editors can edit but not delete. Users can edit their own posts. Billing admins can see invoices but not change roles. Six months later, the same permission is checked in fourteen handlers, each slightly differently, and one of them — the new endpoint someone added in a hurry — forgot the check entirely. That forgotten check is the breach.

Why scattered checks are dangerous, not just messy

When the permission logic lives inline in every handler, there is no single place to audit "who can delete an invoice?" — you have to read every handler and hope you found them all. Adding a new endpoint means remembering to add the check, and the failure mode of forgetting is silent: the endpoint works perfectly, for everyone, including people who shouldn't have access. Security logic whose failure mode is "silently allows everything" must not depend on a human remembering.

The rest of this dive is about getting the decision out of the handlers and into one place — and about choosing a model that can express your actual rules.

The three models: RBAC, ABAC, ReBAC

These sound like jargon, but each is just a different answer to "what does a permission depend on?"

RBAC — Role-Based

Permission depends on the user's role. "Admins can delete; editors can edit; viewers can read." You assign roles to users and permissions to roles. Simple, auditable, and enough for most apps. It breaks down when the answer depends on which specific thing you're touching.
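
A minimal sketch of that idea, assuming hypothetical role names and "resource:action" permission strings (neither comes from a specific framework): roles map to permission sets, and the check is a lookup.

```javascript
// Hypothetical role-to-permission table. Role and permission names are
// illustrative assumptions, not taken from any particular framework.
const ROLE_PERMISSIONS = new Map([
  ["admin", new Set(["post:read", "post:edit", "post:delete"])],
  ["editor", new Set(["post:read", "post:edit"])],
  ["viewer", new Set(["post:read"])],
]);

// RBAC check: the answer depends only on the user's roles, never on which
// specific post is being touched. Unknown roles grant nothing.
function roleAllows(user, permission) {
  return (user.roles ?? []).some(
    (role) => ROLE_PERMISSIONS.get(role)?.has(permission) ?? false
  );
}

// Because the table is plain data, "who can do X?" has a listable answer.
function rolesWith(permission) {
  return [...ROLE_PERMISSIONS]
    .filter(([, permissions]) => permissions.has(permission))
    .map(([role]) => role);
}
```

Note what the function cannot express: roleAllows has no resource argument, so "users can edit their own posts" is out of reach for roles alone.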

ABAC — Attribute-Based

Permission depends on attributes of the user, the resource, and the context. "A manager can approve an expense if it's under ₹50,000 and belongs to their department and it's within the fiscal year." Rules are expressions over attributes. Powerful, but harder to audit — you can't just list who has access.
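
The expense rule above could be sketched as a single predicate. The field names (departmentId, submittedAt, fiscalYearStart, fiscalYearEnd) are assumptions made for illustration, and "under" is read as strictly less than the limit.

```javascript
// ABAC sketch: the decision is an expression over attributes of the user,
// the resource, and the request context. All field names are assumptions.
const APPROVAL_LIMIT = 50_000; // rupees, taken from the rule in the text

function canApproveExpense(user, expense, context) {
  const isManager = user.role === "manager";
  const underLimit = expense.amount < APPROVAL_LIMIT;
  const sameDepartment = expense.departmentId === user.departmentId;
  // Date objects compare by timestamp with the relational operators.
  const inFiscalYear =
    expense.submittedAt >= context.fiscalYearStart &&
    expense.submittedAt <= context.fiscalYearEnd;
  return isManager && underLimit && sameDepartment && inFiscalYear;
}
```

The auditability cost is visible here: there is no table to print, only a function to run against each expense.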

ReBAC — Relationship-Based (the one Google uses)

Permission depends on a relationship between the user and the resource, often through a graph. "You can view this document if you're its owner, or a member of a folder it lives in, or in a group that folder was shared with." This is the Google Zanzibar model behind Drive's sharing. It shines when access flows through nesting and sharing (folders, orgs, teams) where neither a flat role nor a simple attribute captures "is this user connected to this resource in a way that grants access?"
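
To make the graph walk concrete, here is a small in-memory sketch. The tuple shape and the relation names (owner, viewer, member, parent) are illustrative assumptions; this is not Zanzibar's API or data model, only the same idea in miniature.

```javascript
// ReBAC sketch: relationships are stored as edges, and access is a graph walk.
const tuples = [
  { object: "doc:roadmap", relation: "parent", subject: "folder:plans" },
  { object: "folder:plans", relation: "viewer", subject: "group:eng" },
  { object: "group:eng", relation: "member", subject: "user:alice" },
  { object: "doc:roadmap", relation: "owner", subject: "user:bob" },
];

function subjectsOf(object, relation) {
  return tuples
    .filter((t) => t.object === object && t.relation === relation)
    .map((t) => t.subject);
}

// True if `subject` is the user, or a group that (transitively) contains them.
function includesUser(subject, user, seen) {
  if (subject === user) return true;
  if (seen.has(subject)) return false; // guard against membership cycles
  seen.add(subject);
  return subjectsOf(subject, "member").some((s) => includesUser(s, user, seen));
}

// A user can view an object if they own it, were granted viewer on it
// (directly or through a group), or can view a folder that contains it.
function canView(user, object, seen = new Set()) {
  if (seen.has(object)) return false; // guard against folder cycles
  seen.add(object);
  const granted = [...subjectsOf(object, "owner"), ...subjectsOf(object, "viewer")];
  if (granted.some((s) => includesUser(s, user, new Set()))) return true;
  return subjectsOf(object, "parent").some((p) => canView(user, p, seen));
}
```

With this data, user:alice reaches doc:roadmap through the folder and the group, while user:bob reaches it as owner but has no path to folder:plans.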

The progression is about expressiveness. Most apps start with RBAC and should — it's the easiest to reason about and audit. You reach for ABAC when decisions depend on the data (amounts, ownership, time), and for ReBAC when access flows through relationships and nesting (sharing, hierarchies). They compose: a real system is often "RBAC for coarse roles, plus attribute checks for the fine-grained 'is it yours and under the limit' rules."

Decision: Start with RBAC; add attribute and relationship rules only where the role alone can't answer the question.

RBAC's strength is that "who can do X?" has a listable answer — you can show an auditor a table. The moment you go to ABAC/ReBAC you trade that auditability for expressiveness: the answer becomes "run this rule against this data," which you can't enumerate by eye. So don't reach for the powerful model preemptively. Use the simplest model that expresses the real rule, and layer the expressive ones in only at the specific decisions that need them. Over-engineering authorization is how you end up with a policy engine nobody understands and bugs nobody can find.

Centralize the decision

Whatever model you pick, the structural fix is the same: the decision "is this allowed?" comes from one component, and every handler asks it. The handler's job is to enforce, not to decide.

[Figure: PEP / PDP. Request -> Handler (PEP, enforces) -> can(user, action, resource) (PDP, decides) -> Decision (allow / deny). Handlers enforce; a single policy decision point decides. One place to audit, one place to fix.]

This is the policy enforcement point (PEP) vs policy decision point (PDP) split. The handler is the enforcement point: it asks can(user, "invoice:delete", invoice) and returns 403 if the answer is no. The decision point is the one function (or service, or policy file) that holds all the rules. Now "who can delete an invoice?" has exactly one place to read, one place to change, and one place to test. In the handler below, authorize plays that role, throwing when the answer is deny.

// Handler only enforces:
async function deleteInvoice(req, res) {
  const invoice = await invoices.byId(req.params.id);
  if (!invoice) return res.status(404).end();
  await authorize(req.user, "invoice:delete", invoice); // throws 403 if denied
  await invoices.delete(invoice.id);
  res.status(204).end();
}

Every endpoint follows the identical shape: load the resource, authorize, act. The check can no longer be subtly different in each handler, and a code reviewer can scan for "does this handler call authorize before mutating?" as a single, mechanical rule.
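
The decision side is not shown above, so here is one way it might look. This is a sketch under assumptions: the ForbiddenError class, the POLICIES table, and the two example rules are hypothetical, and authorize is written as a thin throwing wrapper around can.

```javascript
// Error the enforcement point can translate into an HTTP 403.
class ForbiddenError extends Error {
  constructor(action) {
    super(`Not allowed: ${action}`);
    this.name = "ForbiddenError";
    this.status = 403;
  }
}

// Policy decision point: every rule lives in this one table.
// The rules themselves are hypothetical examples.
const POLICIES = new Map([
  ["invoice:read", (user, invoice) => user.role === "admin" || invoice.ownerId === user.id],
  ["invoice:delete", (user) => user.role === "admin"],
]);

// Deny by default: an action with no rule is never allowed.
function can(user, action, resource) {
  const rule = POLICIES.get(action);
  return rule ? rule(user, resource) === true : false;
}

// Thin wrapper for handlers: returns quietly on allow, throws on deny.
function authorize(user, action, resource) {
  if (!can(user, action, resource)) throw new ForbiddenError(action);
}
```

The deny-by-default branch matters as much as the rules: a new action that nobody wrote a policy for fails closed instead of silently allowing everyone.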

404 vs 403, and the leak in the gap

Notice the order above: we load the invoice and return 404 before the auth check. That's a subtle information leak — a 403 on an ID that exists, vs 404 on one that doesn't, tells an attacker which invoice IDs are real. For sensitive resources, return 404 for "not found" and for "exists but you can't see it," so the two are indistinguishable. Decide this deliberately per resource; the default of "404 then 403" quietly enumerates your data for an attacker.
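
One way to make that choice explicit is to route both outcomes through a single helper. The helper name, the hideExistence option, and the injected can function are assumptions for illustration.

```javascript
// Sketch: collapse "missing" and "forbidden" into one outcome for sensitive
// resources. `can` is passed in so the snippet stands alone.
function resolveAccess(user, action, resource, can, { hideExistence = true } = {}) {
  if (!resource) return { status: 404 };
  if (can(user, action, resource)) return { status: 200, resource };
  // The caller cannot tell this apart from the "not found" branch above
  // when hideExistence is on.
  return { status: hideExistence ? 404 : 403 };
}
```

For the two cases to be truly indistinguishable, the response body and headers need to match as well as the status code.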

The bug that matters most: missing object-level checks

There's a hierarchy of authorization, and most breaches happen at the bottom of it:

  1. Function-level: can this user call this endpoint at all?

    "Is this user an admin?" The coarse check. Easy to get right, and the one people remember.

  2. Object-level: can this user act on THIS specific record?

    "Is this invoice theirs?" This is the one that gets forgotten, and forgetting it is the classic IDOR (Insecure Direct Object Reference) bug: the endpoint checks you're a logged-in user, fetches the invoice by the ID in the URL, and hands it over — without checking the invoice belongs to you. Change the ID in the URL, read someone else's data.

  3. Field-level: can this user see THIS column of this record?

    "Can a support agent see the customer's name but not their card number?" The finest grain, often handled by shaping the response per role.

The IDOR test you should run on every endpoint

For every endpoint that takes a resource ID, ask: if I log in as user B and pass user A's resource ID, what happens? If the answer isn't a clean 403/404, you have a data leak. This single question, applied mechanically to every /:id route, finds the most common serious vulnerability in web apps. What makes the bug sneaky is that the function-level check ("are you logged in?") passes, so the request looks completely legitimate.
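
That question can be turned into a loop over a route list. In this sketch, request is a hypothetical helper with the shape (session, method, path) => Promise<{ status }>, and the route and ID structures are assumptions; in a real suite it would wrap whatever HTTP test client the project already uses.

```javascript
// Sketch of the "log in as B, pass A's ID" probe, applied to every /:id route.
async function findIdorLeaks(routes, victimIds, attackerSession, request) {
  const leaks = [];
  for (const route of routes) {
    // Substitute user A's real ID into the route template.
    const id = encodeURIComponent(victimIds[route.resource]);
    const path = route.path.replace(":id", id);
    const { status } = await request(attackerSession, route.method, path);
    // Anything other than a clean 403/404 is reported for a human to look at.
    if (status !== 403 && status !== 404) {
      leaks.push({ method: route.method, path, status });
    }
  }
  return leaks;
}
```

A test would then assert that the returned list is empty, so a new /:id route added without an object-level check fails the build.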

Multi-tenancy: when one bug leaks everyone

If your app serves multiple customers (tenants) from one database — the normal SaaS shape — authorization gets a hard floor: no query may ever cross a tenant boundary. Tenant A must never, under any bug, see tenant B's rows. A single missing WHERE tenant_id = ? is not a small bug here; it's a cross-customer data breach.

You don't want this to depend on every developer remembering the tenant_id filter on every query. The defenses, strongest last:

Filter in application code

Every query includes WHERE tenant_id = :current. Simplest, but one forgotten filter leaks data. Relies on discipline and review — exactly what we said security shouldn't rely on.

Enforce in the database (RLS)

Postgres Row-Level Security: you attach a policy to the table that says "rows are only visible where tenant_id matches the session's tenant," set the tenant on the connection, and the database itself refuses to return other tenants' rows — even if a developer writes a query that forgot the filter. The guarantee moves from "we remembered" to "the database enforces it."
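
A sketch of what that can look like from application code. The table name invoices, the uuid tenant_id column, the setting name app.current_tenant, and the withTenant helper are all assumptions; client is anything with a node-postgres-style query(text, params) method.

```javascript
// Policy SQL, run once as a migration. Names are illustrative assumptions.
const TENANT_POLICY_SQL = `
  ALTER TABLE invoices ENABLE ROW LEVEL SECURITY;
  CREATE POLICY tenant_isolation ON invoices
    USING (tenant_id = current_setting('app.current_tenant')::uuid);
`;

// Run `work` inside a transaction with the tenant set for that transaction only.
async function withTenant(client, tenantId, work) {
  await client.query("BEGIN");
  try {
    // The third argument `true` makes the setting transaction-local, so a
    // pooled connection does not carry one tenant's context into the next request.
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Two details to keep in mind with this approach: current_setting raises an error when the setting was never set, which fails closed, and Postgres lets table owners and superusers bypass policies by default, so the application should connect as a separate, unprivileged role (or the table should use FORCE ROW LEVEL SECURITY).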

Decision: Set the tenant context once per request and let Row-Level Security enforce isolation in the database.

Application-level filtering is one forgotten WHERE away from a breach, and that forgotten WHERE will pass every test that only uses one tenant's data. Row-Level Security makes the isolation a property of the data store, not of each query: even a buggy or malicious query can't escape its tenant. The cost is operational complexity (you must reliably set the tenant on every connection, and connection pooling makes that fiddly) and a little query overhead. For real multi-tenant data, that's a price worth paying, because the failure mode you're buying out of is "one customer reads another customer's data." A strong test for this: write an integration test that, as tenant B, tries every endpoint with tenant A's IDs and asserts it gets nothing.

The confused deputy: when your own server is the leak

A subtler class: the confused deputy. Your server holds powerful credentials — a database connection, an internal API key, an S3 token — and acts on behalf of a user. If you let the user's input steer that power without re-checking it against their permissions, your trusted server becomes their deputy, using its authority to do something they couldn't do themselves.

The classic shape: an endpoint takes a file key and streams it from S3 using the server's S3 credentials. The server is allowed to read every file; the user is not. If the endpoint doesn't verify this user may read this file before using its own powerful S3 access, the user passes any key and the server dutifully fetches it. The server wasn't tricked into revealing its credentials — it was tricked into using them on the attacker's behalf. The fix is always the same: authorize the user's right to the resource before you spend the server's authority on it.
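
The fixed shape of that endpoint might look like the sketch below. The files metadata store, the storage wrapper, and the "file:read" action name are hypothetical dependencies injected for illustration; none of them is a real S3 SDK call.

```javascript
// Sketch: check the user's right to the file before spending the server's
// privileged storage access on their behalf.
async function streamUserFile(user, fileKey, { files, storage, can }) {
  // Look up our own record for this key: who owns it, which tenant, and so on.
  const record = await files.byKey(fileKey);
  if (!record || !can(user, "file:read", record)) {
    return { status: 404 }; // unknown and forbidden look the same to the caller
  }
  // Only now use the server's credentials, and fetch by the key stored in
  // our record rather than by the raw user input.
  const body = await storage.get(record.key);
  return { status: 200, body };
}
```

The ordering is the whole point: storage.get is unreachable unless the decision function has already said yes for this user and this record.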

The one idea to take away

Authorization is where data leaks, and almost always because the check was scattered, inconsistent, or missing — not because the model was too weak. So make the decision come from one place that every handler asks (can(user, action, resource)), pick the simplest model that expresses your real rules (RBAC, then attributes, then relationships), and never forget the object-level check: "is this specific record actually theirs?" In multi-tenant systems, push isolation down into the database with Row-Level Security so a forgotten filter can't breach a tenant boundary. And whenever your server uses its own privileged credentials on a user's behalf, re-check the user's right first — or you've built a confused deputy.

Test yourself

Questions: say the answer out loud before you read it. If you can't, the chapter isn't done.

Q: Why is 'user A saw user B's data' an authorization bug, not an authentication bug?

Because user A authenticated correctly — they really are who they claim to be. What failed is the decision about what they're allowed to do: the system let them act on a resource that wasn't theirs. Authentication is the login; authorization is the per-action permission check, and that's the layer that leaked.

Q: What's the difference between RBAC, ABAC, and ReBAC, and when do you reach for each?

RBAC keys permission off the user's role ("admins can delete") — simple and auditable, the right default. ABAC keys off attributes of user/resource/context ("approve if under ₹50k and in your department") — reach for it when the decision depends on the data. ReBAC keys off a relationship, often a graph ("you can view it if you're in a group the folder was shared with") — reach for it when access flows through sharing and nesting, like Google Drive. Start with RBAC and layer the others in only where a role can't answer the question.

Q: What does centralizing authorization with a PEP/PDP split actually buy you?

The handler becomes a pure enforcement point that calls one decision function (authorize(user, action, resource)); all the rules live in one decision point. That gives you exactly one place to read "who can do X?", one place to change it, one place to test it, and a mechanical review rule ("does this handler authorize before mutating?"). Scattered inline checks have no single source of truth and fail silently when someone forgets one.

Q: What is an IDOR / missing object-level check, and how do you test for it?

It's when an endpoint verifies you're a logged-in user (function-level) but not that this specific record belongs to you (object-level) — so changing the ID in the URL returns someone else's data. Test it mechanically: on every /:id route, log in as user B, pass user A's ID, and assert you get a clean 403/404. It's sneaky precisely because the request looks legitimate and the coarse "are you logged in?" check passes.

Q: In a multi-tenant app, why prefer Row-Level Security over filtering in application code?

Application filtering depends on every developer remembering WHERE tenant_id = ? on every query, and one forgotten filter is a cross-customer breach that passes any single-tenant test. Row-Level Security makes isolation a property of the database: you set the tenant on the connection and Postgres refuses to return other tenants' rows even for a query that forgot the filter. It moves the guarantee from "we remembered" to "the engine enforces it," at the cost of some operational complexity.

Q: What is a confused-deputy bug?

Your server holds privileged credentials (DB, S3, internal API) and acts on a user's behalf. If it lets the user's input steer that power without re-checking the user's own permissions, the server uses its authority to do something the user couldn't — e.g. streaming any S3 file by key because the server can read all files. The server wasn't tricked into leaking its credentials; it was tricked into using them. Fix: authorize the user's right to the resource before spending the server's authority.

Q: Why can returning 403 vs 404 leak information, and what do you do about it?

A 403 ("exists but forbidden") on a given ID, versus 404 ("doesn't exist"), tells an attacker which IDs are real — they can enumerate your resources by watching which response they get. For sensitive resources, return 404 for both "not found" and "exists but you can't see it," so the two are indistinguishable. Decide this deliberately per resource rather than defaulting to "404 then 403," which quietly maps out your data.
