How Should You Serve Your Access Tokens: JWTs, Phantom, or Split?
OAuth is an authorization framework. It enables parties (clients) to access resources that belong to users by presenting an access token that represents the access they have been granted. Properly designing these access tokens is essential to a system's security, yet their design is often overlooked. Developers tend to focus more on properly implementing OAuth or OpenID Connect flows. While secure flows are essential, they should be complemented by a keen focus on access tokens.
Access tokens can be classified in different ways. For example, the "bearer tokens vs. proof-of-possession tokens" classification focuses on which client is eligible to utilize a given token. Another classification focuses on the format of tokens — by-reference (opaque) vs. by-value tokens. The latter is usually represented in the OAuth world by JSON Web Tokens (JWTs).
Opaque tokens are just that: opaque strings that do not have any meaning to the client. They are simply used to access some resources and are not useful for any other purpose. When presented with an access token, the receiver (a resource server) must verify it and get the token's associated data to perform authorization decisions. When it receives an opaque token, it must call the issuer to verify the token and get this data. In contrast, when a by-value token is used, the receiver already has all the information in the token and can perform authorization without needing to call the issuer. That's why JWTs (or, more precisely, tokens in the JSON Web Signature (JWS) format) became so popular in the OAuth world.
The Problem of By-Value Tokens
By-value tokens are helpful for APIs that receive them but have some inherent issues. Unless the token is encrypted (and usually it is not, as encryption is costly in terms of resources and management), the content of a JWT is readable by anyone who happens to possess it. This means that the token can leak information about the user, which may become pretty serious if it contains Personally Identifiable Information (PII).
Imagine a healthcare service that places a user's social security number or sensitive health information in the access token. That information is then readable by anyone who has the JWT. It might be a malicious actor who manages to intercept the JWT. Or, it could also be an application, a legitimate OAuth client, that harvests user information.
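To make the readability problem concrete, here is a minimal Python sketch that reads the claims of a signed JWT without any key. The token is made up for illustration: the claim names (`sub`, `ssn`), their values, and the placeholder signature are assumptions, not taken from a real system.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_claims_without_key(jwt: str) -> dict:
    """Return the payload of a signed (not encrypted) JWT. No key is needed."""
    _header, payload, _signature = jwt.split(".")
    return json.loads(b64url_decode(payload))


# Build a made-up token. The signature is a placeholder because reading
# the claims never touches it.
header = b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode())
payload = b64url_encode(
    json.dumps({"sub": "patient-17", "ssn": "000-00-0000"}).encode()
)
token = f"{header}.{payload}.placeholder-signature"

claims = read_claims_without_key(token)
print(claims["ssn"])
```

Nothing in this snippet verifies the signature, which is the point: the signature protects integrity, not confidentiality, so any party holding the token can read its claims.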
The access token is intended for the API — the resource server. The data contained in that token is the data the API needs to perform authorization decisions. But since the data is available to the OAuth client, integrators tend to start using it for their own purpose in their applications. This means that the content of an access token becomes part of your API's contract, and you can no longer introduce breaking changes. For example, removing a claim from the token or changing its length might break applications that integrate with your API. This is an undesirable situation, as you might unwillingly harm your customers' businesses.
PII is not the only potentially leaked data that you should worry about. The access token's claims might reveal information about your infrastructure. This data is something a hacker might exploit to breach your defenses.
The above issues mean that by-value tokens should only be used inside your infrastructure, where the token is not exposed to the public Internet. Opaque tokens are also a better fit when tokens are stored in cookies (e.g., for web apps or Single Page Applications that utilize a Token Handler), since they are short enough to stay within browser cookie size limits. Still, JWTs are so convenient that it's hard to give them up. This is where the phantom token and split token patterns come in.
The Phantom Token
The phantom token pattern combines the best of two worlds: opaque and by-value tokens. In this pattern, the authorization server issues opaque tokens to the OAuth client, so no information can leak from the token. When the client calls the API, an API gateway comes in (as it is best practice to let all incoming traffic come through an API gateway) and performs a token introspection, requesting the `application/jwt` media type. As a result of the introspection, the gateway receives a JWT that corresponds to the opaque token and forwards that JWT to the downstream services that handle the request. Thanks to that, all services used to process the request deal with a JWT. They can then use its contents to perform authorization decisions without the need to call the authorization server to get the opaque token's data.
Introspection involves fast local calls from the API gateway to the authorization server. These only occur occasionally as the results should be cached for subsequent requests with the same access token. Such an approach further improves the performance of this pattern.
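The gateway side of this flow can be sketched roughly as follows. This is an illustration under stated assumptions, not a real gateway plugin: `PhantomTokenGateway`, `jwt_for`, and the `introspect` callable are hypothetical names, and the callable stands in for the HTTP call to the introspection endpoint, returning the JWT and its expiry time.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical signature: takes an opaque token and returns (jwt, expires_at)
# for an active token, or None for an unknown or inactive one.
Introspect = Callable[[str], Optional[Tuple[str, float]]]


class PhantomTokenGateway:
    def __init__(self, introspect: Introspect,
                 clock: Callable[[], float] = time.time) -> None:
        self._introspect = introspect
        self._clock = clock
        self._cache: Dict[str, Tuple[str, float]] = {}

    def jwt_for(self, opaque_token: str) -> Optional[str]:
        """Return the JWT to forward downstream, or None to reject the request."""
        cached = self._cache.get(opaque_token)
        if cached is not None:
            jwt, expires_at = cached
            if self._clock() < expires_at:
                return jwt  # cache hit: no call to the authorization server
            del self._cache[opaque_token]  # drop the expired entry

        result = self._introspect(opaque_token)
        if result is None:
            return None
        jwt, expires_at = result
        if self._clock() >= expires_at:
            return None
        self._cache[opaque_token] = result
        return jwt


# Stand-in for the authorization server that records every call it receives.
calls = []


def fake_introspect(opaque_token: str) -> Optional[Tuple[str, float]]:
    calls.append(opaque_token)
    if opaque_token == "opaque-abc":
        return ("header.payload.signature", 1000.0)
    return None


gateway = PhantomTokenGateway(fake_introspect, clock=lambda: 500.0)
first = gateway.jwt_for("opaque-abc")    # triggers introspection
second = gateway.jwt_for("opaque-abc")   # served from the cache
unknown = gateway.jwt_for("opaque-zzz")  # rejected
```

In this sketch the second request with the same opaque token never reaches the stand-in authorization server, which is the effect the caching is meant to have.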
The Split Token
The split token pattern is in many ways similar to the previously described phantom token. At least the result is the same — the OAuth client deals with an opaque token, and the API's services handle JWTs. In this pattern, though, the API gateway doesn't have to call the introspection endpoint to get a JWT. When the authorization server issues the JWT, it splits it into two parts: one consists of the header and payload of the JWT, and the other is the token's signature. The signature part is returned to the client and used as the opaque token. The header and payload are sent to the API gateway where they are cached, with the signature's hash used as the cache key.
When the client calls the API, it sends the signature part of the token. The API gateway hashes it, looks up the corresponding header and payload in the cache, and, if found, glues the pieces back together to form a JWT. The JWT is thus reconstructed without the need to contact the authorization server.
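The split and reassembly steps can be sketched as below. The names (`split_token`, `signature_key`, `SplitTokenGateway`) are hypothetical, SHA-256 is an assumed choice of hash, and an in-memory dictionary stands in for the gateway's cache.

```python
import hashlib
from typing import Dict, Optional, Tuple


def signature_key(signature: str) -> str:
    """Cache key: SHA-256 hex digest of the signature part (assumed hash)."""
    return hashlib.sha256(signature.encode("ascii")).hexdigest()


def split_token(jwt: str) -> Tuple[str, str]:
    """Authorization-server side: return ("header.payload", "signature")."""
    head_and_payload, signature = jwt.rsplit(".", 1)
    return head_and_payload, signature


class SplitTokenGateway:
    def __init__(self) -> None:
        # Maps the signature's hash to "header.payload".
        self._cache: Dict[str, str] = {}

    def store(self, head_and_payload: str, signature_hash: str) -> None:
        """Called when the authorization server issues a token."""
        self._cache[signature_hash] = head_and_payload

    def reassemble(self, opaque_token: str) -> Optional[str]:
        """Glue the cached header and payload back onto the client's signature."""
        head_and_payload = self._cache.get(signature_key(opaque_token))
        if head_and_payload is None:
            return None  # cache miss: the request cannot be served
        return f"{head_and_payload}.{opaque_token}"


# Issue: the client only ever receives the signature part.
jwt = "aGVhZGVy.cGF5bG9hZA.c2lnbmF0dXJl"
head_and_payload, opaque = split_token(jwt)
gateway = SplitTokenGateway()
gateway.store(head_and_payload, signature_key(opaque))

# API call: the gateway rebuilds the JWT without contacting the issuer.
rebuilt = gateway.reassemble(opaque)
missing = gateway.reassemble("some-other-signature")
```

Note that the cache in this sketch never holds the signature itself, only its hash, so the cache contents alone are not enough to rebuild a token.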
When to Use What
If you're wondering when to use the approaches described here, the rule is quite simple: if your tokens leave your infrastructure, then go with the phantom token approach. This should be your default choice whenever you have OAuth clients that connect to your API over the public Internet, regardless of whether these are first-party or third-party clients. It's as simple as that.
You should consider implementing the split token approach if you have any of these scenarios:
You use a highly distributed API gateway. In such a scenario, when phantom tokens are used, requests might end up waiting for the token introspection, even when the gateway caches the response from the authorization server. This is because every new request might end up calling a different instance or cluster of the gateway.
Latency is crucial in your system, and you want to avoid the API gateway having to contact the authorization server's introspection endpoint at all, even if the gateway caches the resulting JWT.
For security reasons, you don't want JWT access tokens to be stored in the API gateway's cache. In the split token approach, the gateway's cache does not contain the whole JWT (only the header plus payload and the signature's hash). The signature cannot be recovered from its hash, so even if the cache is compromised, the attacker cannot assemble a valid token to call the API.
I've already mentioned the primary purpose of using either the phantom token or the split token approach — you end up with a system where you are sure that no data from the token is leaked outside your infrastructure. However, your services still benefit from using by-value tokens. This allows you to be a bit more lenient about what to put into the access token, and it's much simpler to modify the token's contents. The importance of not using JWTs publicly cannot be overstated. Not only does it improve the security and privacy of your system, but it also prevents you from causing outages in your integrators' applications.
Additionally, when you implement the split token approach, you are sure that anyone with the opaque token must still make requests through your API gateway. This might protect you from abuse by malicious actors inside your organization. Since neither the client nor the API gateway possesses the whole token, the two must act together to make a successful request to the API. (Similar functionality can be achieved with the phantom token approach if you ensure that only the API gateway can make introspection requests.) Moreover, when you use split tokens, you don't need the authorization server to be online when the API is called, which might be crucial to some projects.
No solution comes without downsides, and it's essential to realize the limitations of every feature, pattern, or approach you adopt in your architecture. When you decide to implement phantom tokens instead of using by-value tokens directly, your system again relies on the availability of the authorization server. The API gateway needs to be able to call the authorization server to introspect the opaque token. Of course, the traffic to the authorization server is much smaller than in a situation where only opaque tokens are used, where all services need to verify tokens themselves.
The API gateway can use a cache to further limit traffic to the authorization server, especially in systems where one client makes many requests with the same token. Using a cache has all the usual issues — the cache must be invalidated properly, and you might have to synchronize instances of the cache between clusters or data centers. Using a cache gets a bit more complicated if your system revokes tokens — the cache needs to be invalidated as fast as possible so that requests with a revoked token cannot reach the API.
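One way to keep a gateway cache from outliving a token is sketched below. It is a simplified, single-process illustration: `TokenCache` and its methods are hypothetical names, and `max_ttl` is an assumed safety cap that bounds how long a revoked token could linger if a revocation notice never arrives.

```python
from typing import Callable, Dict, Optional, Tuple


class TokenCache:
    """Maps an opaque token to (jwt, evict_at) and supports eviction."""

    def __init__(self, clock: Callable[[], float], max_ttl: float) -> None:
        self._clock = clock
        self._max_ttl = max_ttl
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, opaque_token: str, jwt: str, token_expires_at: float) -> None:
        # Never keep an entry longer than the token lives, and cap the
        # lifetime so that a missed revocation is bounded by max_ttl.
        evict_at = min(token_expires_at, self._clock() + self._max_ttl)
        self._entries[opaque_token] = (jwt, evict_at)

    def get(self, opaque_token: str) -> Optional[str]:
        entry = self._entries.get(opaque_token)
        if entry is None:
            return None
        jwt, evict_at = entry
        if self._clock() >= evict_at:
            del self._entries[opaque_token]
            return None
        return jwt

    def revoke(self, opaque_token: str) -> None:
        """Called when the authorization server reports a revocation."""
        self._entries.pop(opaque_token, None)


now = [100.0]  # mutable clock so the example can move time forward
cache = TokenCache(clock=lambda: now[0], max_ttl=60.0)

cache.put("opaque-abc", "header.payload.signature", token_expires_at=1000.0)
hit = cache.get("opaque-abc")
now[0] = 161.0
after_ttl = cache.get("opaque-abc")  # past the 60-second cap

cache.put("opaque-def", "h.p.s", token_expires_at=1000.0)
cache.revoke("opaque-def")
after_revoke = cache.get("opaque-def")
```

A distributed deployment would additionally need to propagate `revoke` to every cache instance, which is the synchronization cost described above.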
As the split token approach requires a distributed cache, the cache issues are innate to this approach. The cache must also be effectively populated during token creation so API requests are not dropped due to cache misses.
In either approach, you must remember that the API services, which receive the JWT access tokens, should still implement a zero-trust approach and validate the tokens according to the best security practices. This is especially important when using the split token approach to prevent cache poisoning attacks — where someone would inject a different header and payload for a given signature.
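The following sketch shows why validation in the services catches a poisoned cache entry. It uses HS256 (HMAC-SHA256) only because it can be done with the Python standard library; that choice, the key, and the claim values are assumptions for illustration, and production systems commonly use asymmetric signatures verified with a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """Issue an HS256 JWT (stands in for the authorization server)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def validate(jwt: str, key: bytes, now: float, audience: str) -> Optional[dict]:
    """Return the claims if the token passes every check, else None."""
    try:
        header_b64, payload_b64, signature_b64 = jwt.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:  # wrong segment count, bad base64, or bad JSON
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":  # pin the algorithm the service expects
        return None
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # header or payload does not match the signature
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now >= exp:
        return None
    if claims.get("aud") != audience:
        return None
    return claims


key = b"demo-key-not-for-production"
good = sign_hs256({"sub": "alice", "aud": "orders-api", "exp": 2000}, key)

# Simulate cache poisoning: a different payload glued to the real signature.
head, _payload, sig = good.split(".")
forged_payload = b64url_encode(
    json.dumps({"sub": "admin", "aud": "orders-api", "exp": 2000}).encode()
)
poisoned = f"{head}.{forged_payload}.{sig}"

ok = validate(good, key, now=1000, audience="orders-api")
rejected = validate(poisoned, key, now=1000, audience="orders-api")
```

Because the signature covers the exact header and payload bytes, the swapped payload fails the signature check and the service rejects the request, even though the gateway reassembled the token from its cache.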
To Sum Up
Remember that a proper access token design should be an essential part of the security architecture of your system. Developers tend to default to using JWTs and don't consider the consequences thoroughly. The rule of thumb here should be to use opaque tokens whenever tokens are sent outside your infrastructure. For the convenience of your services, the phantom token pattern should be used as the default option.
You should consider implementing the split token approach if you're using a distributed API gateway, when the latency is crucial (you want to avoid calling the authorization server during API requests), or when, for security reasons, you don't want the API gateway to have access to a complete JWT access token. The split token approach might seem very attractive, but of the solutions presented in this post, it is also the most complicated one to implement. As usual, you should thoroughly consider all the pros and cons of any of the possibilities.
If you want to try these patterns, have a look at our API gateway guides, where we show how to implement phantom and split tokens in some of the popular API gateways.