Data Privacy and GDPR Best Practices
Privacy and its enforcement via GDPR and other legislation can be a source of problems if the identity architecture isn’t set up correctly. In this article, we’ll go through a few of the best practices to follow when applying a Neo-Security Architecture to a privacy problem.
Centralized Identities
By centralizing the issuance of identity information in the Identity Management System (IMS), your organization is already on track for a privacy-centric system. All identity-related data should be considered a privacy concern until deemed otherwise. This means that when a user authenticates, all logs, events, audits, and database access need to be understood and handled correctly. If this is spread out across too many system domains, it quickly becomes unmanageable. The IMS addresses this issue by encapsulating identity handling in three distinct services under tight control: the Authentication Service, the Token Service, and the User Management Service.
Not all of the data that the Authentication Service has at hand while authenticating a user needs to be communicated to the end system. The result of authentication is a canonicalized set of attributes describing the authenticated identity. The only recipient of this data is the Token Service (or a federation service in some cases), which significantly narrows the potential for information leaks.
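As a rough illustration of this canonicalization step, the sketch below reduces whatever an authenticator collected to a small, fixed set of attributes. The field names (`account_id`, `method`, `authenticated_at`) and the `AuthenticationResult` type are hypothetical and chosen only for this example; they do not describe any particular product's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthenticationResult:
    """Canonical attributes handed to the Token Service (hypothetical shape)."""
    subject: str      # internal account identifier, not a user name
    auth_method: str  # how the user authenticated
    auth_time: int    # Unix timestamp of the authentication


def canonicalize(raw: dict) -> AuthenticationResult:
    """Keep only the canonical attributes; everything else in `raw` is dropped."""
    return AuthenticationResult(
        subject=raw["account_id"],
        auth_method=raw["method"],
        auth_time=raw["authenticated_at"],
    )
```

Anything the authenticator saw beyond these fields, such as an IP address or an email address typed into a login form, never leaves the Authentication Service in this sketch.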
Token Issuance and Claims
A token contains information about the identity it was issued for. These are claims that should be carefully crafted to ensure that the tokens don’t contain more claims than the API needs to fulfill the request.
Some simple guidelines should be followed when designing tokens:
- Use a global user identifier such as a GUID instead of real user names
- Group claims with scopes and only assign sensitive scopes to trusted clients and APIs
- Only add claims to the token that are needed for the scope of access for which it is issued
- Use a privacy-safe token strategy
Using Global Identifiers
In many cases, it’s possible to avoid personally identifiable information (PII) altogether in the tokens by anonymizing the subject and carefully adding claims. Doing so reduces the number of places where privacy-related data has to be managed.
Some tokens, such as an OpenID Connect ID token, are issued directly to the client; this could be a website or a mobile application. In these cases, there are many standard claims that may be present and contain PII. Always consider whether these are necessary to include. More often than not, they can be removed.
OpenID Connect provides a standardized way to pseudonymize the user toward the client, called a Pairwise Pseudonymous Identifier (PPID). A PPID hides the real user ID from the client and gives each client its own identifier for the same user. The benefit is that not only is the real subject hidden, but two clients also cannot compare subject identifiers to determine whether the same user is visiting both of them, provided the tokens carry no other correlating claims such as an email address.
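One way to derive such an identifier is to run a keyed hash over the client's sector identifier and the internal user ID. The sketch below assumes HMAC-SHA-256 with a secret known only to the identity provider; the function name, the delimiter, and the parameter names are illustrative assumptions, and a real provider may compute its pairwise subjects differently.

```python
import hashlib
import hmac


def pairwise_subject(sector_identifier: str, local_user_id: str, secret: bytes) -> str:
    """Derive a per-client pseudonym for a user (illustrative sketch).

    The same user gets a stable value for a given sector identifier,
    and an unrelated value for every other sector identifier.
    """
    message = f"{sector_identifier}|{local_user_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

Because the secret never leaves the identity provider, a client cannot reverse the value to the internal user ID or recompute the value that another client received.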
Scopes and Claims
A scope in OAuth defines the extent of the client’s access to the APIs. Some access is usually more sensitive than others. By grouping the claims in a token under scopes, it’s easy to ensure that certain claims are only issued when a particularly sensitive scope is approved. This reduces the number of APIs that are exposed to sensitive data.
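The grouping can be sketched as a simple lookup from scope to claim names. The scope names, claim names, and the `client_is_trusted` flag below are hypothetical and exist only to illustrate the idea; an actual Token Service would take these from its own configuration.

```python
# Hypothetical mapping of scopes to the claims they unlock.
SCOPE_CLAIMS = {
    "openid": {"sub"},
    "profile": {"name", "locale"},
    "payments": {"account_id", "credit_limit"},
}

# Scopes that should only ever be honored for trusted clients (assumption).
SENSITIVE_SCOPES = {"payments"}


def claims_for(approved_scopes, client_is_trusted, user_attributes):
    """Return only the claims unlocked by the approved scopes."""
    allowed = set()
    for scope in approved_scopes:
        if scope in SENSITIVE_SCOPES and not client_is_trusted:
            continue  # sensitive claims are withheld from untrusted clients
        allowed |= SCOPE_CLAIMS.get(scope, set())
    return {name: value for name, value in user_attributes.items() if name in allowed}
```

With this shape, a claim that is not tied to any approved scope can never end up in a token by accident, because the default is to issue nothing.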
Privacy Safe Token Strategies
Another important aspect is to use a proper token strategy. At Curity, we have invented two essential strategies: the Phantom Token and the Split Token approach. Both approaches use an opaque token on the Internet and translate it to a JSON Web Token (JWT) once the token passes the internal network barrier. This prevents unintentional leaks of information to the Internet, since an opaque token carries no claims that a client could decode.
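The exchange of an opaque token for a JWT at the network edge can be sketched roughly as follows. This is a simplified, in-memory illustration and not Curity's implementation: the `TokenService` class, the `gateway_forward` function, and the use of an HS256 signature (picked only so the example needs nothing beyond the standard library) are all assumptions. A production setup would typically use an introspection call over HTTPS and asymmetric signing.

```python
import base64
import hashlib
import hmac
import json
import secrets
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


class TokenService:
    """Hypothetical token service that issues opaque tokens and resolves them."""

    def __init__(self, signing_key: bytes):
        self._signing_key = signing_key
        self._issued = {}  # opaque token -> claims, kept server-side only

    def issue(self, claims: dict) -> str:
        """Hand the client a random reference that carries no information."""
        opaque = secrets.token_urlsafe(32)
        self._issued[opaque] = claims
        return opaque

    def introspect(self, opaque: str) -> Optional[str]:
        """Return a signed JWT for a known opaque token, else None."""
        claims = self._issued.get(opaque)
        if claims is None:
            return None
        header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        payload = _b64url(json.dumps(claims).encode())
        signing_input = f"{header}.{payload}".encode("ascii")
        digest = hmac.new(self._signing_key, signing_input, hashlib.sha256).digest()
        return f"{header}.{payload}.{_b64url(digest)}"


def gateway_forward(token_service: TokenService, authorization: str) -> Optional[str]:
    """At the gateway, swap the opaque token for a JWT before calling the API."""
    scheme, _, opaque = authorization.partition(" ")
    if scheme.lower() != "bearer" or not opaque:
        return None
    jwt = token_service.introspect(opaque)
    return None if jwt is None else f"Bearer {jwt}"
```

The client only ever holds the random reference, so the claims are visible solely to the gateway and the APIs behind it.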