/images/resources/tutorials/configuration/tutorials-elasticsearch.png

Log Aggregation to Elasticsearch

On this page

This tutorial will show one possible way to implement Logging Best Practices for the Curity Identity Server, by aggregating technical support logs to the popular open source Elasticsearch system. The result will be a productive setup where logs are immediately available to query whenever you need to troubleshoot.

The following Elastic Stack components will be deployed to a demo level to a local cluster.

ComponentOverview of Behavior
ElasticsearchThe main system, which receives logs via API endpoints and stores the data
KibanaA UI where authorized users can query Elasticsearch and query the cluster wide log data
FilebeatA log shipping component deployed inside the cluster, to upload log files to Elasticsearch

The deployment enables users to make field by field queries from Kibana against aggregated logs:

1234
POST _sql?format=txt
{
"query": "select hostname, contextMap.RequestId, http.uri, http.status, http.duration from \"curityrequest*\" order by timestamp desc limit 100"
}

Results will then be returned for all Curity Identity Server pods, which you can use to investigate errors or slowness:

Cluster Wide Query

GitHub Repository

Elastic Stack helper resources can be cloned from GitHub, using the link at the top of this page. The resources contain preconfigured Kubernetes YAML resources and some helper scripts for quickly deploying a working end-to-end setup:

Helper Resources

Deploy the Curity Identity Server

Start with a local Kubernetes deployment of the Curity Identity Server. If you do not have one, use the Curity Identity Server deployment from the Kubernetes Training repository. You can also run the Curity Token Handler deployment to enable an end-to-end OAuth flow with a Single Page Application and API, to generate some logs.

Before running your deployment, locate the Helm values.yaml file and edit the logging properties for runtime nodes, setting stdout=true. Also activate tailing of request logs, as shown in the last three lines below:

yaml
123456789
curity:
runtime:
...
logging:
level: INFO
image: "busybox:latest"
stdout: true
logs:
- request

Then locate your log4j2.xml file and update the following two appenders, replacing the default PatternLayout with the following JSONLayout values. This includes two extra fields that will be useful in Kibana queries later:

xml
12345678910111213141516
<Appenders>
<Console name="stdout" target="SYSTEM_OUT">
<JSONLayout compact="true" eventEol="true" properties="true" stacktraceAsString="true">
<KeyValuePair key="hostname" value="${env:HOSTNAME}" />
<KeyValuePair key="timestamp" value="$${date:yyyy-MM-dd'T'HH:mm:ss.SSSZ}" />
</JSONLayout>
...
</Console>
<RollingFile name="request-log" fileName="${env:IDSVR_HOME}/var/log/request.log" filePattern="${env:IDSVR_HOME}/var/log/request.log.%i.gz">
<JSONLayout compact="true" eventEol="true" properties="true" stacktraceAsString="true">
<KeyValuePair key="hostname" value="${env:HOSTNAME}" />
<KeyValuePair key="timestamp" value="$${date:yyyy-MM-dd'T'HH:mm:ss.SSSZ}" />
</JSONLayout>
...
</RollingFile>
</Appenders>

Deploy Elastic Components

Run the following script, which exposes an external Elasticsearch URL at http://api.elastic.local and an external Kibana URL at http://elastic.local.

bash
1
./1-deploy-elastic-stack.sh

The elastic/service.yaml file configures xpack security in a basic way, which will ensure that a user called elastic must provide a password in order to access logs:

yaml
123456789101112131415
spec:
containers:
- name: elastic
image: docker.elastic.co/elasticsearch/elasticsearch:latest
env:
- name: discovery.type
value: 'single-node'
- name: xpack.security.enabled
value: 'true'
- name: xpack.security.http.ssl.enabled
value: 'false'
- name: xpack.security.authc.api_key.enabled
value: 'false'
- name: ELASTIC_PASSWORD
value: 'Password1'

The kibana/service.yaml file points to the Elasticsearch API inside the cluster:

yaml
123456789101112131415
spec:
containers:
- name: kibana
image: docker.elastic.co/kibana/kibana:latest
env:
- name: ELASTICSEARCH_HOSTS
value: 'http://elastic-svc:9200'
- name: ELASTICSEARCH_USERNAME
value: 'kibana_system'
- name: ELASTICSEARCH_PASSWORD
value: 'Password1'
- name: SERVER_SSL_ENABLED
value: 'false'
- name: SERVER_PUBLICBASEURL
value: http://elastic.local

Once deployment is complete, get the external IP address of the API gateway with a command similar to the following:

bash
1
kubectl -n apigateway get svc

Then make the domains resolvable by adding them to the /etc/hosts file:

text
1
172.18.0.5 api.elastic.local elastic.local

Then sign in to Kibana at http://elastic.local with credentials elastic / Password1:

Helper Resources

You can also connect to the REST API from the command line:

bash
1
curl -u 'elastic:Password1' http://api.elastic.local

For a production Elasticsearch setup it would then be straightforward to set up other security features, including TLS, authorized users and groups, and API keys for using the REST API. See the Elastic Security Docs for further details.

Configure Log Shipping

To complete the setup, run the following script, which will configure the Filebeat log shipper and will also set up Elasticsearch to receive logs correctly:

bash
1
./2-deploy-log-shipping.sh

The cluster will then contain the following pods in total, and note that Filebeat is deployed as a DaemonSet, so that in a production setup there would be one instance per node:

Kubernetes Resources

The filebeat-kubernetes.yaml file is configured generically, to tell Filebeat where to get logs and how to connect to Elasticsearch. This component then does the job of tracking new logging events in files, then sending rows of text in batches to Elasticsearch.

yaml
1234567891011121314151617181920212223
filebeat.inputs:
- type: container
paths:
- /var/log/containers/curity-idsvr-runtime*.log
json:
keys_under_root: true
add_error_key: false
output.elasticsearch:
hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
username: ${ELASTICSEARCH_USERNAME}
password: ${ELASTICSEARCH_PASSWORD}
index: "curitysystem-%{+yyyy.MM.dd}"
pipelines:
- pipeline: curity-ingest-pipeline
setup.ilm.enabled: false
setup.template.name: curitysystem
setup.template.pattern: curitysystem-*
processors:
- drop_fields:
fields: ['agent', 'ecs', 'host', 'input', 'version']

Initial Queries

At this point data will start feeding into Elasticsearch, and you can start querying cluster wide logs by querying one of these indexes:

Index NameType of Data
curitysystem-[date]The default index, used for system logs
curityrequest-[date]The schema containing request data
JSON Document

Each logging event is stored as a type-safe JSON document and there are multiple ways to search data, including Lucene Queries that return entire documents or SQL Queries that return only the fields of interest. The example deployment creates index templates before data is received, including Explicit Mappings, to ensure that fields are received with the correct data types, so that integers and dates sort and filter correctly.

Data Ingestion

When logging events are received, it is usual to need to perform custom actions, and in Elasticsearch this is done using an Ingest Pipeline, which can run one or more Processors. The following text is from a ScriptProcessor that uses the built-in Painless Scripting Language to dynamically set the index based on the log4j2 logger name received with each logging event:

text
123
String loggerName = ctx.loggerName;
if (loggerName.contains('RequestReceiver')) {
ctx['_index'] = ctx['_index'].replace('curitysystem', 'curityrequest');

During development, the ingest pipeline logic can be tested by the Simulate Pipeline API and this was how the example processors were tested. The example setup ensures both that data is received to the correct places and that fields generated by logging components are removed.

Final Log Data

The system logs contain the following fields, with a clean data structure that is easy to query. It can be useful to filter on level=WARN or level=ERROR when troubleshooting, and this table is where exception stack traces are found, in the event of infrastructure problems:

json
12345678910111213141516
{
"loggerFqcn" : "org.apache.logging.slf4j.Log4jLogger",
"level" : "WARN",
"thread" : "req-109",
"message" : "Issuing unsafe web CAT because client configuration is set to skip attestation validation!",
"threadPriority" : 5,
"threadId" : 12,
"hostname" : "curity-idsvr-runtime-7454859bbc-9r5js",
"loggerName" : "se.curity.identityserver.controllers.cat.UnsafeCatManager",
"contextMap" : {
"RequestId" : "RwwqCLwS",
"SessionId" : "3306c631",
"LongSessionId" : "3306c631-9120-4b60-b6e6-eb571bbe63f5"
},
"timestamp" : "2021-11-11T16:40:24.257+0000"
}

The request logs include additional supporting details such as status codes and request durations, and filtering on these fields can also be useful when troubleshooting failures or slowness:

json
1234567891011121314151617181920212223242526
{
"loggerFqcn" : "org.apache.logging.log4j.spi.AbstractLogger",
"level" : "INFO",
"thread" : "req-106",
"message" : "",
"threadPriority" : 5,
"threadId" : 12,
"hostname" : "curity-idsvr-runtime-7454859bbc-9r5js",
"http" : {
"duration" : "403",
"protocol" : "HTTP/1.1",
"method" : "GET",
"size" : "383",
"content-type" : "application/vnd.auth+json",
"params" : "",
"secure" : "true",
"uri" : "/oauth/v2/oauth-authorize",
"accept" : "application/vnd.auth+json",
"status" : "200"
},
"contextMap" : {
"RequestId" : "Lo0wiUp3"
},
"loggerName" : "se.curity.identityserver.app.RequestReceiver",
"timestamp" : "2021-11-11T16:40:25.032+0000"
}

Live Log Analysis

Once the data is in a clean format you can navigate within Kibana to Data Views, then configure the range of data to analyze:

Kibana Data Views

Many analysis features can then be used, and the Discover feature is very useful for near real time log visualization. Items can then be quickly filtered, for example to find all entries matching the Request ID of a slow or failed request:

Kibana Live Analysis

Detailed Queries

The Kibana Dev Tools remain useful for more complex queries, where you have full control over the commands sent to Elasticsearch, as in the following examples:

json
12345678910111213141516171819
POST curitysystem*/_search
{
"query": {
"bool": {
"filter": [
{
"match": {"level": "ERROR"}
},
{
"range": {"timestamp": {"gte": "2021-11-11"}}
}
]
}
},
"sort": [
{"timestamp": "desc"}
],
"from" : 0, "size" : 5
}
sql
1234
POST _sql?format=txt
{
"query": "select contextMap.RequestId, contextMap.SessionId, http.method, http.uri, http.status, http.duration from \"curityrequest*\" order by http.duration desc limit 50"
}

Conclusion

This tutorial showed one way to use the Curity Identity Server's extensible logging system and manage logs for your cluster, though the Elastic Stack was only deployed to a demo level. In a real system, this type of setup would provide modern troubleshooting capabilities for technical support staff, to help ensure fast problem resolution and reliability over time.

Join our Newsletter

Get the latest on identity management, API Security and authentication straight to your inbox.

Start Free Trial

Try the Curity Identity Server for Free. Get up and running in 10 minutes.

Start Free Trial