Log to Elasticsearch

Log to Elasticsearch

Intro

This tutorial will show one possible way to implement Logging Best Practices for the Curity Identity Server, by aggregating technical support logs to the popular open source Elasticsearch system. The result will be a productive setup where logs are immediately available to query whenever you need to troubleshoot.

Components

The following Elastic Stack components will be deployed to a demo level for a development computer:

ComponentOverview of Behavior
ElasticsearchThe main system, which receives logs via API endpoints and stores the data
KibanaA UI where authorized users can query Elasticsearch and query the cluster wide log data
FilebeatA log shipping component deployed inside the cluster, to upload log files to Elasticsearch

The components will be deployed within a Kubernetes minikube setup, to best demonstrate how to collect logs from multiple instances of the Curity Identity Server. Other log deployment setups are also possible, such as shipping logs outside the cluster to a managed cloud service such as Elastic Cloud.

Usage

To simplify infrastructure the following plain HTTP URLs will be used, after which you will be able to log in to Kibana as user who is authorized to view logs:

ComponentURL
Kibanahttp://elastic.local
Elasticsearchhttp://api.elastic.local

This will enable field by field queries against the denormalized log data:

POST _sql?format=txt
{
"query": "select hostname, contextMap.RequestId, http.uri, http.status, http.duration from \"curityrequest*\" order by timestamp desc limit 100"
}

Results will then be returned for the entire cluster, and you can work with this denormalized data however you like, if you need to investigate errors or slowness:

Cluster Wide Query

GitHub Repository

Some Elastic Stack helper resources are available, which you can download by cloning the below GitHub repository:

git clone https://github.com/curityio/kubernetes-logaggregation-elasticsearch

This contains preconfigured Kubernetes YAML resources and some helper scripts for quickly deploying the system:

Helper Resources

Deploy the Base System

To deploy the Curity Identity Server, follow the Kubernetes Demo Installation. Before starting the install there are three customization steps, starting with updating the create-cluster.sh to allocate sufficient Kubernetes resources:

minikube start --cpus=4 --memory=16384 --disk-size=50g --driver=hyperkit --profile curity

Next replace the demo installation’s log4j2.xml file with that from the above GitHub resources. This file uses log4j2 layouts configured for log shipping, with a single line of JSON per logging event:

<JSONLayout compact="true" eventEol="true" properties="true" stacktraceAsString="true">
    <KeyValuePair key="hostname" value="${env:HOSTNAME}" />
    <KeyValuePair key="timestamp" value="$${date:yyyy-MM-dd'T'HH:mm:ss.SSSZ}" />
</JSONLayout>

Finally edit the helm-values.yml file from the demo installation to set stdout=true for runtime nodes, and to activate tailing of request logs, as shown in the last three lines below. This ensures that request log files are written to the minikube virtual machine’s logging file share.

runtime:
    role: default
    service:
      type: ClusterIP
      port: 8443
    livenessProbe:
      timeoutSeconds: 1
      failureThreshold: 3
      periodSeconds: 10
      initialDelaySeconds: 120
    readinessProbe:
      timeoutSeconds: 1
      failureThreshold: 3
      successThreshold: 3
      periodSeconds: 10
      initialDelaySeconds: 120
    logging:
      image: "busybox:latest"
      level: INFO
      stdout: true
      logs:
      - request

Generate Logs via an Application

This tutorial will use the Hypermedia Authentication API (HAAPI) code example that is shipped with the base system, which will be available with these connection details:

FieldValue
Sample URLhttps://login.curity.local/demo-client.html
User Namejohn.doe
PasswordPassword1

Working Sample

Signing into the app will generate some logs that can be found by first remoting to the minikube virtual machine:

minikube ssh --profile curity

The JSON log files can then be found via the following bash command, and they are ready for sending to Elasticsearch:

ls /var/log/containers/curity-idsvr-runtime*

Node Logs

Deploy Elastic Components

Run the following command to get the IP address of the minikube virtual machine:

minikube ip --profile curity

Then update your hosts file, to add the two Elastic domains to those of the base installation, all mapped to the minikube virtual machine’s IP address:

192.168.64.3   login.curity.local admin.curity.local api.elastic.local elastic.local

Then run the deployment script, which will take a few minutes the first time, since it will download the large Docker images for Elasticsearch and Kibana:

./1-deploy-elastic-stack.sh

The elastic/service.yaml file configures xpack security in a basic way, which will ensure that a user called elastic must provide a password in order to access logs:

spec:
  containers:
  - name: elastic
    image: docker.elastic.co/elasticsearch/elasticsearch:7.15.0
    env:
    - name: discovery.type
      value: 'single-node'
    - name: xpack.security.enabled
      value: 'true'
    - name: xpack.security.http.ssl.enabled
      value: 'false'
    - name: xpack.security.authc.api_key.enabled
      value: 'false'
    - name: ELASTIC_PASSWORD
      value: 'Password1'

The kibana/service.yaml file points to the Elasticsearch API inside the cluster:

spec:
  containers:
  - name: kibana
    image: docker.elastic.co/kibana/kibana:7.15.0
    env:
    - name: ELASTICSEARCH_HOSTS
      value: 'http://elastic-svc:9200'
    - name: ELASTICSEARCH_USERNAME
      value: 'elastic'
    - name: ELASTICSEARCH_PASSWORD
      value: 'Password1'
    - name: SERVER_SSL_ENABLED
      value: 'false'
    - name: SERVER_PUBLICBASEURL
      value: http://elastic.local

Once deployment has completed you can then sign in to Kibana at http://elastic.local with credentials elastic / Password1:

Helper Resources

You can also connect to the REST API from the command line:

curl -u 'elastic:Password1' http://api.elastic.local

For a production Elasticsearch setup it would then be straightforward to set up other security features, including TLS, authorized users and groups, and API keys for using the REST API. See the Elastic Security Docs for further details.

Configure Log Shipping

To complete the setup, run the following script, which will configure the Filebeat log shipper and will also set up Elasticsearch to receive logs correctly:

./2-deploy-log-shipping.sh

The cluster will then contain the following pods in total, and note that Filebeat is deployed as a DaemonSet, so that in a production setup there would be one instance per node:

Kubernetes Resources

The Filebeat-kubernetes.yaml file is configured generically, to tell Filebeat where to get logs and how to connect to Elasticsearch. This component then does the job of tracking new logging events in files, then sending rows of text in batches to Elasticsearch.

filebeat.inputs:
- type: container
  paths:
    - /var/log/containers/curity-idsvr-runtime*.log
  json:
    keys_under_root: true
    add_error_key: false

output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
  index: "curitysystem-%{+yyyy.MM.dd}"
  pipelines:
  - pipeline: curity-ingest-pipeline

setup.ilm.enabled: false
setup.template.name: curitysystem
setup.template.pattern: curitysystem-*

processors:
- drop_fields:
    fields: ['agent', 'ecs', 'host', 'input', 'version']

Initial Queries

At this point data will start feeding into Elasticsearch, and you can start querying cluster wide logs by browsing to the Kibana Dev Tools and querying one of these indexes:

Index NameType of Data
curitysystem-[date]The default index, used for system logs
curityrequest-[date]The schema containing request data

JSON Document

Each logging event is stored as a type-safe JSON document and there are multiple ways to search data, including Lucene Queries that return entire documents or SQL Queries that return only the fields of interest.

The example setup defines index templates before data is received, including Explicit Mappings, to ensure that fields are received with the correct data types, so that integers and dates sort and filter correctly.

Data Ingestion

When logging events are received, it is usual to need to perform custom actions, and in Elasticsearch this is done using an Ingest Pipeline, which can run one or more Processors. The following text is from a ScriptProcessor that uses the built-in Painless Scripting Language to dynamically set the index based on the log4j2 logger name received with each logging event:

String loggerName = ctx.loggerName;
if (loggerName.contains('RequestReceiver')) {
  ctx['_index'] = ctx['_index'].replace('curitysystem', 'curityrequest');

During development, the ingest pipeline logic can be tested by the Simulate Pipeline API and this was how the example processors were tested. The example setup ensures both that data is received to the correct places and that fields generated by logging components are removed.

Final Log Data

The system logs contain the following fields, with a clean data structure that is easy to query. It can be useful to filter on level=WARN or level=ERROR when troubleshooting, and this table is where exception stack traces are found, in the event of infrastructure problems:

{
    "loggerFqcn" : "org.apache.logging.slf4j.Log4jLogger",
    "level" : "WARN",
    "thread" : "req-109",
    "message" : "Issuing unsafe web CAT because client configuration is set to skip attestation validation!",
    "threadPriority" : 5,
    "threadId" : 12,
    "hostname" : "curity-idsvr-runtime-7454859bbc-9r5js",
    "loggerName" : "se.curity.identityserver.controllers.cat.UnsafeCatManager",
    "contextMap" : {
    "RequestId" : "RwwqCLwS",
    "SessionId" : "3306c631",
    "LongSessionId" : "3306c631-9120-4b60-b6e6-eb571bbe63f5"
    },
    "timestamp" : "2021-11-11T16:40:24.257+0000"
}

The request logs include additional supporting details such as status codes and request durations, and filtering on these fields can also be useful when troubleshooting failures or slowness:

{
    "loggerFqcn" : "org.apache.logging.log4j.spi.AbstractLogger",
    "level" : "INFO",
    "thread" : "req-106",
    "message" : "",
    "threadPriority" : 5,
    "threadId" : 12,
    "hostname" : "curity-idsvr-runtime-7454859bbc-9r5js",
    "http" : {
    "duration" : "403",
    "protocol" : "HTTP/1.1",
    "method" : "GET",
    "size" : "383",
    "content-type" : "application/vnd.auth+json",
    "params" : "",
    "secure" : "true",
    "uri" : "/oauth/v2/oauth-authorize",
    "accept" : "application/vnd.auth+json",
    "status" : "200"
    },
    "contextMap" : {
    "RequestId" : "Lo0wiUp3"
    },
    "loggerName" : "se.curity.identityserver.app.RequestReceiver",
    "timestamp" : "2021-11-11T16:40:25.032+0000"
}

Live Log Analysis

Once the data is in a clean format you can navigate within Kibana to Index Patterns, then configure the range of data to analyze:

Kibana Index Patterns

Many analysis features can then be used, and the Discover feature is very useful for near real time log visualization. Items can then be quickly filtered, for example to find all entries matching the Request ID of a slow or failed request:

Kibana Live Analysis

Detailed Queries

The Kibana Dev Tools remain useful for more complex queries, where you have full control over the commands sent to Elasticsearch, as in the following examples:

POST curitysystem*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {"level": "ERROR"}
        },
        {
          "range": {"timestamp": {"gte": "2021-11-11"}}
        }
      ]
    }
  },
  "sort": [
    {"timestamp": "desc"}
   ],
  "from" : 0, "size" : 5
}
POST _sql?format=txt
{
"query": "select contextMap.RequestId, contextMap.SessionId, http.method, http.uri, http.status, http.duration from \"curityrequest*\" order by http.duration desc limit 50"
}

Conclusion

This tutorial showed one way to use the Curity Identity Server’s extensible logging system and manage logs for your cluster, though the Elastic Stack was only deployed to a demo level. In a real system, this type of setup would provide modern troubleshooting capabilities for technical support staff, to help ensure fast problem resolution and reliability over time.