Logging on kubernetes with fluentd and elasticsearch 6

TLDR

The solution I have used in the past for logging in kubernetes clusters is EFK (Elastic-Fluentd-Kibana). Alot of you have probably heard of ELK stack but I find that logstash is more heavyweight and does not provide the same output plugins as fluentd.

Elasticsearch

Elasticsearch

Elasticsearch 6 has just been announced with some major performance improvements.

You have a few options to deploy elasticsearch - elastic cloud or spin up your own ES cluster in kubernetes. But I have decided to go with the AWS managed service. It now has VPC support and the new release 6.0.

The reason being that administrating Elasticsearch can be a lot of work, as many people experienced with the system will tell you it can be tricky to keep running smoothly and that it’s a task better outsourced to an external service.

I recently read dubsmash story https://stackshare.io/dubsmash/dubsmash-scaling-to-200-million-users-with-3-engineers and how they focus on the product deliverables and not infrastructure. If you have a small team or not in the business of managing services such as ES, i find its best to outsource..

Following the 12 factor principles for logging https://12factor.net/logs all applications that are containerised (using docker) should log to STDOUT.

A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.

Whether your using winston for nodeJS or console appender for spring boot, this is the recommended way.

Every pod in a K8S cluster has its standard output and standard error captured and stored in the /var/log/containers/ node directory.

Now you need a logging agent ( or logging shipper) to ingest these logs and output to a target.

FluentD

Fluentd is an open-source framework for data collection that unifies the collection and consumption of data in a pluggable manner. There are several producer and consumer loggers for various kinds of applications. It has a huge suite of plugins to choose from.

Fluentd will be deployed as a daemonset on the kubernetes cluster.

Kubernetes logs the content of the stdout and stderr streams of a pod to a file. It creates one file for each container in a pod. The default location for these files is /var/log/containers . The filename contains the pod name, the namespace of the pod, the container name, and the container id. The file contains one JSON object per line of the two streams stdout and stderr.

A DaemonSet ensures that a certain pod is scheduled to each kubelet exactly once. The fluentd pod mounts the /var/lib/containers/ host volume to access the logs of all pods scheduled to that kubelet.

Whats missing - Geo

I found with the examples online for fluentd daesmonset there was none that supported geoip. Coreos offering and fluentd official kubernetes daemonset do not provide this feature.

Geoip is a very useful tool when inspecting access logs through kibana. I use nginx ingress controller in kubernetes and I wanted to see where incoming requests arose geographically.

Knowing from where in the world people are accessing your website is important not only for troubleshooting and operational intelligence but also for other use cases such as business intelligence

Ingress and Nginx

I use nginx as a reverse proxy behind AWS ELB to manage my routing. By default, nginx does not output to json. And instead of figuring out the fluentd nginx parser I decided to configure nginx to enable json logging.

Sample conf can be seen below:

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app: ingress-nginx
data:
  use-proxy-protocol: "false"
  log-format-escape-json: "true"
  log-format-upstream: '{"proxy_protocol_addr": "$proxy_protocol_addr","remote_addr": "$remote_addr", "proxy_add_x_forwarded_for": "$proxy_add_x_forwarded_for",
   "request_id": "$request_id","remote_user": "$remote_user", "time_local": "$time_local", "request" : "$request", "status": "$status", "vhost": "$host","body_bytes_sent": "$body_bytes_sent",
   "http_referer":  "$http_referer", "http_user_agent": "$http_user_agent", "request_length" : "$request_length", "request_time" : "$request_time",
    "proxy_upstream_name": "$proxy_upstream_name", "upstream_addr": "$upstream_addr",  "upstream_response_length": "$upstream_response_length",
    "upstream_response_time": "$upstream_response_time", "upstream_status": "$upstream_status"}'

Fluentd docker image

I then extended the fluentd debian elasticsearch docker image to install the geo-ip plugin and also update the max mind database.

Docker Image can be found here on docker hub. Tag is v0.12-debian-elasticsearch-geo

docker pull shanelee007/fluentd-kubernetes:v0.12-debian-elasticsearch-geo

Now I needed to amend the fluentd config to filter my nginx access logs and translate ip address to geo co-ordinates.

Sample config in the yaml file for daemonset with updated database is

  geoip-filter.conf: |
    <filter kube.ingress-nginx.nginx-ingress-controller>
        type geoip

        # Specify one or more geoip lookup field which has ip address (default: host)
        # in the case of accessing nested value, delimit keys by dot like 'host.ip'.
        geoip_lookup_key  remote_addr

        # Specify optional geoip database (using bundled GeoLiteCity databse by default)
        geoip_database    "/home/fluent/GeoLiteCity.dat"

        # Set adding field with placeholder (more than one settings are required.)
        <record>
          city            ${city["remote_addr"]}
          lat             ${latitude["remote_addr"]}
          lon             ${longitude["remote_addr"]}
          country_code3   ${country_code3["remote_addr"]}
          country         ${country_code["remote_addr"]}
          country_name    ${country_name["remote_addr"]}
          dma             ${dma_code["remote_addr"]}
          area            ${area_code["remote_addr"]}
          region          ${region["remote_addr"]}
          geoip           '{"location":[${longitude["remote_addr"]},${latitude["remote_addr"]}]}'
        </record>

        # To avoid get stacktrace error with `[null, null]` array for elasticsearch.
        skip_adding_null_record  true

        # Set log_level for fluentd-v0.10.43 or earlier (default: warn)
        log_level         info

        # Set buffering time (default: 0s)
        flush_interval    1s
    </filter>

I also updated the elasticsearch template to version 6 as there was issues with version 5.

Now to the fun part... testing it out!

Try it yourself

This tutorial can be executed in less than 15 minutes, as log as you already have:

  • Kubernetes cluster up

  • Nginx ingress installed

Installing

Github project for creating all resources in kubernetes can be found at https://github.com/shavo007/k8s-ingress-letsencrypt/tree/master

To create the namespace and manifests for logging, the only change you need is to update the elasticsearch endpoint in the configmap.

File is located at https://github.com/shavo007/k8s-ingress-letsencrypt/blob/master/resources/logging/fluentd-configmap.yaml#L291

Then you can create the resources.

Kubectl aliases

For the commands below I am using bash aliases. Aliases save me alot of time when I am querying a cluster. You can find the github project here

ka resources/logging --record

Verify the pods are created

kgpon logging

If you have the dashboard installed you can inspect the logs there or you can tail the logs from the command line using kubetail

kubetail fluentd -n logging

Kibana

Kibana

Now access kibana - GUI for elasticsearch

You should now see your logs ingested in elasticsearch.

Tile map visualization

There is a limitation with managed service and tile map view. I customized the settings to use WMS compliant map server. See below:

Once you pick the pattern, and assuming your geopoints are correctly mapped, Kibana will automatically populate the visualisation settings such as which field to aggregate on, and display the map almost instantly.

I have defined a visualisation for heatmap. This helps in visualizing geospatial data. You can go to management and import the json file from here https://github.com/shavo007/k8s-ingress-letsencrypt/blob/master/resources/assets/export.json

This includes multiple visualizations and a dashboard.

Inaccurate Geolocation

You may find the IP is matched to an inaccurate location. Be aware that the free Maxmind database that is used is “comparable to, but less accurate than, MaxMind’s GeoIP2 databases”, and, “IP geolocation is inherently imprecise. Locations are often near the center of the population.” See the MaxMind site for further details.

Widgets

You can build up more widgets, such as url count or count by country.

Cleanup of log indices

For cleanup of old log indices there is a tool called curator. I found a serverless option https://github.com/cloudreach/aws-lambda-es-cleanup and created the lambda function via terraform.

You can easily schedule the lambda function to cleanup log indices greater than x no. of days.

And thats it! Stay tuned for more posts on kubernetes.