Introduction to the ELK stack

When you work with systems that consist of more than one server and include many different components that interact with other third-party ones, you also need a way to trace issues.

In order to trace issues you need to collect as much information as you can about the current state of that system, and potentially from other systems as well, and correlate it with the time of the incident.

There are already some tools out there, like New Relic, Papertrail, etc., that provide such functionality, but what happens when you are not allowed to share information with external third-party tools? Then you have to host your own logging cluster. One solution, the one we will talk about here, is the ELK stack.

==ELK== stands for Elasticsearch, Logstash, and Kibana; it is a suite that can be installed on-prem to collect, transform, and analyze your data (they offer a hosted solution as well).

We will talk about each component separately; in this document we will cover Elastic and what to keep in mind when you are thinking of adopting this stack.

Elastic

Elastic has many uses and many features, most of which we will not need for our scenario. What we need is a search engine that lets us discover logging patterns and, later on, make use of some of its reporting capabilities.

In the world of NoSQL databases Elastic is positioned as a document store (they have also added graph support, but we do not need it here), which means that you basically save JSON objects and can then query them in various ways to get results.

What sets it apart from other document stores is the search capability it offers. It allows you to search fields within your JSON objects and retrieve your data. It does that by keeping Lucene indexes, which are not the same as the indices we will see later on.

The Lucene indices in Elastic keep term-frequency information about our data, so that when you want to look up a word it answers quickly. Those indices are wrapped in Elastic's own terminology, so you do not have to know much about their internals unless you need to do something more sophisticated, in which case that knowledge is required.
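
To give a feel for this, a minimal sketch of such a lookup, using the standard _search endpoint with a match query (the index name logs, the field message, and the search term are made up for illustration), could look like this:

GET /logs/_search
{
  "query": {
    "match": {
      "message": "timeout"
    }
  }
}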

What you should know for our logging scenario is that Elastic has ==indices== (not the same as Lucene's) and ==types==.

What is an index

An index is a collection of shards that hold the Lucene indexes (yes, I know!). What is great about those shards is that by default Elastic takes care of synchronizing them and replicating them across different hosts to offer HA (high availability).
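
As a minimal sketch, creating an index with an explicit number of shards and replicas looks like this (the index name logs-2016.01 and the numbers chosen are only examples):

PUT /logs-2016.01
{
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  }
}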

An index also has a mapping. This mapping tells Elastic about the fields that the index holds. You are not required to define a mapping, but if you do you will see a significant increase in performance compared to the default, because Elastic will not have to guess data types, and you will not end up in a situation where the same field is mapped as a string in some places and as a long in others.

What is a type

An index can hold many types; you can think of types as tables in the RDBMS world.

So why is there a need for types, you might say, if I can just send documents to an index?

A very good observation, I would say, but here you have to know a bit more about the Elastic internals. If you define a type in an index, then when you insert documents into that type Elastic will keep those documents in the same or nearby shards, which is good for performance.
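
As an illustration, in Elasticsearch versions that support multiple types per index, indexing a document into a type is just a matter of naming the type in the URL; the index name, type name, document id, and fields below are made up:

PUT /logs-2016.01/http/1
{
  "status": 200,
  "path": "/index.html"
}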

Why is mapping important

When you index your data you are not required to specify a mapping, but doing so helps Elastic know what you are going to insert.

Even if you have defined a mapping, you can still add more fields as you go (unless you disable this option), and Elastic will try to guess the type of each new field and update its index mapping by itself for future reference.
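
The option mentioned above is the dynamic setting of the mapping. A sketch of disabling it, so that unknown fields are rejected instead of guessed, could look like the following (index and type names are assumptions; setting dynamic to false would silently ignore new fields instead of rejecting them):

PUT /logs-2016.01
{
  "mappings": {
    "http": {
      "dynamic": "strict",
      "properties": {
        "status": { "type": "integer" }
      }
    }
  }
}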

What is wrong with indexing without a mapping? Well, imagine the scenario where you want to save a document like the one below:

{
  "name": "Alex",
  "created_at": "2016-01-01T00:00:00"
}

The field created_at is apparently a date, so Elastic will know how to handle it: it will see that it is a string with a date pattern and index it as a date!

Yes, BUT, if a previous document was indexed with that field as a string, for example this one:

{
  "name": "Alex",
  "created_at": ""
}

Then all subsequent indexing will treat the field created_at as a string, making it impossible to run date functions on it.

If you map a field with a type, Elastic will try to convert incoming values to the desired type; if that fails it will raise an exception, so you can either check your code or validate your mapping again.
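
As a sketch of such a mapping (the index and type names are made up), created_at can be mapped as a date, and the counter field used in the next example as an integer:

PUT /people
{
  "mappings": {
    "person": {
      "properties": {
        "created_at": { "type": "date" },
        "counter": { "type": "integer" }
      }
    }
  }
}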

Example of converting a document.

{
  "name": "Alex",
  "counter": "1"
}

will not throw an exception when counter is mapped as an integer, because Elastic will do the equivalent of an Integer.parseInt() on the value.

Lessons learned from production

It’s very easy to misuse the freedom that Elastic provides. If you allow any field to be indexed without a mapping, you will face issues that you would otherwise never have thought of.

Examples of issues that I faced:

Different systems tend to use the same field in different contexts. One example: when indexing HTTP logs the status field is an integer (e.g. 200, 404, 301...), but in one system the user used the status field as a string in a different context (SUCCESS, FAIL, PENDING...), which is not wrong, just a different context.

Do not allow the freedom of just adding fields. I saw cases of misuse where people started adding fields without thinking through the data types: in their initial design they wanted to log and search a field as a string, and later on they switched to an integer or a long. Such a change is welcome, but it has to be done in a backward-compatible way, perhaps by altering the field name.

Use mapping templates for indexes, so that when you create a new index the mapping is applied by Elastic automatically; this also helps with race conditions around dynamically created index names.
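
A rough sketch of such a template (all names and patterns are illustrative; depending on the Elasticsearch version the pattern field is called template or index_patterns):

PUT /_template/logs_template
{
  "template": "logs-*",
  "settings": {
    "number_of_shards": 5,
    "number_of_replicas": 1
  },
  "mappings": {
    "http": {
      "properties": {
        "status": { "type": "integer" },
        "created_at": { "type": "date" }
      }
    }
  }
}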

Alexandros Sapranidis

Software engineer, keen on wearing many hats, currently Senior Software Engineer @ Elastic Cloud

Athens, Greece http://sapranidis.gr