logstash and kibana for adhoc log analysis

In the past I’ve pulled various logs into elasticsearch to search quickly through large log files and to get a clean timeline across multiple servers. I had planned on building this into a tool for more general consumption.

As it turns out there are already some tools out there that let me do this pretty easily: logstash and kibana.

I realize that these tools make it pretty easy to build out a complete logging environment, and together they could provide a very nice (and free) alternative to splunk. Sometimes, though, I just need to play with specific logs, and logstash and kibana fill that need nicely.

So here is what I did …

First, I already run elasticsearch on my box. It’s very easy to install, especially on Ubuntu, for which there is a DEB file. Simply download it from http://www.elasticsearch.org/download/.

wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.6.deb
sudo dpkg -i elasticsearch-0.20.6.deb
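
A quick sanity check that elasticsearch is actually running: hitting its HTTP port (9200 by default) with curl should return a small JSON blob with the node name and version.

curl http://localhost:9200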

Next, get logstash. There are two different builds of the tool: flatjar and monolithic. The flatjar presumably runs faster, but I had issues with it, so I stuck with the monolithic version.

mkdir logstash
cd logstash
wget https://logstash.objects.dreamhost.com/release/logstash-1.1.10-monolithic.jar

Then you need to create a configuration file to ingest your logs and slice them up. Since this setup reads from STDIN, you need to make sure the date is set correctly, and you will likely have to change the grok filter to match your logs. In my case I was working with MarkLogic error logs, which I pulled over ssh. As part of that I prefixed each line with the hostname, like so: `ssh server-name cat /var/opt/MarkLogic/Logs/ErrorLog.txt | sed "s/^/server-name /"`, which placed the server-name at the beginning of each line followed by a space. This can also be run in a for loop to grab the logs from multiple servers (see the full command below); the grok filter then uses that prefix to set the source_host. I used the following configuration in a config file called logstash-stdin.conf:

input {
  # read raw log lines from STDIN; every event gets tagged with type "marklogic"
  stdin {
    type => "marklogic"
  }
}
filter {
  # split each line into hostname, timestamp, log level and message
  grok {
    type => "marklogic"
    pattern => "%{HOST:ml_hostname} %{TIMESTAMP_ISO8601:ml_date} %{LOGLEVEL:ml_status}: %{GREEDYDATA:ml_msg}"
  }
  # use the parsed timestamp as the event's @timestamp
  date {
    type => "marklogic"
    match => [ "ml_date", "YYYY-MM-dd HH:mm:ss.SSS" ]
  }
  # rewrite the standard logstash fields from the parsed pieces
  mutate {
    type => "marklogic"
    replace => [ "@source_host", "%{ml_hostname}" ]
    replace => [ "@message", "%{ml_status}: %{ml_msg}" ]
    add_field => [ "@severity", "%{ml_status}" ]
    replace => [ "@type", "marklogic" ]
  }
}
output {
  # echo each event as JSON for debugging and ship it to elasticsearch over HTTP
  stdout { debug => true debug_format => "json" }
  elasticsearch_http {
    host => "127.0.0.1"
    flush_size => 1
  }
}
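
To make the grok pattern concrete: given a made-up MarkLogic error log line (after the sed prefixing), the pattern should split it up as follows.

server1 2013-04-01 12:34:56.789 Info: Merging 2 stands to /var/opt/MarkLogic/Forests/Documents

ml_hostname => server1
ml_date     => 2013-04-01 12:34:56.789
ml_status   => Info
ml_msg      => Merging 2 stands to /var/opt/MarkLogic/Forests/Documents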

The pattern options are interesting, and logstash offers a lot of patterns that are easily usable. For example, most timestamps are easily parsed, in this case via the TIMESTAMP_ISO8601 pattern. Take a look at the grok-patterns file for what’s available.
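
If you want to experiment with a pattern before committing to a config file, logstash can also take its configuration inline via the -e flag. Something along these lines (a quick sketch; the sample line is made up) lets you feed test lines on STDIN and see the parsed event on STDOUT:

echo "server1 2013-04-01 12:34:56.789 Info: test message" | \
  java -jar logstash-1.1.10-monolithic.jar agent -e '
    input { stdin { type => "marklogic" } }
    filter { grok { type => "marklogic" pattern => "%{HOST:ml_hostname} %{TIMESTAMP_ISO8601:ml_date} %{LOGLEVEL:ml_status}: %{GREEDYDATA:ml_msg}" } }
    output { stdout { debug => true debug_format => "json" } }'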

You can then ingest the logs with a command like this:

for x in server1 server2 server3; do
  ssh ${x} cat /var/opt/MarkLogic/Logs/ErrorLog.txt | sed "s/^/${x} /";
done | java -jar logstash-1.1.10-monolithic.jar agent -f logstash-stdin.conf
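
Once that finishes, a quick search against elasticsearch is an easy way to confirm the events were indexed. logstash writes to date-stamped indices named logstash-YYYY.MM.dd, so a wildcard query (assuming elasticsearch is on its default port) does the trick:

curl 'http://localhost:9200/logstash-*/_search?q=marklogic&pretty=true'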

At this point the logs should be in elasticsearch and ready for searching. Next, grab kibana, set up the required bits, and start it up.

sudo apt-get install bundler # for the Ruby dependencies
wget https://github.com/rashidkpc/Kibana/archive/v0.2.0.tar.gz
tar zxvf v0.2.0.tar.gz 
cd Kibana-0.2.0
bundle install
ruby kibana.rb

Now, point your browser at http://localhost:5601 and you should be able to enjoy your logs.
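
If your elasticsearch instance isn’t on the same box, or port 5601 is already taken, the defaults live in KibanaConfig.rb in the Kibana checkout. From memory the relevant settings look roughly like this, but double-check against your copy of the file:

# KibanaConfig.rb (excerpt) -- values shown are the defaults
Elasticsearch = "localhost:9200"  # where kibana looks for the log events
KibanaPort = 5601                 # port the kibana web UI listens on
KibanaHost = 'localhost'          # interface to bind to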

There is a lot that the logstash configuration can do beyond the simple example above. The grok and mutate filters are quite powerful, but others are also available, so reviewing the list of input, filter and output plugins at the bottom of the docs page might be very useful.
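
One filter worth calling out for MarkLogic logs in particular is multiline, since stack traces span several lines. A sketch like the following (untested; note the pattern has to account for the hostname prefix added by sed) glues any line that doesn’t start with a hostname and timestamp onto the previous event:

filter {
  multiline {
    type => "marklogic"
    # anything that does not look like "hostname timestamp ..." belongs to the previous event
    pattern => "^%{HOST} %{TIMESTAMP_ISO8601}"
    negate => true
    what => "previous"
  }
}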

I hope this helps someone. I will likely integrate both logstash and kibana more fully in the future.

@matthias
