How to Visualize Packetbeat Data using Kibana
Posted by Vineeth Mohan December 27, 2016
In the previous post we saw how to set up, configure, and index network traffic data using Packetbeat, Logstash, and Elasticsearch.
In this post we will see how to visualize the data with the help of Kibana.
Setup
We make a small change to the Logstash configuration file for analytical convenience. To get region information from the IP address in the HTTP logs, we map the ip field to geoip in the Logstash configuration file. Logstash will then replace the ip field of each incoming log with a field named geoip, which holds the location details of the IP address: country, region, geo coordinates, and more.
Here is what the new configuration file will look like:
Now we can start the indexing process just as we did in the previous article on Packetbeat.
Running Kibana
After the data is indexed we can set up Kibana to visualize the data. Here we are using Kibana version 4.3.0. If you have doubts regarding the initial setup and starting of Kibana, you can refer to my previous blog here. In the settings tab of Kibana, please select the index we created just now, like below:
Visualization using Kibana
Let’s start creating visualizations from the data we have. We plan to build a few visualizations that cover the geographical locations of the websites accessed, the request details, the HTTP code distribution over time, the hostnames, and the size of the incoming data in bytes.
Geographical Information
From the Visualize section in Kibana select the Tile Map tab.
The indexed documents contain a geoip field, which holds the location details derived from Packetbeat's ip field. To see all the geographic regions of the sites we have visited, we inspect the geoip.location field. It stores the latitude and longitude of the IP address, and we can run a geohash aggregation on it to plot the results on the tile map.
Following are the settings for the tile map.
In the red box numbered 1, we have numerous options of how to represent the aggregations. Here I have chosen the Heat Map type representation to get a good idea of the locations.
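The tile map is driven by a geohash grid aggregation on geoip.location. As a rough sketch of what an equivalent Elasticsearch request body might look like, the following Python snippet builds it as a dictionary. The function name, the aggregation label "locations", and the precision value of 3 are illustrative assumptions, not values taken from the tile map settings.

```python
import json

def geohash_grid_body(field="geoip.location", precision=3):
    """Build a search body that buckets documents by geohash cell.

    precision (1-12) controls the cell size; 3 is an assumed value.
    """
    return {
        "size": 0,  # return only aggregation buckets, no individual hits
        "aggs": {
            "locations": {  # hypothetical aggregation name
                "geohash_grid": {"field": field, "precision": precision}
            }
        },
    }

if __name__ == "__main__":
    print(json.dumps(geohash_grid_body(), indent=2))
```

A higher precision gives smaller cells and therefore more buckets, which is roughly what changes when the map is zoomed in.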
Total Events Date Histogram
Next we will analyze the total count of requests and responses from the specified http port.
For this, let’s create a date histogram of the counts. We will use an area chart here, so click the area chart tab in the visualization options and apply the following settings:
HTTP Communications
If we want to know what kinds of communication passed through this port, we can use multi-level aggregations, and a stacked chart is a good way to visualize the result. It shows which requests went through the port most often and when, so we can infer which request type dominated at any given time.
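A multi-level aggregation of this kind can be sketched as a date histogram with a terms sub-aggregation inside each time bucket. The snippet below is only an illustration: the split field "method", the one-minute interval, the bucket count, and the aggregation labels are all assumptions. The "interval" parameter matches the Elasticsearch releases contemporary with Kibana 4.3; newer releases use different parameter names.

```python
def stacked_requests_body(split_field="method", interval="1m", top_n=5):
    """Date histogram over @timestamp, split by the top values of split_field.

    split_field, interval and top_n are assumed values for illustration.
    """
    return {
        "size": 0,  # only aggregation results are needed
        "aggs": {
            "over_time": {  # hypothetical label: one bucket per time interval
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": {
                    "by_kind": {  # hypothetical label: stacked series
                        "terms": {"field": split_field, "size": top_n}
                    }
                },
            }
        },
    }
```

Each time bucket then carries its own list of term buckets, which is what the stacked chart draws as layers.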
City, Country, HTTP Code, and Hostname Information
For these parameters, use pie charts for easy understanding.
For example, take the distribution of responses by city, which gives us the top 10 cities by response count. We can use the pie chart option on the visualization page. See the settings for the city-wise distribution pie chart:
HTTP Codes from Respective Countries
Suppose we want to know how the response codes are distributed within each country. We can use stacked pie charts for this purpose. Select the pie chart option from the visualization page and apply the following settings:
From the above diagram we can infer which country the highest number of successful hits came from, and which country had the highest number of failing requests.
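To make the two-level grouping concrete, here is a small local sketch of the same idea in Python: group by country first, then count HTTP codes within each country. The sample documents are made up for illustration, and the field names geoip.country_name and http.code are assumptions about the document layout.

```python
from collections import Counter, defaultdict

def codes_by_country(docs):
    """Group documents by country, then count response codes within each."""
    result = defaultdict(Counter)
    for doc in docs:
        country = doc["geoip"]["country_name"]  # assumed field name
        code = doc["http"]["code"]              # assumed field name
        result[country][code] += 1
    return result

# Made-up sample documents, for illustration only
sample = [
    {"geoip": {"country_name": "India"}, "http": {"code": 200}},
    {"geoip": {"country_name": "India"}, "http": {"code": 404}},
    {"geoip": {"country_name": "India"}, "http": {"code": 200}},
    {"geoip": {"country_name": "Germany"}, "http": {"code": 200}},
]

if __name__ == "__main__":
    for country, codes in codes_by_country(sample).items():
        print(country, dict(codes))
```

In the stacked pie chart, the inner ring corresponds to the outer grouping (country) and the outer ring to the per-country code counts.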
Incoming Data Statistics
Now let us consider one more parameter in our analysis: the size of the incoming data, which is stored in the bytes_in field of each document. We create a histogram that breaks the bytes down into intervals of 100 and plots the document count for each interval. The following are the settings for this in Kibana:
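The bucketing behind such a histogram can be sketched in a few lines of Python: each value is rounded down to the nearest multiple of the interval, and documents are counted per bucket. The sample bytes_in values below are made up for illustration.

```python
from collections import Counter

def histogram_counts(values, interval=100):
    """Count values per bucket; each key is the bucket's lower bound."""
    counts = Counter((v // interval) * interval for v in values)
    return dict(sorted(counts.items()))

# Made-up bytes_in values, for illustration only
sample_bytes_in = [42, 99, 100, 250, 299, 1024]

if __name__ == "__main__":
    print(histogram_counts(sample_bytes_in))
    # prints {0: 2, 100: 1, 200: 2, 1000: 1}
```

With an interval of 100, a document with bytes_in of 250 falls into the bucket keyed 200, which covers values from 200 up to but not including 300.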
Dashboard
So far we have created many visualizations. We now need to combine them into a single dashboard so that we can conveniently see all of them in one window. To do that, click the dashboard section in the header and then click the Add visualization button. The saved visualizations are listed in a dropdown, as shown in the figure:
Each visualization you select appears on the dashboard at a minimal size and can be resized or repositioned as needed. The final dashboard in our case looks like this:
Mappings and Index Name
In most practical cases we will need our own custom mapping for the Packetbeat data. This can be done by adding the mapping to the packetbeat.template.json file (located at /etc/packetbeat/packetbeat.template.json) and then applying it by running the following command in the terminal:
curl -XPUT 'http://localhost:9200/_template/packetbeat' -d@packetbeat.template.json
In real-world scenarios we want the index names to be organized. We can do that by changing the index field in the output section of the Logstash configuration file to the following setting:
index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
This produces index names such as packetbeat-2016.03.01.
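To illustrate how the pattern resolves, the following Python sketch mimics the substitution for a given beat name and day. The helper name index_name is hypothetical. Logstash itself fills in the date from each event's @timestamp, so this only approximates the result for a known date.

```python
from datetime import date

def index_name(beat, day):
    """Mimic "%{[@metadata][beat]}-%{+YYYY.MM.dd}" for a beat name and a day."""
    return "{}-{}".format(beat, day.strftime("%Y.%m.%d"))

if __name__ == "__main__":
    print(index_name("packetbeat", date(2016, 3, 1)))
    # prints packetbeat-2016.03.01
```

Because the date is part of the name, a new index is created for each day, which makes it easy to select or delete data by date range.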
Conclusion
In this post we have seen how to visualize Packetbeat data using the ELK stack and how to draw inferences from it. In the next post we will look in detail at another Beats component, Topbeat, which is used for monitoring system processes.