Collecting threat intel has become an important topic in the information security industry. Unfortunately, this topic is mostly discussed behind closed doors. There could be several reasons why you would like to import data into Elasticsearch, and there are several ways that you can make use of threat intelligence.

This article is not meant as a copy/paste tutorial on how to run your own threat intel program, but rather to get you thinking of all the possibilities on how you can utilize Logstash, Elasticsearch, and Kibana in working with threat intelligence.

The most notable would be to make use of translate filters in Logstash and alert on data that matches entries in blacklists. You can see some examples of this in the following GitHub repo: https://github.com/TravisFSmith/SweetSecurity.

Another strategy would be to ingest the data into Elasticsearch, export it as CSV, and use that CSV file in a translate filter: the filter checks whether the CSV contains a given value and performs an action when it does.

Perhaps you just want to fill Elasticsearch with indicators such as email addresses that could be sending you phishing emails, blacklisted IPs, and so on. This would allow you to search for an IP and make an informed decision on whether connections to or from it could be malicious. In any case, Elasticsearch is useful in this respect.

Using Combine

I’m going to be using an open source tool called Combine. This tool parses the output of several threat intel feeds and blacklists and merges them into one big blacklist of inbound and outbound blacklisted IPs and domains.

To install Combine:

$ cd /opt
$ sudo mkdir threatintel
$ sudo chown `whoami`:`whoami` /opt/threatintel
$ git clone https://github.com/mlsecproject/combine
$ cd combine
$ sudo apt-get install python-dev python-pip python-virtualenv
$ virtualenv venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt

Now you need to create a configuration file for Combine. You can simply copy the example configuration file that ships with it. The file should be called combine.cfg and placed in the combine/ directory.


(venv) $ cp combine-example.cfg combine.cfg

Now you can run the following commands to collect data from several threat intel feeds and blacklists:

(venv) $ python reaper.py
2016-09-30 17:54:55,234 - combine.reaper - INFO - Fetching inbound URLs
2016-09-30 17:55:03,730 - combine.reaper - INFO - Fetching outbound URLs
2016-09-30 17:55:08,219 - combine.reaper - INFO - Storing raw feeds in harvest.json

For the next step, run the following:

(venv) $ python thresher.py
2016-09-30 17:57:23,459 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/email.txt
2016-09-30 17:57:23,459 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/email.txt
2016-09-30 17:57:23,913 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/ftp.txt
2016-09-30 17:57:23,914 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/ftp.txt
2016-09-30 17:57:23,926 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/imap.txt
2016-09-30 17:57:23,926 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/imap.txt
2016-09-30 17:57:23,946 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/ircbot.txt
2016-09-30 17:57:23,946 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/ircbot.txt
2016-09-30 17:57:23,946 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/pop3.txt
2016-09-30 17:57:23,947 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/pop3.txt
2016-09-30 17:57:23,967 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/postfix.txt
2016-09-30 17:57:23,967 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/postfix.txt
2016-09-30 17:57:24,440 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/proftpd.txt
2016-09-30 17:57:24,441 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/proftpd.txt
2016-09-30 17:57:24,452 - combine.thresher - INFO - Evaluating http://www.blocklist.de/lists/sip.txt
2016-09-30 17:57:24,452 - combine.thresher - INFO - Parsing feed from http://www.blocklist.de/lists/sip.txt
2016-09-30 17:57:24,454 - combine.thresher - INFO - Evaluating http://cinsscore.com/list/ci-badguys.txt
2016-09-30 17:57:24,455 - combine.thresher - INFO - Evaluating http://reputation.alienvault.com/reputation.data
2016-09-30 17:57:24,455 - combine.thresher - INFO - Parsing feed from http://reputation.alienvault.com/reputation.data
2016-09-30 17:57:24,712 - combine.thresher - INFO - Evaluating http://dragonresearchgroup.org/insight/sshpwauth.txt
2016-09-30 17:57:24,712 - combine.thresher - INFO - Parsing feed from http://dragonresearchgroup.org/insight/sshpwauth.txt

There is one more step, which is optional: running a program called “winnower.py”. You only need this if you plan on enriching your DNS data, and it requires a paid API key. For now, I’m leaving this step out, although it would be nice to enrich the DNS data.

Now for our last step:

(venv) $ python baler.py 
2016-09-30 18:01:38,361 - combine.baler - INFO - Reading processed data from crop.json
2016-09-30 18:01:41,304 - combine.baler - INFO - Output regular data as CSV to harvest.csv

It is a good idea to put these commands in a script and run it from a cron job, which automates pulling the blacklists. These blacklists get updated regularly, so this is probably something you want to do every day to stay up to date.
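For example, a wrapper script along these lines could be scheduled daily; the script name is an assumption, and the paths come from the install steps above:

```shell
#!/bin/bash
# update-threatintel.sh -- refresh the Combine blacklists
# (hypothetical wrapper; paths assume the install location used above)
cd /opt/threatintel/combine
source venv/bin/activate
python reaper.py && python thresher.py && python baler.py
deactivate
```

A crontab entry such as 0 2 * * * /opt/threatintel/combine/update-threatintel.sh would then run it every night at 02:00.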

This outputs a file called “harvest.csv”, which contains IPs and domains that either try to connect into your network or out of it. This is what the file looks like:

$ tail harvest.csv 
"95.163.107.19","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"95.163.107.42","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"95.163.121.137","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"95.163.121.138","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"95.163.121.252","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"95.173.183.223","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"96.57.23.154","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"96.91.129.246","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"98.23.159.86","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"
"99.248.17.200","IPv4","outbound","https://feodotracker.abuse.ch/blocklist/?download=ipblocklist","","2016-09-30"

It is a good idea to have two separate lists, one for inbound IPs or domains and one for outbound IPs or domains, so that you can alert on them differently.

For example, perhaps you would like to check your squid proxy logs to see if people are connecting to blacklisted domains or IPs from their desktop computers. This could be an indication that they have some malware installed that is trying to connect to its Command and Control Server.
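As a quick illustration of that check, here is a hedged Python sketch: it loads the outbound blacklist into a set and scans squid access-log lines for matches. The function names are mine, and the field positions assume squid’s native log format:

```python
import csv
import io
import re

def load_blacklist(csv_text):
    """Parse Combine-style CSV rows (entity,type,direction,source,notes,date)
    and return the set of blacklisted entities (IPs or domains)."""
    return {row[0] for row in csv.reader(io.StringIO(csv_text)) if row}

def blacklisted_host(log_line, blacklist):
    """Return the blacklisted host found in a squid access.log line, or None.
    Assumes squid's native format, where the request URL is the 7th field."""
    fields = log_line.split()
    if len(fields) < 7:
        return None
    # strip the scheme, path, and port to get the bare host
    host = re.sub(r"^[a-z]+://", "", fields[6]).split("/")[0].split(":")[0]
    return host if host in blacklist else None
```

Feeding each proxy log line through blacklisted_host() and alerting on non-None results approximates what the translate filter does inside Logstash.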

The inbound IPs or domains can then be compared to the IPs that connect to your firewall. To separate the two lists, we’re going to use a very hacky yet efficient method.

We are going to use Logstash to read the CSV produced by Combine. If an entry in harvest.csv is labeled as “outbound”, we output it to a CSV file of its own called outbound.csv. The same logic applies to inbound entries, which we store in inbound.csv. We are using the Logstash file input with the csv filter, together with the Logstash CSV output plugin. I’ve covered the CSV output plugin for Logstash in other blog posts, but in case you missed it, you can install the plugin with:

# cd /opt/logstash
# bin/logstash-plugin install logstash-output-csv

This is what we are going to put in our config file:

input {
  file {
    path => "/opt/threatintel/combine/harvest.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["entity","type","direction","source","notes","date"]
  }
}

output {
  if [direction] == "inbound" {
    csv {
      fields => ["entity","type","direction","source","notes","date"]
      path => "/tmp/inbound.csv"
    }
    stdout {}
  }
  if [direction] == "outbound" {
    csv {
      fields => ["entity","type","direction","source","notes","date"]
      path => "/tmp/outbound.csv"
    }
    stdout {}
  }
}

You can run the config with the following:

$ /opt/logstash/bin/logstash -f combine-output-csv.conf

In case you were wondering, you don’t need to run this as root. You can monitor the progress of your two blacklists as they are created by Logstash by running:

$ tail -f /tmp/inbound.csv
62.24.56.93,IPv4,inbound,http://reputation.alienvault.com/reputation.data,Malicious Host,2016-10-01
103.37.160.45,IPv4,inbound,http://reputation.alienvault.com/reputation.data,Malicious Host,2016-10-01
119.93.37.176,IPv4,inbound,http://reputation.alienvault.com/reputation.data,Malicious Host,2016-10-01
72.22.227.138,IPv4,inbound,http://reputation.alienvault.com/reputation.data,Malicious Host,2016-10-01
50.225.66.186,IPv4,inbound,http://reputation.alienvault.com/reputation.data,Malicious Host,2016-10-01
...

Or by running the following tail on the outbound file:

$  tail -f /tmp/outbound.csv
94.73.155.10,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
94.73.155.11,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
94.73.155.12,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
95.138.160.145,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
95.154.203.249,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
95.163.107.19,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
95.163.107.42,IPv4,outbound,https://feodotracker.abuse.ch/blocklist/?download=ipblocklist,"",2016-10-01
...

Now that you have these two files, you will need to make use of them. This is where Logstash translate filters come into the picture. You can read more about them in our blog post: Introduction to the Logstash Translate Filter.
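For illustration, here is a minimal, hedged sketch of such a filter. Note that the translate filter expects a two-column key/value dictionary, so you would first cut the blacklist down to two columns (for example, entity and direction). The source field name src_ip and the dictionary path are assumptions:

```
filter {
  translate {
    # src_ip is an assumed field name from your own firewall/proxy events
    field => "src_ip"
    # a two-column key/value version of inbound.csv (e.g. entity,direction)
    dictionary_path => "/tmp/inbound-dict.csv"
    destination => "threat_match"
    fallback => "clean"
  }
}
```

Events that come out with threat_match set to anything other than “clean” are candidates for alerting.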

So far, I’ve shown you how to do cool things with threat intel using Logstash, but perhaps you would like to import your threat intel into Elasticsearch directly and make it searchable. That way, when you see funny behavior coming from a certain IP on your firewall, for example, you could search Elasticsearch and see if the IP has been blacklisted recently.


You can use the following config to index the data into Elasticsearch. It is a rough config that could still be improved a great deal, for example, by adding index templates so that certain fields are not analyzed.

input {
  file {
    path => "/opt/threatintel/combine/harvest.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  csv {
    separator => ","
    columns => ["entity","type","direction","source","notes","date"]
  }
}

output {
  if [direction] == "inbound" {
    elasticsearch {
      hosts => "http://localhost:9200"
      index => "combine-inbound-%{+YYYY.MM.dd}"
    }
    stdout {}
  }
  if [direction] == "outbound" {
    elasticsearch {
      hosts => "http://localhost:9200"
      index => "combine-outbound-%{+YYYY.MM.dd}"
    }
    stdout {}
  }
}

To run your config:

$ /opt/logstash/bin/logstash -f combine-to-es-timestamped.conf
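As mentioned above, the config could be improved with an index template so that fields like entity and source are not analyzed. A hedged sketch, assuming the Elasticsearch 2.x mapping syntax that matches this setup (on 5.x and later you would use keyword fields instead):

```shell
# register a template applied to all combine-* indices (ES 2.x syntax assumed)
curl -XPUT 'http://localhost:9200/_template/combine' -d '{
  "template": "combine-*",
  "mappings": {
    "_default_": {
      "properties": {
        "entity":    { "type": "string", "index": "not_analyzed" },
        "direction": { "type": "string", "index": "not_analyzed" },
        "source":    { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'
```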

You can go ahead and create the index patterns in Kibana like this:

[Screenshot: creating the index pattern in Kibana]

You can create the index pattern for the other index the same way. The date-based index names allow you to follow each day’s data as new indices are created.

[Screenshot: the second index pattern in Kibana]

As mentioned earlier in the article, it often makes more sense to use the threat intel from Combine to create blacklists for use in translate filters, as in this example: https://github.com/TravisFSmith/SweetSecurity/blob/master/logstash.conf.

There is sometimes a need to import data from blacklists into something that can be quickly searched and referenced. By importing the inbound and outbound lists into Elasticsearch, you can quickly search the data using Kibana and even build dashboards from your visualizations. You could also, for example, use Kibana to quickly search for all blacklisted domains that end with *.info:

[Screenshot: searching blacklisted *.info domains in Kibana]
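You can run the same kind of lookup programmatically. Below is a hedged Python sketch that builds a query body for the entity field (from the CSV columns above) and posts it to Elasticsearch; the localhost host, the combine-* index pattern, and the function names are assumptions about your setup:

```python
import json
import urllib.request

def build_entity_query(value):
    """Build a query body matching the 'entity' field from the Combine CSV."""
    return {"query": {"match": {"entity": value}}, "size": 10}

def search_combine(value, host="http://localhost:9200", index="combine-*"):
    """POST the query to Elasticsearch and return the parsed JSON response.
    Assumes Elasticsearch is reachable at the given host."""
    req = urllib.request.Request(
        "%s/%s/_search" % (host, index),
        data=json.dumps(build_entity_query(value)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling search_combine("95.163.107.19") against a running cluster would return any matching blacklist entries as parsed JSON.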

Conclusion

Both Logstash and Elasticsearch offer great functionality for working with very large security-related data sets. This article was not meant as a copy/paste tutorial on how to run your own threat intel program, but rather to get you thinking about all the ways you can utilize Logstash, Elasticsearch, and Kibana when working with threat intelligence. Remember to make use of Elasticsearch for monitoring your security-related logs; attackers are hoping that you don’t. Questions/comments? Drop us a line below.