In the previous tutorial in the ElastAlert series, we implemented flatline, frequency, and blacklist rules for ElastAlert alerting via email. Next, we will look at configuring and setting up ElastAlert alerting for the popular cloud-based team collaboration tool Slack.

ElastAlert was developed to automatically query and analyze the log data in Elasticsearch clusters and generate alerts based on easy-to-write rules. The initial goal was to build a comprehensive log management system around that data. It is easy to configure a few basic alerts such as “Send us an email if a user fails login X times in a day” or “Send a Sensu alert if the number of error messages spikes.” The usual requirement, however, is a generic architecture that can suit almost any alerting scenario across any organisation using Elasticsearch. ElastAlert takes a set of “rules”, each of which defines a pattern to match against the data and a specific alert action to take when triggered. For each rule, ElastAlert queries Elasticsearch periodically to grab the relevant data in near real time.
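On Qbox, the global ElastAlert configuration is managed for you and you only supply the rules, but it helps to know what drives this periodic querying. If you were running ElastAlert yourself, a minimal global config.yaml would look roughly like the sketch below (the host, port, and folder values are placeholders, not values from this tutorial):

# Minimal ElastAlert global configuration (sketch for a self-managed setup)
rules_folder: example_rules        # directory containing the rule YAML files
run_every:                         # how often ElastAlert queries Elasticsearch
  minutes: 1
buffer_time:                       # how far back each query window reaches
  minutes: 15
es_host: elasticsearch.example.com # placeholder Elasticsearch host
es_port: 9200
writeback_index: elastalert_status # index where ElastAlert stores its own state
alert_time_limit:                  # how long to keep retrying a failed alert
  days: 2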

ElastAlert is now available on Qbox-provisioned Elasticsearch clusters and is easy to configure. When you provision a cluster, there is a configuration box where you can input your alert rules. If you’re unclear how to structure rules in YAML, be sure to consult the ElastAlert documentation.

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click "Get Started" in the header navigation. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster." 

Our Goal

The goal of this tutorial is to use Qbox as a centralized logging, alerting, and monitoring solution to automatically raise alerts on a Slack channel. We will assume you already have a Slack account set up and running. Qbox provides an out-of-the-box solution for Elasticsearch, Kibana, and many Elasticsearch analysis and monitoring plugins. We will set up Logstash on a separate node or machine to gather the Twitter stream and use the Qbox-provisioned ElastAlert to configure rules and set up alerts for detecting anomalies and inconsistencies in the data.

Our ELK stack setup has four main components:

  • Elasticsearch: Stores all of the application and monitoring logs (provisioned by Qbox).

  • Logstash: The server component that processes incoming logs and feeds them to Elasticsearch.

  • ElastAlert: The superb open-source alerting tool built by the team at Yelp Engineering, now available on all new Elasticsearch clusters on AWS.

  • Kibana (optional): A web interface for searching and visualizing logs (provisioned by Qbox).

Prerequisites

The amount of CPU, RAM, and storage that your Elasticsearch Server will require depends on the volume of logs that you intend to gather. For this tutorial, we will be using a Qbox provisioned Elasticsearch with the following minimum specs:

  • Provider: AWS

  • Version: 5.1.1

  • RAM: 1GB

  • CPU: vCPU1

  • Replicas: 0

The above specs can be changed per your requirements. Please select the appropriate names, versions, and regions for your needs. For this example, we used Elasticsearch version 5.1.1; the most current version is 5.3. We support all versions of Elasticsearch on Qbox. (To learn more about the major differences between 2.x and 5.x, click here.)

In addition to our Elasticsearch server, we will require a separate Logstash server to process the incoming Twitter stream from the Twitter API and ship it to Elasticsearch. For simplicity and testing purposes, the Logstash server can also act as the client server itself. The endpoint and transport addresses for our Qbox-provisioned Elasticsearch cluster are as follows:


Endpoint: REST API

https://ec18487808b6908009d3:efcec6a1e0@eb843037.qb0x.com:32563

Authentication

  • Username = ec18487808b6908009d3

  • Password = efcec6a1e0

TRANSPORT (NATIVE JAVA)

eb843037.qb0x.com:30543

Note: Please make sure to whitelist the Logstash server IP in the Qbox Elasticsearch cluster.

Configure Alerting

Now, let’s create the “rules,” namely:

  • Twitter new term rule of type new_term

  • Twitter change rule of type change

  • Twitter spike rule of type spike

Using these, we will test three types of rules that ElastAlert can manage:

  • The new_term rule matches when a new value, one that has never been seen before, appears in a field.

  • The change rule will monitor a certain field and match if that field changes. The field must change with respect to the last event with the same query_key.

  • The spike rule matches when the volume of events during a given time period is spike_height times larger or smaller than during the previous time period. It uses two sliding windows to compare the current and reference frequency of events (see the short worked example after this list).
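To make the spike arithmetic concrete, take some hypothetical numbers: with spike_height: 3, spike_type: up, and a timeframe of 1 hour, if the reference window (the previous hour) contained 4 matching tweets and the current window contains 15, then 15 ≥ 3 × 4, so the rule matches, provided any threshold_cur or threshold_ref minimums are also satisfied.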

First, we create the new_term rule by configuring the following:

# Alert when a new language is detected with no re-tweets over a period of 1 day.
# (Required)
# Rule name, must be unique
name: Event new term rule
# (Required)
# Type of alert.
type: new_term
# (Required)
# Index to search, wildcard supported
index: twitter-*
# (Required, new_term specific)
# Monitor the field language
fields:
- "language"
# (Optional, new_term specific)
# This means that we will query 1 day's worth of data when ElastAlert starts to find which values of language already exist
# If they existed in the last 1 day, no alerts will be triggered for them when they appear
terms_window_size:
 days: 1
# (Required)
# A list of Elasticsearch filters used for finding events
# These filters are joined with AND and nested in a filtered query
# We are filtering for only "twitter_logs" type documents with retweet_count as 0
filter:
- term:
   _type: "twitter_logs"
- term:
   retweet_count: 0
# (Required)
# The alert is used when a match is found
alert:
- "slack"
slack:
# The <"https://xxxxx.slack.com/services/new/incoming-webhook"> webhook URL that includes your auth data and the ID of the channel (room) you want to post to.
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"
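The Slack alerter also supports a few optional settings for controlling how the message is posted. As a sketch (these keys exist in ElastAlert's Slack alerter, but the values below are only illustrative), a rule could additionally include:

# (Optional) Override the username and channel configured in the webhook
slack_username_override: "elastalert-bot"
slack_channel_override: "#alerts"
# (Optional) Emoji used as the bot icon and color of the message attachment
slack_emoji_override: ":squirrel:"
slack_msg_color: "danger"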

Slack Incoming WebHooks


Slack Webhook Configuration Instructions


Slack Integration Settings


As we can see, this is a straightforward and simple configuration. For the change rule, we configure the following:

# Alert when some field changes between documents
# This rule would alert on documents similar to the following:
# {text: 'qbox is cool', 'user.name': 'Mike', '@timestamp': '2014-10-15T00:00:00'}
# {text: 'qbox is cool', 'user.name': 'Bob', '@timestamp': '2014-10-15T05:00:00'}
# Because the text (query_key) 'qbox is cool' is tweeted by two different users (compare_key) in the same day (timeframe)
# (Required)
# Rule name, must be unique
name: Event change rule
# (Required)
# Type of alert.
# the change rule will alert when a certain field changes in two documents within a timeframe
type: change
# (Required)
# Index to search, wildcard supported
index: twitter-*
# (Required, change specific)
# The field to look for changes in
compare_key: user.name
# (Required, change specific)
# Ignore documents without the compare_key (user.name) field
ignore_null: true
# (Required, change specific)
# The change must occur in two documents with the same query_key
query_key: text
# (Required, change specific)
# The value of compare_key must change in two events that are less than timeframe apart to trigger an alert
timeframe:
 days: 1
# (Required)
# A list of Elasticsearch filters used for finding events
# These filters are joined with AND and nested in a filtered query
filter:
- query:
   query_string:
     query: "document_type: twitter_logs"
# (Required)
# The alert is used when a match is found
alert:
- "slack"
slack:
# The <"https://xxxxx.slack.com/services/new/incoming-webhook"> webhook URL that includes your auth data and the ID of the channel (room) you want to post to.Finally, let’s configure our final <b><code>spike</code></b> rule:
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"

Finally, let’s configure our final spike rule:

# Alert when there is a sudden spike (3 times the previous count) in the volume of matching events within a sliding window of 1 hour
# (Required)
# Rule name, must be unique
name: Event spike rule
# (Required)
# Type of alert.
# the spike rule type compares the number of events within two sliding windows to each other
type: spike
# (Required)
# Index to search, wildcard supported
index: twitter-*
# (Required one of _cur or _ref, spike specific)
# The minimum number of events that will trigger an alert
# For example, if there are only 2 events between 12:00 and 2:00, and 20 between 2:00 and 4:00,
# then _ref is 2 and _cur is 20, and the alert WILL fire because 20 is greater than threshold_cur
# and _ref * spike_height (2 * 3 = 6) is less than _cur
threshold_cur: 5
#threshold_ref: 5
# (Required, spike specific)
# The size of the window used to determine average event frequency
# We use two sliding windows each of size timeframe
# To measure the 'reference' rate and the current rate
timeframe:
 hours: 1
# (Required, spike specific)
# The spike rule matches when the current window contains spike_height times more
# events than the reference window
spike_height: 3
# (Required, spike specific)
# The direction of the spike
# 'up' matches only spikes, 'down' matches only troughs
# 'both' matches both spikes and troughs
spike_type: "up"
# (Required)
# A list of Elasticsearch filters used for finding events
# These filters are joined with AND and nested in a filtered query
filter:
- query:
   query_string:
     query: "text:search"
- type:
   value: "twitter_logs"
# (Required)
# The alert is used when a match is found
alert:
- "slack"
slack:
# The <"https://xxxxx.slack.com/services/new/incoming-webhook"> webhook URL that includes your auth data and the ID of the channel (room) you want to post to.
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"

Thus, the Qbox configuration for alerting is as follows (the three rules are combined into a single configuration, separated by YAML document markers, ---):

name: Event new term rule
type: new_term
index: twitter-*
fields:
- "language"
terms_window_size:
 days: 1
filter:
- term:
   _type: "twitter_logs"
- term:
   retweet_count: 0
alert:
- "slack"
slack:
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"
---
name: Event change rule
type: change
index: twitter-*
compare_key: user.name
ignore_null: true
query_key: text
timeframe:
 days: 1
filter:
- query:
   query_string:
     query: "document_type: twitter_logs"
alert:
- "slack"
slack:
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"
---
name: Event spike rule
type: spike
index: twitter-*
threshold_cur: 5
timeframe:
 hours: 1
spike_height: 3
spike_type: "up"
filter:
- query:
   query_string:
     query: "text:search"
- type:
   value: "twitter_logs"
alert:
- "slack"
slack:
slack_webhook_url: "https://hooks.slack.com/services/T1XXXXXACS/B592XXXKJ/FSgLSXXXXXXXXXw7qNNFDe"
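On Qbox, pasting this combined configuration into the cluster's alerting box is all that is required. If instead you run ElastAlert yourself (for example while developing rules), a rough local workflow looks like the following; the rule file name is hypothetical and the commands assume ElastAlert is installed via pip:

# Install ElastAlert and create its writeback index in Elasticsearch
pip install elastalert
elastalert-create-index

# Dry-run a rule against recent data without sending any alerts
elastalert-test-rule twitter_new_term_rule.yaml

# Run ElastAlert continuously with a single rule and verbose logging
python -m elastalert.elastalert --verbose --rule twitter_new_term_rule.yaml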

Install Logstash

Download and install the Public Signing Key:

wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

We will use Logstash version 2.4.x, which is compatible with our Elasticsearch version 5.1.x. The Elastic Community Product Support Matrix can be consulted to resolve any version compatibility issues.

Add the repository definition to your /etc/apt/sources.list file:

echo "deb https://packages.elastic.co/logstash/2.4/debian stable main" | sudo tee -a /etc/apt/sources.list

Run sudo apt-get update so that the repository is ready for use. Then you can install Logstash with:

sudo apt-get update && sudo apt-get install logstash
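Once the package is installed, you can verify the version (this path assumes the default deb layout, which places Logstash under /opt/logstash):

/opt/logstash/bin/logstash --version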

Alternatively, the Logstash tarball can be downloaded from the Elastic Product Releases site. The steps to set up and run Logstash are then quite simple:

  • Download and unzip Logstash

  • Prepare a logstash.conf config file

  • Run bin/logstash -f logstash.conf -t to check config (logstash.conf)

  • Run bin/logstash -f logstash.conf

Configure Logstash (Twitter Stream)

Logstash configuration files use a JSON-like format and reside in /etc/logstash/conf.d. The configuration consists of three sections: inputs, filters, and outputs.
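As a quick sketch, the overall layout of a pipeline configuration looks like this; the filter section is optional and we will not use one in this tutorial:

input {
  # where events come from, e.g. the twitter input plugin
}
filter {
  # optional per-event transformations, e.g. mutate or grok
}
output {
  # where events are shipped, e.g. the elasticsearch output plugin
}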

We need to be authorized to take data from Twitter via its API. This part is easy:

  1. Login to your Twitter account

  2. Go to https://dev.twitter.com/apps/

  3. Create a new Twitter application (here I give Twitter-Qbox-Stream as the name of the app).


After you successfully create the Twitter application, you get the following parameters in "Keys and Access Tokens":

  1. Consumer Key (API Key)

  2. Consumer Secret (API Secret)

  3. Access Token

  4. Access Token Secret


We are now ready to create the Twitter data path (stream) from the Twitter servers to our machine. We will use the above four parameters (consumer key, consumer secret, access token, access token secret) to configure the twitter input for Logstash.

Let's create a configuration file called 02-twitter-input.conf and set up our "twitter" input:

sudo vi /etc/logstash/conf.d/02-twitter-input.conf

Insert the following input configuration:

input {
 twitter {
   consumer_key => "BCgpJwYPDjXXXXXX80JpU0"
   consumer_secret => "Eufyx0RxslO81jpRuXXXXXXXMlL8ysLpuHQRTb0Fvh2"
   keywords => ["mobile", "java", "android", "elasticsearch", "search"]
   oauth_token => "193562229-o0CgXXXXXXXX0e9OQOob3Ubo0lDj2v7g1ZR"
   oauth_token_secret => "xkb6I4JJmnvaKv4WXXXXXXXXS342TGO6y0bQE7U"
 }
}

Save and quit the file 02-twitter-input.conf.

This specifies a twitter input that will filter tweets containing the keywords "mobile", "java", "android", "elasticsearch", or "search" and pass them on to the output. Lastly, we will create a configuration file called 30-elasticsearch-output.conf:

sudo vi /etc/logstash/conf.d/30-elasticsearch-output.conf

Insert the following output configuration:

output {
 elasticsearch {
   hosts => ["https://eb843037.qb0x.com:32563/"]
   user => "ec18487808b6908009d3"
   password => "efcec6a1e0"
   index => "twitter-%{+YYYY.MM.dd}"
   document_type => "twitter_logs"
 }
 stdout { codec => rubydebug }
}

Save and exit. This output configures Logstash to store the Twitter data in Elasticsearch, which is running at https://eb843037.qb0x.com:32563/, in daily indices named twitter-YYYY.MM.dd.

If you have downloaded the Logstash tar or zip, you can create a single logstash.conf file with the input, filter, and output sections all in one place.

sudo vi LOGSTASH_HOME/logstash.conf

Insert the following input and output configuration in logstash.conf:

input {
 twitter {
   consumer_key => "BCgpJwYPDjXXXXXX80JpU0"
   consumer_secret => "Eufyx0RxslO81jpRuXXXXXXXMlL8ysLpuHQRTb0Fvh2"
   keywords => ["mobile", "java", "android", "elasticsearch", "search"]
   oauth_token => "193562229-o0CgXXXXXXXX0e9OQOob3Ubo0lDj2v7g1ZR"
   oauth_token_secret => "xkb6I4JJmnvaKv4WXXXXXXXXS342TGO6y0bQE7U"
 }
}
output {
 elasticsearch {
   hosts => ["https://eb843037.qb0x.com:32563/"]
   user => "ec18487808b6908009d3"
   password => "efcec6a1e0"
   index => "twitter-%{+YYYY.MM.dd}"
   document_type => "twitter_logs"
 }
 stdout { codec => rubydebug }
}

Test your Logstash configuration with this command:

sudo service logstash configtest

It should display Configuration OK if there are no syntax errors. Otherwise, try and read the error output to see what's wrong with your Logstash configuration.

Restart Logstash and enable it to put our configuration changes into effect:

sudo service logstash restart
sudo update-rc.d logstash defaults 96 9
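To confirm that Logstash started cleanly and is receiving tweets, you can tail its log file (the path below is the default for the deb package; adjust it if your layout differs):

sudo tail -f /var/log/logstash/logstash.log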

If you have downloaded the Logstash tar or zip, it can be run using the following command:

bin/logstash -f logstash.conf

Numerous responses are received once tweets start streaming in. The structure of an indexed document is as follows:

{
 "text": "Learn how to automate anomaly detection on your #Elasticsearch #timeseries data with #MachineLearning:",
 "created_at": "2017-05-07T07:54:47.000Z",
 "source": "<a href="%5C">Twitter for iPhone</a>",
 "truncated": false,
 "language": "en",
 "mention": [],
 "retweet_count": 0,
 "hashtag": [
   {
     "text": "Elasticsearch",
     "start": 49,
     "end": 62
   },
   {
     "text": "timeseries",
     "start": 65,
     "end": 75
   },
   {
     "text": "MachineLearning",
     "start": 88,
     "end": 102
   }
 ],
 "location": {
   "lat": 33.686657,
   "lon": -117.674558
 },
 "place": {
   "id": "74a60733a8b5f7f9",
   "name": "elastic",
   "type": "city",
   "full_name": "San Francisco, CA",
   "street_address": null,
   "country": "United States",
   "country_code": "US",
   "url": "https://api.twitter.com/1.1/geo/id/74a60733a8b5f7f9.json"
 },
 "link": [],
 "user": {
   "id": 2873953509,
   "name": "Elastic",
   "screen_name": "elastic",
   "location": "SF, CA",
   "description": "The company behind the Elastic Stack (#elasticsearch, #kibana, Beats, #logstash), X-Pack, and Elastic Cloud"
 }
}
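Before moving on to the alerts, it is worth confirming that documents are actually landing in the index. A quick way is to hit the _count API on the daily twitter indices, using the cluster credentials shown earlier:

curl -s "https://ec18487808b6908009d3:efcec6a1e0@eb843037.qb0x.com:32563/twitter-*/_count?pretty"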

The simplest rule to test is the new_term rule. With Elasticsearch running and Logstash streaming tweets, the first time a tweet arrives in a language that was not seen during the previous day's terms window (and with a retweet_count of 0, per our filter), an alert is posted to the Slack channel.

And as time passes, we will receive the other alerts as well, such as the change alert.

Conclusion

ElastAlert helps you learn a lot from your data and use it to monitor many critical systems. If you know what you’re looking for, archiving log files and retrieving them manually might be sufficient, but this process is tedious. As your infrastructure scales, so does the volume of log files, and the need for a log management system becomes apparent. Qbox-provisioned Elasticsearch is already very successful for indexing logs, fast retrieval, powerful search tools, great visualizations, and many other purposes. Qbox's built-in support for ElastAlert will help greatly in alerting on anomalies, spikes, or other patterns of interest in Elasticsearch data.

There are several types of included alerters. Of course, as in the example, you can send Slack notifications. You can also open JIRA issues, run arbitrary commands, or execute custom Python code. Each alerter has its own specific options, but there are several that can apply to any type, such as realert, which is the minimum time before sending a subsequent alert for a given rule, and aggregation, which allows you to aggregate all alerts that occur within a timeframe for a rule into a single notification.
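For example, to keep a noisy rule from flooding the Slack channel, options like the following could be added to any of the rules above (a sketch; the specific durations are only illustrative):

# Wait at least 30 minutes before re-alerting on the same rule
realert:
  minutes: 30
# Collect all matches within a 2 hour window into a single alert
aggregation:
  hours: 2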


Give It a Whirl!

It's easy to spin up a standard hosted Elasticsearch cluster on any of our 47 Rackspace, Softlayer, or Amazon data centers. And you can now provision your own AWS Credits on Qbox Private Hosted Elasticsearch.

Questions? Drop us a note, and we'll get you a prompt response.

Not yet enjoying the benefits of a hosted ELK-stack enterprise search on Qbox? We invite you to create an account today and discover how easy it is to manage and scale your Elasticsearch environment in our cloud hosting service.
