How to: HipChat Alerting for Elasticsearch with ElastAlert
Posted by Adam Vanderbush May 27, 2017
In the previous tutorial in the ElastAlert series, we implemented new_term, change and spike rules for ElastAlert alerting via Slack. Next, we look at configuring ElastAlert to send alerts to the popular cloud-based team collaboration tool HipChat.
Many organisations use Elasticsearch to rapidly prototype and launch new search applications, and moving quickly at scale raises challenges. In particular, we often encounter difficulty making changes to query logic without impacting users, as well as finding client library bugs, problems with multi-tenancy, and general reliability issues. As the number of queries grows, the search infrastructure struggles to support the multitude of ways queries are sent to the Elasticsearch cluster. Infrastructure designed for a single team communicating with a single cluster does not scale to tens of teams and tens of clusters.
Indexing in large volumes requires near-instant alerting on anomalies, spikes, or other patterns of interest in Elasticsearch data. If you have data being written into Elasticsearch in near real time and want to be alerted when that data matches certain patterns, ElastAlert is the tool for you.
For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”
ElastAlert is now available on Qbox provisioned Elasticsearch clusters and is easy to configure. When you provision a cluster, there is a configuration box where you can input your alert rules. If you’re unclear how to structure rules in YAML, be sure to consult the ElastAlert documentation.
Our Goal
The goal of this tutorial is to use Qbox as a centralized logging, alerting and monitoring solution that automatically raises alerts in a HipChat room. We assume you already have a HipChat account set up and running. Qbox provides an out-of-the-box solution for Elasticsearch, Kibana and many Elasticsearch analysis and monitoring plugins. We will set up Logstash on a separate node or machine to gather a Twitter stream, and use Qbox provisioned ElastAlert to configure rules and set up alerts that detect anomalies and inconsistencies in the data.
Our ELK stack setup has four main components:
- Elasticsearch: Stores all of the application and monitoring logs (provisioned by Qbox).
- Logstash: The server component that processes incoming logs and feeds them to Elasticsearch.
- ElastAlert: The open-source alerting tool built by the team at Yelp Engineering and now available on all new Elasticsearch clusters on AWS.
- Kibana (optional): A web interface for searching and visualising logs (provisioned by Qbox).
Prerequisites
The amount of CPU, RAM, and storage that your Elasticsearch Server will require depends on the volume of logs that you intend to gather. For this tutorial, we will be using a Qbox provisioned Elasticsearch with the following minimum specs:
- Provider: AWS
- Version: 5.1.1
- RAM: 1GB
- CPU: vCPU1
- Replicas: 0
The above specs can be changed to suit your requirements. Please select the appropriate names, versions, and regions for your needs. For this example, we used Elasticsearch version 5.1.1; the most current version is 5.3. We support all versions of Elasticsearch on Qbox. (To learn more about the major differences between 2.x and 5.x, click here.)
In addition to our Elasticsearch server, we will require a separate Logstash server to process the incoming Twitter stream from the Twitter API and ship it to Elasticsearch. For simplicity and testing purposes, the Logstash server can also act as the client server itself. The Endpoint and Transport addresses for our Qbox provisioned Elasticsearch cluster are as follows:
Endpoint: REST API
https://ec18487808b6908009d3:efcec6a1e0@eb843037.qb0x.com:32563
Authentication
- Username = ec18487808b6908009d3
- Password = efcec6a1e0
TRANSPORT (NATIVE JAVA)
eb843037.qb0x.com:30543
Note: Please make sure to whitelist the Logstash server IP on the Qbox Elasticsearch cluster.
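Once the IP is whitelisted, a quick way to confirm that the endpoint and credentials work is to call the cluster health API. The following is a minimal Python sketch, not part of the Qbox tooling: the helper names build_health_request and fetch_health are hypothetical, and it assumes the cluster accepts HTTP Basic authentication with the username and password listed above.

```python
import base64
import urllib.request


def build_health_request(endpoint, username, password):
    """Build a GET request for /_cluster/health with an HTTP Basic auth header.

    Hypothetical helper for illustration; endpoint is the REST API base URL
    without embedded credentials.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(endpoint.rstrip("/") + "/_cluster/health")
    request.add_header("Authorization", "Basic " + token)
    return request


def fetch_health(request, timeout=10):
    """Send the request and return the raw JSON body as text.

    Needs network access from a whitelisted IP, so it is not called here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

To try it, build the request with the endpoint https://eb843037.qb0x.com:32563 and the credentials above, then pass it to fetch_health from the Logstash machine. An HTTP error or timeout at this point usually means the IP is not yet whitelisted or the credentials are wrong.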
Configure Alerting
Now, let’s create the rules, namely:
- Twitter cardinality rule of type cardinality
- Twitter percentage match rule of type percentage_match
- Twitter single metric aggregation rule of type metric_aggregation
Using these, we will test three types of rules that ElastAlert can manage:
- The cardinality rule matches when the total number of unique values for a certain field within a time frame is higher or lower than a threshold.
- The percentage match rule matches when the percentage of documents in the match bucket within a calculation window is higher or lower than a threshold. By default, the calculation window is buffer_time.
- The metric aggregation rule matches when the value of a metric within the calculation window is higher or lower than a threshold. By default, the calculation window is buffer_time.
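The three conditions above can be sketched as plain threshold checks. This Python sketch only illustrates the semantics described in the list; it is not ElastAlert's implementation, the function names are hypothetical, and treating an empty window as "no match" in the percentage check is an assumption.

```python
def cardinality_matches(values, min_cardinality=None, max_cardinality=None):
    """True when the number of unique values is below min or above max."""
    unique = len(set(values))
    if min_cardinality is not None and unique < min_cardinality:
        return True
    if max_cardinality is not None and unique > max_cardinality:
        return True
    return False


def percentage_matches(match_count, total_count,
                       min_percentage=None, max_percentage=None):
    """True when the share of matching documents is below min or above max."""
    if total_count == 0:
        return False  # assumption: an empty window never matches
    percentage = 100.0 * match_count / total_count
    if min_percentage is not None and percentage < min_percentage:
        return True
    if max_percentage is not None and percentage > max_percentage:
        return True
    return False


def metric_matches(value, min_threshold=None, max_threshold=None):
    """True when the metric is below min or above max (exclusive bounds)."""
    if min_threshold is not None and value < min_threshold:
        return True
    if max_threshold is not None and value > max_threshold:
        return True
    return False
```

For example, cardinality_matches(["en", "es", "en"], min_cardinality=5) is true because only two distinct languages were seen, while metric_matches(5, min_threshold=3, max_threshold=5) is false because the thresholds are exclusive.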
First, we create the cardinality rule with the following configuration:
# Alert when the total number of unique values for a certain field (language) for non-truncated tweets within a time frame (1 hour) is lower than a threshold (5)
# (Required)
# Rule name, must be unique
name: Event Cardinality rule
# (Required)
# Type of alert.
type: cardinality
# (Required)
# Index to search, same pattern as the other rules
index: twitter-*
# (Required, cardinality specific)
# Count the number of unique values for this field
cardinality_field: "language"
# (Required, cardinality specific)
# If the cardinality of the data is lower than this number, an alert will be triggered. The timeframe must have elapsed since the first event before any alerts will be sent.
# Alert when there are less than 5 unique languages
min_cardinality: 5
# If the cardinality of the data is greater than this number, an alert will be triggered.
# Alert when there are more than 10 unique languages
# max_cardinality: 10
# (Required, cardinality specific)
# The cardinality is defined as the number of unique values for the most recent 1 hour
timeframe:
  hours: 1
# (Required)
# A list of Elasticsearch filters used to find events
# These filters are joined with AND and nested in a filtered query
filter:
- term:
    truncated: "false"
# (Required)
# The alert is used when a match is found
alert:
- "hipchat"
hipchat:
# The randomly generated notification token created by HipChat.
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
# The id associated with the HipChat room you want to send the alert to
hipchat_room_id: "33XXX11"
HipChat Accounts Token Generation
As we can see, this is a very straightforward and simple configuration. For the percentage_match config, we configure our rule as follows:
# Alert when fewer than 95% of tweets in any particular language are truncated within buffer_time
# (Required)
# Rule name, must be unique
name: Event Percentage Match rule
# (Required)
# Type of alert
type: percentage_match
index: twitter-*
description: "95% of tweets in any particular language should be truncated"
# (Required)
# A list of Elasticsearch filters used to find events
# These filters are joined with AND and nested in a filtered query
filter:
- term:
    _type: twitter_logs
# default the calculation window
buffer_time:
  minutes: 5
# Group percentage by this field. For each unique value of the query_key field, the percentage will be calculated and evaluated separately against the threshold(s).
query_key: language
doc_type: twitter_logs
match_bucket_filter:
- term:
    truncated: true
# If the percentage of matching documents is less than this number, an alert will be triggered
min_percentage: 95
#max_percentage: 60
# (Required)
# The alert is used when a match is found
alert:
- "hipchat"
hipchat:
# The randomly generated notification token created by HipChat.
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
# The id associated with the HipChat room you want to send the alert to
hipchat_room_id: "33XXX11"
Finally, let’s configure our metric_aggregation rule as follows:
# Alert when average retweet count for a particular user’s tweet is either less than 3 or greater than 5
# (Required)
# Rule name, must be unique
name: Event Twitter Metric Aggregation Rule
# (Required)
# Type of alert
type: metric_aggregation
index: twitter-*
# default the calculation window
buffer_time:
  hours: 1
# name of the field over which the metric value will be calculated
metric_agg_key: retweet_count
# The type of metric aggregation to perform on the metric_agg_key field
metric_agg_type: avg
# Group metric calculations by this field. For each unique value of the query_key field, the metric will be calculated and evaluated separately against the threshold(s).
query_key: user.id
doc_type: twitter_logs
bucket_interval:
  minutes: 5
sync_bucket_interval: true
# If the calculated metric value is greater than this number, an alert will be triggered. This threshold is exclusive.
max_threshold: 5
# If the calculated metric value is smaller than this number, an alert will be triggered. This threshold is exclusive.
min_threshold: 3
# (Required)
# A list of Elasticsearch filters used to find events
# These filters are joined with AND and nested in a filtered query
filter:
- term:
    truncated: true
# (Required)
# The alert is used when a match is found
alert:
- "hipchat"
hipchat:
# The randomly generated notification token created by HipChat.
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
# The id associated with the HipChat room you want to send the alert to
hipchat_room_id: "33XXX11"
Thus, the complete Qbox configuration for alerting, with the three rules separated by ---, should be as follows:
name: Event Cardinality rule
type: cardinality
index: twitter-*
cardinality_field: "language"
min_cardinality: 5
timeframe:
  hours: 1
filter:
- term:
    truncated: "false"
alert:
- "hipchat"
hipchat:
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
hipchat_room_id: "33XXX11"
---
name: Event Percentage Match rule
type: percentage_match
index: twitter-*
description: "95% of tweets in any particular language should be truncated"
filter:
- term:
    _type: twitter_logs
buffer_time:
  minutes: 5
query_key: language
doc_type: twitter_logs
match_bucket_filter:
- term:
    truncated: true
min_percentage: 95
alert:
- "hipchat"
hipchat:
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
hipchat_room_id: "33XXX11"
---
name: Event Twitter Metric Aggregation Rule
type: metric_aggregation
index: twitter-*
buffer_time:
  hours: 1
metric_agg_key: retweet_count
metric_agg_type: avg
query_key: user.id
doc_type: twitter_logs
bucket_interval:
  minutes: 5
sync_bucket_interval: true
max_threshold: 5
min_threshold: 3
filter:
- term:
    truncated: true
alert:
- "hipchat"
hipchat:
hipchat_auth_token: "YfkUkoUnQ8R3mxXXXXPKOf2szt4myllDpokM"
hipchat_room_id: "33XXX11"
Install Logstash
Download and install the Public Signing Key:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
We will use Logstash version 2.4.x, which is compatible with our Elasticsearch version 5.1.x. Refer to the Elastic Community Product Support Matrix to resolve any version questions.
Add the repository definition to your /etc/apt/sources.list file:
echo "deb https://packages.elastic.co/logstash/2.4/debian stable main" | sudo tee -a /etc/apt/sources.list
Run sudo apt-get update and the repository is ready for use. You can install it with:
sudo apt-get update && sudo apt-get install logstash
Alternatively, the Logstash tar can be downloaded from the Elastic Product Releases site. The steps for setting up and running Logstash are then pretty simple:
- Download and unzip Logstash
- Prepare a logstash.conf config file
- Run bin/logstash -f logstash.conf -t to check the config (logstash.conf)
- Run bin/logstash -f logstash.conf
Configure Logstash (Twitter Stream)
Logstash configuration files use a JSON-like format and reside in /etc/logstash/conf.d. The configuration consists of three sections: inputs, filters, and outputs.
We need to be authorized to take data from Twitter via its API. This part is easy:
- Login to your Twitter account
- Go to https://dev.twitter.com/apps/
- Create a new Twitter application (here I give Twitter-Qbox-Stream as the name of the app).
After you successfully create the Twitter application, you get the following parameters in “Keys and Access Tokens”:
- Consumer Key (API Key)
- Consumer Secret (API Secret)
- Access Token
- Access Token Secret
We are now ready to create the Twitter data path (stream) from the Twitter servers to our machine. We will use the above four parameters (consumer key, consumer secret, access token, access token secret) to configure the twitter input for Logstash.
Let’s create a configuration file called 02-twitter-input.conf and set up our “twitter” input:
sudo vi /etc/logstash/conf.d/02-twitter-input.conf
Insert the following input configuration:
input {
  twitter {
    consumer_key => "BCgpJwYPDjXXXXXX80JpU0"
    consumer_secret => "Eufyx0RxslO81jpRuXXXXXXXMlL8ysLpuHQRTb0Fvh2"
    keywords => ["mobile", "java", "android", "elasticsearch", "search"]
    oauth_token => "193562229-o0CgXXXXXXXX0e9OQOob3Ubo0lDj2v7g1ZR"
    oauth_token_secret => "xkb6I4JJmnvaKv4WXXXXXXXXS342TGO6y0bQE7U"
  }
}
Save and quit the file 02-twitter-input.conf.
This specifies a twitter input that filters tweets containing the keywords “mobile”, “java”, “android”, “elasticsearch”, or “search” and passes them to the Logstash output. Lastly, we will create a configuration file called 30-elasticsearch-output.conf:
sudo vi /etc/logstash/conf.d/30-elasticsearch-output.conf
Insert the following output configuration:
output {
  elasticsearch {
    hosts => ["https://eb843037.qb0x.com:32563/"]
    user => "ec18487808b6908009d3"
    password => "efcec6a1e0"
    index => "twitter-%{+YYYY.MM.dd}"
    document_type => "twitter_logs"
  }
  stdout { codec => rubydebug }
}
Save and exit. This output configures Logstash to store the Twitter data in Elasticsearch, which is running at https://eb843037.qb0x.com:32563/, in a daily index named twitter-YYYY.MM.dd. It also prints each event to stdout for debugging.
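To see how the index setting twitter-%{+YYYY.MM.dd} lines up with the index: twitter-* pattern used in the alert rules, here is a small Python sketch. It assumes the date part is taken from the event timestamp in UTC; the function name daily_index is hypothetical and only mimics the naming, it does not call Logstash.

```python
from datetime import datetime, timedelta, timezone
from fnmatch import fnmatch


def daily_index(prefix, timestamp):
    """Return the daily index name for an event timestamp, using the UTC date."""
    utc = timestamp.astimezone(timezone.utc)
    return f"{prefix}-{utc:%Y.%m.%d}"


# The sample tweet created at 2017-05-07T07:54:47Z lands in the index for that day.
tweet_time = datetime(2017, 5, 7, 7, 54, 47, tzinfo=timezone.utc)
index_name = daily_index("twitter", tweet_time)

# An event late in the evening in UTC-7 already belongs to the next UTC day.
late_local = datetime(2017, 5, 7, 23, 30, tzinfo=timezone(timedelta(hours=-7)))
next_day_index = daily_index("twitter", late_local)

# Both names are covered by the wildcard pattern used in the rules.
covered = all(fnmatch(name, "twitter-*") for name in (index_name, next_day_index))
```

Because every daily index shares the twitter- prefix, a single wildcard pattern in each rule covers all of them, including the index that is created when the UTC date rolls over.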
If you downloaded the Logstash tar or zip, you can instead create a single logstash.conf file with the input, filter and output sections all in one place.
sudo vi LOGSTASH_HOME/logstash.conf
Insert the following input and output configuration in logstash.conf:
input {
  twitter {
    consumer_key => "BCgpJwYPDjXXXXXX80JpU0"
    consumer_secret => "Eufyx0RxslO81jpRuXXXXXXXMlL8ysLpuHQRTb0Fvh2"
    keywords => ["mobile", "java", "android", "elasticsearch", "search"]
    oauth_token => "193562229-o0CgXXXXXXXX0e9OQOob3Ubo0lDj2v7g1ZR"
    oauth_token_secret => "xkb6I4JJmnvaKv4WXXXXXXXXS342TGO6y0bQE7U"
  }
}
output {
  elasticsearch {
    hosts => ["https://eb843037.qb0x.com:32563/"]
    user => "ec18487808b6908009d3"
    password => "efcec6a1e0"
    index => "twitter-%{+YYYY.MM.dd}"
    document_type => "twitter_logs"
  }
  stdout { codec => rubydebug }
}
Test your Logstash configuration with this command:
sudo service logstash configtest
It should display Configuration OK if there are no syntax errors. Otherwise, try and read the error output to see what’s wrong with your Logstash configuration.
Restart Logstash, and enable it, to put our configuration changes into effect:
sudo service logstash restart
sudo update-rc.d logstash defaults 96 9
If you downloaded the Logstash tar or zip, run it with the following command:
bin/logstash -f logstash.conf
Tweets now start streaming in. The structure of each document is as follows:
{
  "text": "Learn how to automate anomaly detection on your #Elasticsearch #timeseries data with #MachineLearning:",
  "created_at": "2017-05-07T07:54:47.000Z",
  "source": "<a href=\"%5C\">Twitter for iPhone</a>",
  "truncated": false,
  "language": "en",
  "mention": [],
  "retweet_count": 0,
  "hashtag": [
    { "text": "Elasticsearch", "start": 49, "end": 62 },
    { "text": "timeseries", "start": 65, "end": 75 },
    { "text": "MachineLearning", "start": 88, "end": 102 }
  ],
  "location": { "lat": 33.686657, "lon": -117.674558 },
  "place": {
    "id": "74a60733a8b5f7f9",
    "name": "elastic",
    "type": "city",
    "full_name": "San Francisco, CA",
    "street_address": null,
    "country": "United States",
    "country_code": "US",
    "url": "https://api.twitter.com/1.1/geo/id/74a60733a8b5f7f9.json"
  },
  "link": [],
  "user": {
    "id": 2873953509,
    "name": "Elastic",
    "screen_name": "elastic",
    "location": "SF, CA",
    "description": "The company behind the Elastic Stack (#elasticsearch, #kibana, Beats, #logstash), X-Pack, and Elastic Cloud"
  }
}
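The rules defined earlier read only a handful of these fields: language, truncated, retweet_count, and the nested user.id used as a query_key. As a rough Python sketch of how a dotted key such as user.id maps onto the nested document, consider the helper below. The name lookup is hypothetical and is not ElastAlert's own code; the sample document is a trimmed copy of the one above.

```python
def lookup(document, dotted_key):
    """Resolve a dotted key such as 'user.id' against a nested dict.

    Returns None when any part of the path is missing.
    """
    current = document
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Trimmed copy of the sample tweet, keeping only the fields the rules use.
tweet = {
    "truncated": False,
    "language": "en",
    "retweet_count": 0,
    "user": {"id": 2873953509, "screen_name": "elastic"},
}

# Fields referenced by the cardinality, percentage_match and
# metric_aggregation rules respectively.
rule_fields = {
    key: lookup(tweet, key)
    for key in ("language", "truncated", "retweet_count", "user.id")
}
```

If one of these lookups returns None for your own documents, the corresponding rule has nothing to group or aggregate on, so it is worth checking the field names in Kibana before waiting for alerts.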
The simplest rule to test is the metric_aggregation rule. All we have to do is wait for about 5 minutes with Elasticsearch running and Logstash stopped, so that no documents are streaming. After the wait, an alert arrives in our HipChat room.
As time passes, we will receive other alerts as well, such as the percentage_match alert.
Conclusion
ElastAlert helps you learn a lot from your data and use it to monitor many critical systems. If you know what you’re looking for, archiving log files and retrieving them manually might be sufficient, but this process is tedious. As your infrastructure scales, so does the volume of log files, and the need for a log management system becomes apparent. Qbox provisioned Elasticsearch is already very successful for indexing logs, fast retrieval, powerful search tools, great visualizations and many other purposes. Qbox’s built-in support for ElastAlert helps greatly in alerting on anomalies, spikes, or other patterns of interest in Elasticsearch data.
There are several types of included alerters. Besides HipChat, as in this example, you can send emails, open JIRA issues, run arbitrary commands, and execute custom Python code. Each alerter has its own specific options, but several options apply to any type, such as realert, which is the minimum time before sending a subsequent alert for a given rule, and aggregation, which lets you combine all alerts that occur within a timeframe for a rule into one.
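The effect of realert can be pictured as a simple per-rule cooldown. The Python sketch below is an illustration of that idea under simplifying assumptions, not ElastAlert's implementation: the class name RealertFilter is hypothetical, and it tracks the last alert per rule name only.

```python
from datetime import datetime, timedelta


class RealertFilter:
    """Suppress repeat alerts for a rule until the realert period has passed."""

    def __init__(self, realert):
        self.realert = realert   # minimum time between alerts for one rule
        self.last_sent = {}      # rule name -> time of the last alert sent

    def should_send(self, rule_name, now):
        last = self.last_sent.get(rule_name)
        if last is not None and now - last < self.realert:
            return False         # still inside the cooldown window
        self.last_sent[rule_name] = now
        return True


# Example: with a 10 minute cooldown, a second match 5 minutes later is dropped.
cooldown = RealertFilter(timedelta(minutes=10))
start = datetime(2017, 5, 27, 12, 0)
first = cooldown.should_send("Event Cardinality rule", start)
second = cooldown.should_send("Event Cardinality rule", start + timedelta(minutes=5))
third = cooldown.should_send("Event Cardinality rule", start + timedelta(minutes=10))
```

A cooldown like this keeps a noisy rule from flooding the HipChat room while still letting a different rule alert immediately.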