A comprehensive log management and analysis strategy is vital, enabling organizations to understand the relationship between operational, security, and change management events and maintain a comprehensive understanding of their infrastructure. Log files from web servers, applications, and operating systems also provide valuable data, though in different formats, and in a random and distributed fashion.

No real-world web application can exist without a data storage backend, and most applications today use relational database management systems (RDBMS) for storing and managing data. The most commonly used database is MySQL, which is an open-source RDBMS that is the ‘M’ in the open-source enterprise LAMP Stack (Linux, Apache, MySQL and PHP).

Middle and large-sized applications send multiple database queries per second, and slow queries are often the cause of slow page loading and even crashes. The task of analyzing query performance is critical to determine the root cause of these bottlenecks, and most databases come with built-in profiling tools to help us.

Provisioning an Elasticsearch cluster in Qbox is easy. In this article, we walk you through the initial steps and show you how simple it is to start and configure your cluster. We then install and configure logstash to ship our MySQL or MariaDB/Galera logs to Elasticsearch. MySQL logs shipped to elasticsearch can then be visualized and analyzed via Kibana dashboards.

Our Goal

The goal of the tutorial is to use Qbox as a Centralised Logging and Monitoring solution. Qbox provides out of box solution for Elasticsearch, Kibana and many of Elasticsearch analysis and monitoring plugins. We will set up Logstash in a separate node or machine to gather MySQL or MariaDB/Galera logs from single or multiple servers, and use Qbox’s provisioned Kibana to visualize the gathered logs.

Our ELK stack setup has three main components:

  • Elasticsearch: It is used to store all of the application and monitoring logs(Provisioned by Qbox).

  • Logstash: The server component that processes incoming logs and feeds to ES.

  • Kibana: A web interface for searching and visualizing logs (Provisioned by Qbox).

For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click "Get Started" in the header navigation. If you need help setting up, refer to "Provisioning a Qbox Elasticsearch Cluster."

Prerequisites

The amount of CPU, RAM, and storage that your Elasticsearch Server will require depends on the volume of logs that you intend to gather. For this tutorial, we will be using a Qbox provisioned Elasticsearch with the following minimum specs:

  • Provider: AWS

  • Version: 5.1.1

  • RAM: 1 GB

  • CPU: vCPU1

  • Replicas: 0

The above specs can be changed per your desired requirements. Please select the appropriate names, versions, regions for your needs. For this example, we used Elasticsearch version 5.1.1, the most current version is 5.3. We support all versions of Elasticsearch on Qbox. (To learn more about the major differences between 2.x and 5.x, click here.)

In addition to our Elasticsearch Server, we will require a separate logstash server to process incoming MySQL or MariaDB/Galera logs from client servers and ship them to Elasticsearch. There can be a single or multiple client servers for which you wish to ship logs to Elasticsearch. For simplicity or testing purposes, the logstash server can also act as the client server itself. The Endpoint and Transport addresses for our Qbox provisioned Elasticsearch cluster are as follows:

common_1.png

Endpoint: REST API

https://ec18487808b6908009d3:efcec6a1e0@eb843037.qb0x.com:32563

Authentication

  • Username = ec18487808b6908009d3

  • Password = efcec6a1e0

TRANSPORT (NATIVE JAVA)

eb843037.qb0x.com:30543

Note: Please make sure to whitelist the logstash server IP from Qbox Elasticsearch cluster. Also, the elasticsearch server must have access to all client servers to collect MySQL logs from.

Installing MySQL

If you’ve already got MySQL installed, you can skip to the next step. If not, enter the following command to update your system:

sudo apt-get update

And then install MySQL:

sudo apt-get install mysql-server

During installation, you’ll need to set the root password. Make note of it, as you’ll be needing it whenever logging into mysql server.

Configure MySQL to Write Log Files

We need to configure MySQL to write general and slow query log files because those configurations are disabled by default. Then, we need to set the threshold for creating logs.

To do this, first open the ‘my.cnf’ configuration file:

sudo vim /etc/mysql/my.cnf

Next, uncomment and edit the following lines:

general_log_file = /var/log/mysql/mysql.log
general_log= 1
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 1
log-queries-not-using-indexes = 1

Make sure to restart MySQL after making these changes:

sudo service mysql restart

Install Logstash

Download and install the Public Signing Key:

wget -qO - <a href="https://packages.elastic.co/GPG-KEY-elasticsearch">https://packages.elastic.co/GP...</a> | sudo apt-key add -

We will use the Logstash version 2.4.x as compatible with our Elasticsearch version 5.1.x. The Elastic Community Product Support Matrix can be referred in order to clear any version issues.

Add the repository definition to your /etc/apt/sources.list file:

echo "deb <a href="https://packages.elastic.co/logstash/2.4/debian">https://packages.elastic.co/lo...</a> stable main" | sudo tee -a /etc/apt/sources.list

Run sudo apt-get update and the repository is ready for use. You can install it with:

sudo apt-get update && sudo apt-get install logstash

Alternatively, logstash tar can also be downloaded from Elastic Product Releases Site. Then, the steps of setting up and running logstash are pretty simple:

  • Download and unzip Logstash

  • Prepare a logstash.conf config file

  • Run bin/logstash -f logstash.conf -t to check config (logstash.conf)

  • Run bin/logstash -f logstash.conf

Configure Logstash

Logstash configuration files are in the JSON-format, and reside in /etc/logstash/conf.d. The configuration consists of three sections: inputs, filters, and outputs.

Let's create a configuration file called 02-mysql-input.conf and set up our "mysql" input:

sudo vi /etc/logstash/conf.d/02-mysql-input.conf

Insert the following input configuration:

input {
  file {
    path => ["/var/log/mysql/mysql.log", “/var/log/mysql/mysql-slow.log”, "/var/log/mysql/mysql_error.log"]
    type => "mysql"
  }
}

NOTE : MySQL log files path may differ based upon your environment, logging configuration and underlying OS.

Save and quit. This specifies a mysql input that will listen on tcp port 5044. Now let's create a configuration file called 10-mysql-filter.conf, where we will add a filter for mysql messages:

sudo vi /etc/logstash/conf.d/10-mysql-filter.conf

Insert the following mysql filter configuration:

filter {
   # Deal with the multi-line MySQL/MariaDB log entries
   if [type] == "mysql" {
       multiline {
           pattern => "^%{NUMBER} *%{NUMBER}"
           negate  => true
           what    => "previous"
       }
       # Strip out MySQLs messy timestamps
       grok {
           match => [ 'message', "(?m)^%{NUMBER:date} *%{NOTSPACE:time} %{GREEDYDATA:message}" ]
           overwrite => [ 'message' ]
           add_field => { "mysql_time" => "%{date} %{time}" }
       }
       # Check if we have a severity field
       if [message] =~ /^\[/ {
           grok {
               match => [ 'message', "(?m)\[%{NOTSPACE:severity}\] %{GREEDYDATA:message}" ]
               overwrite => [ 'message' ]
           }
       }
       # If the message contains WSREP:, it regards the galera cluster
       if  [message] =~ /WSREP:/ {
           mutate {
               replace => [ "type", "Galera" ]
           }
           grok {
               match => [ 'message', "(?m)WSREP: %{GREEDYDATA:message}" ]
               overwrite => [ 'message' ]
           }
       }
   }
}

Save and quit. This filter looks for logs that are labeled as "mysql" type and it will try to use grok to parse incoming MySQL logs to make it structured and queryable.

Lastly, we will create a configuration file called 30-elasticsearch-output.conf:

sudo vi /etc/logstash/conf.d/30-elasticsearch-output.conf

Insert the following output configuration:

output {
 elasticsearch {
   hosts => ["https://eb843037.qb0x.com:32563/"]
   user => "ec18487808b6908009d3"
   password => "efcec6a1e0"
   index => "mysql-%{+YYYY.MM.dd}"
   document_type => "mysql_logs"
 }
 stdout { codec => rubydebug }
}

Save and exit. This output basically configures Logstash to store the logs data in Elasticsearch which is running at https://eb843037.qb0x.com:30024/, in an index named after the MySQL.

If you have downloaded logstash tar or zip, you can create a logstash.conf file having input, filter and output all in one place.

sudo vi LOGSTASH_HOME/logstash.conf

Insert the following input, filter and output configuration in logstash.conf:

input {
 file {
   path => ["/var/log/mysql/mysql.log", "/var/log/mysql/mysql-slow.log", "/var/log/mysql/mysql_error.log"]
   type => "mysql"
 }
}
filter {
   # Deal with the multi-line MySQL/MariaDB log entries
   if [type] == "mysql" {
       multiline {
           pattern => "^%{NUMBER} *%{NUMBER}"
           negate  => true
           what    => "previous"
       }
       # Strip out MySQLs messy timestamps
       grok {
           match => [ 'message', "(?m)^%{NUMBER:date} *%{NOTSPACE:time} %{GREEDYDATA:message}" ]
           overwrite => [ 'message' ]
           add_field => { "mysql_time" => "%{date} %{time}" }
       }
       # Check if we have a severity field
       if [message] =~ /^\[/ {
           grok {
               match => [ 'message', "(?m)\[%{NOTSPACE:severity}\] %{GREEDYDATA:message}" ]
               overwrite => [ 'message' ]
           }
       }
       # If the message contains WSREP:, it regards the galera cluster
       if  [message] =~ /WSREP:/ {
           mutate {
               replace => [ "type", "Galera" ]
           }
           grok {
               match => [ 'message', "(?m)WSREP: %{GREEDYDATA:message}" ]
               overwrite => [ 'message' ]
           }
       }
   }
}
output {
 elasticsearch {
   hosts => ["https://eb843037.qb0x.com:32563/"]
   user => "ec18487808b6908009d3"
   password => "efcec6a1e0"
   index => "mysql-%{+YYYY.MM.dd}"
   document_type => "mysql_logs"
 }
 stdout { codec => rubydebug }
}

If you want to add filters for other applications that use the MySQL input, be sure to name the files so they sort between the input and the output configuration (i.e. between 02- and 30-).

Test your Logstash configuration with this command:

sudo service logstash configtest

It should display Configuration OK if there are no syntax errors. Otherwise, try and read the error output to see what's wrong with your Logstash configuration.

Restart Logstash, and enable it, to put our configuration changes into effect:

sudo service logstash restart
sudo update-rc.d logstash defaults 96 9

If you have downloaded logstash tar or zip, it can be run using following command

bin/logstash -f logstash.conf

Next, we'll load the sample Kibana dashboards.

Load Kibana Dashboards

Screen Shot 2017-04-15 at 12.47.04 PM.png

When you are finished setting Logstash server to collect logs from client servers, let's look at Kibana, the web interface provisioned by Qbox. Kibana User interface can be used for filtering, sorting, discovering and visualizing logs that are stored in Elasticsearch. Go ahead and click on Visualize data with Kibana from your cluster configuration dashboard.

Screen Shot 2017-04-15 at 12.47.17 PM.png

Go ahead and select [mysql]-YYY.MM.DD from the Index Patterns menu (left side), then click the Star (Set as default index) button to set the MySQL index as the default.

Now click the Discover link in the top navigation bar. By default, this will show you all of the log data over the last 15 minutes. You should see a histogram with log events, with log messages below:

Screen Shot 2017-04-15 at 12.47.32 PM.png

Right now, there won't be much in there because you are only gathering MySQL logs from your client servers. Here, you can search and browse through your logs. You can also customize your dashboard.

Try the following things:

  • Search for MySQL log file types in Kibana: ‘mysql’, ‘mysql_error’ and ‘mysql_slow_query’.

  • Search for a particular hostname (search for host: "hostname")

  • Change the time frame by selecting an area on the histogram or from the menu above

  • Create visualizations and populate them in specific dashboards.

a5_mysql_1.gif

Once your logs have arrived, you can begin to use Kibana to query Elasticsearch, filter the logs based on your needs, and save your searches to create visualizations.

Conclusion

Analyzing MySQL logs is very critical considering performance of overall application. Queries that take longer to process than a predefined length of time will be logged in the MySQL slow query log. The error log also contains useful information, reporting when MySQL was started or stopped and when critical errors occurred. But as with any type of log data, log analysis and management are tasks that challenge even the most experienced of teams, especially in enterprise production environments that produce log files containing thousands of entries every day.

Qbox provisioned Elasticsearch makes it very easy for us to visualize centralized logs using Logstash and Kibana. Remember that we can send pretty much any type of log or indexed data to Logstash, but the data becomes even more useful if it is parsed and structured with grok.

What do we look for in centralized logging? As it happens, many things, but the most important are as follows.

  • A way to parse data and send them to a central database in near real-time.

  • The capacity of the database to handle near real-time data querying and analytics.

  • A visual representation of the data through filtered tables, dashboards, and so on.

The ELK stack (Elasticsearch, Logstash, and Kibana) can do all that and it can easily be extended to satisfy the particular needs we’ll set in front of us.

Other Helpful Tutorials

Give It a Whirl!

It's easy to spin up a standard hosted Elasticsearch cluster on any of our 47 Rackspace, Softlayer, Amazon, or Microsoft Azure data centers. And you can now provision your own AWS Credits on Qbox Private Hosted Elasticsearch

Questions? Drop us a note, and we'll get you a prompt response.

Not yet enjoying the benefits of a hosted ELK stack enterprise search on Qbox? We invite you to create an account today and discover how easy it is to manage and scale your Elasticsearch environment in our cloud hosting service.

comments powered by Disqus