Data has become a strategic asset for any organization in the modern digital age, and data breaches can lead to serious financial losses and legal consequences, especially if customers’ personal data is affected.

Nevertheless, many companies fail to adopt proper data protection policies.

In the world of Elasticsearch, such negligence has led to serious security breaches in which attackers exploited unprotected Elasticsearch clusters exposed to the public web, affecting thousands of companies. The best-known incidents are the 2017 ransom attacks on ES databases and the more recent 2020 “Meow” attack, which deleted all indexes on the clusters it reached.

In earlier versions of Elasticsearch, security features were available only to users of paid subscriptions. However, this changed in Elasticsearch 6.8.0 and 7.1.0, when Elastic made many previously paid features free, including:

  • TLS for encrypted communications
  • File and native realm for creating and managing users
  • Role-based access control for managing user access to cluster indexes and APIs 

Now that these security features are free, Elasticsearch users no longer have an excuse for leaving their clusters unsecured.

In this article, we’ll discuss best practices for configuring the security of your production Elasticsearch clusters. In particular, we’ll focus on such useful security features as basic authentication, TLS encryption, IP filtering, authorization, and others. We’ll also discuss how Qbox enables many of these security features by default in our hosted Elasticsearch offering.

Basic Protection against Hacking Attacks

Recent hacker attacks against Elasticsearch targeted unprotected clusters accessible over public IPs. Such clusters can be found with search engines like Shodan, which index open databases and other devices connected to the internet. Malware or individual hackers can simply scan the internet for the default Elasticsearch port 9200 and send malicious requests to the public IP.

Thus, especially if your Elasticsearch cluster does not have basic auth enabled, the most obvious rule is to avoid serving Elasticsearch on public IPs accessible over the internet. Ideally, run Elasticsearch inside a private network, such as a VPN protected by a firewall.
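As a rough pre-deployment check, the sketch below classifies the address you plan to bind Elasticsearch to. It is only an illustration: the helper name classify_bind_address is hypothetical, and it handles literal IP addresses only, not special values such as _local_ or _site_.

```python
import ipaddress


def classify_bind_address(host: str) -> str:
    """Classify a literal IP address intended for network.host.

    Hypothetical helper for illustration; not part of Elasticsearch.
    """
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:
        # 0.0.0.0 or :: binds every interface, including public ones.
        return "all-interfaces"
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"


for candidate in ["127.0.0.1", "10.0.0.5", "0.0.0.0", "8.8.8.8"]:
    print(candidate, "->", classify_bind_address(candidate))
```

Anything classified as "public" or "all-interfaces" deserves a second look before the node goes live.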

Proxy Client Requests to Elasticsearch

Users of web applications should not be able to directly access Elasticsearch with their client requests. Ideally, clients should communicate with your server-side software that can transform their requests into corresponding Elasticsearch queries and execute them. 

Your server-side software can also be used to validate user credentials and roles before allowing users access to specific indexes. Such an approach can prevent malicious requests from hitting your Elasticsearch indexes and block unauthorized access to Elasticsearch data.
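To make the idea concrete, here is a minimal sketch of such a server-side layer. Everything in it is assumed for illustration: the role names, the index names, the title field, and the length limit are hypothetical and would come from your own application.

```python
import json

# Hypothetical mapping of application roles to the indexes they may search.
ROLE_INDEXES = {
    "support": {"tickets"},
    "analyst": {"tickets", "orders"},
}

MAX_TERM_LENGTH = 100  # assumed limit; tune for your application


def build_search(role: str, index: str, term: str) -> dict:
    """Return the path and body of a search request, or raise on bad input."""
    if index not in ROLE_INDEXES.get(role, set()):
        raise PermissionError(f"role {role!r} may not search {index!r}")
    term = term.strip()
    if not term or len(term) > MAX_TERM_LENGTH:
        raise ValueError("search term is empty or too long")
    # The user's text is only ever used as the value of a match query;
    # it never becomes part of the URL or of the query structure.
    body = {"query": {"match": {"title": term}}, "size": 20}
    return {"path": f"/{index}/_search", "body": body}


request = build_search("support", "tickets", "login error")
print(request["path"])
print(json.dumps(request["body"]))
```

The client never chooses the endpoint or the query shape, so it cannot reach delete or admin APIs through this layer.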

Use Linux Containers for Isolation

If properly configured, Linux containers provide a powerful way to isolate Elasticsearch from malicious environments. Containers are self-contained images that encapsulate Elasticsearch binaries, configuration, and sensitive data while providing access to OS resources (storage, RAM, compute) via the container runtime (e.g., Docker). Also, if you run Elasticsearch in containers on Kubernetes, you can benefit from production-grade container orchestration and automation services (upgrades, health checks, autoscaling) for your Elasticsearch deployments.

Enabling Authentication 

Basic authentication is usually enough to protect Elasticsearch clusters against automated attacks like the 2020 “Meow” attack, which targeted unprotected ES clusters.

By default, authentication is disabled under the Elasticsearch basic and trial licenses. You can enable it by setting xpack.security.enabled: true in the elasticsearch.yml file. This setting also activates other free security features provided by Elasticsearch. After restarting Elasticsearch, users will have to specify a username and password to access the cluster.

The next important step is to create passwords for built-in users that perform different administrative roles. These users include apm_system, beats_system, elastic, kibana_system, logstash_system, and remote_monitoring_user.

To create passwords for them, you can use the interactive bash script named ‘elasticsearch-setup-passwords’ that ships with the Elasticsearch installation. You can find it in the Elasticsearch bin directory and launch it in interactive mode from the terminal (see the image below).

[Image: Elasticsearch auth]

Make sure to record all the passwords you created, because some of them will be needed later.
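Once the passwords are set, one way to confirm that authentication works is to call the authenticate API with HTTP basic auth. The sketch below only builds the request; the URL and the password are placeholders you would replace with your own values.

```python
import base64
import urllib.request


def authenticated_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to the authenticate API."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/_security/_authenticate",
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Placeholder URL and password; substitute the values for your cluster.
req = authenticated_request("https://localhost:9200", "elastic", "your_password")
print(req.get_method(), req.full_url)
# To send it against a running cluster: urllib.request.urlopen(req)
```

A request without the Authorization header should be rejected once security is enabled.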

Configure Kibana Authentication

After the Elasticsearch authentication is enabled, users must log in to Kibana with a valid username and password.

So that Kibana itself can connect to the secured cluster, add the kibana_system password you created in the interactive dialogue to the Kibana configuration file, kibana.yml:

elasticsearch.username: "kibana_system"
elasticsearch.password: "your_password"

Alternatively, you can add these settings to the Kibana keystore:

./bin/kibana-keystore create
./bin/kibana-keystore add elasticsearch.username
./bin/kibana-keystore add elasticsearch.password

When you next access Kibana, you will be prompted to enter your username and password.

Create Users with Native Realm

Once you have created built-in users, you can configure authentication for all users you want to allow access to Elasticsearch. The Elastic Stack supports various types of authentication including the basic (native) authentication, LDAP, PKI, SAML, or Kerberos. 

Native realm authentication is a free feature in Elasticsearch 6.8.0 and later 6.x releases and in 7.1.0 and later, so let’s discuss how to configure users with it. The easiest way to create users is from the Kibana dashboard.

You’ll need to log in to Kibana as the built-in ‘elastic’ user and then go to Stack Management > Security > Users (see the image below).

Then, click the “Create User” button to open the “New user” dialogue where you can enter all the required user details (see the image below).
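If you prefer scripting to the Kibana UI, native realm users can also be created through the security API. The following sketch builds such a request without sending it; the username, password, and role shown are made-up examples, and the request would still need credentials for a user allowed to manage security.

```python
import json
import urllib.request


def create_user_request(base_url: str, username: str, password: str, roles: list) -> urllib.request.Request:
    """Build (but do not send) a request that creates a native realm user."""
    body = {"password": password, "roles": roles}
    return urllib.request.Request(
        url=f"{base_url}/_security/user/{username}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical user and role, for illustration only.
req = create_user_request(
    "https://localhost:9200", "test_user", "a-long-passphrase", ["kibana_admin"]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```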

Enabling Authorization

Authorization allows controlling user access to specific resources in the Elasticsearch cluster. To enable authorization in earlier Elasticsearch versions, you had to specify complex filtering rules using a proxy like Nginx. Such an approach is flawed because filters cannot cover all possible use cases and the Elasticsearch API is frequently updated. Fortunately, more recent versions of Elasticsearch allow configuring authorization easily from Kibana. 

By default, Elasticsearch users can change only their own passwords and get certain information about themselves. An Elasticsearch administrator can widen the scope of user rights in the cluster using built-in or custom roles.

Each role defines a set of actions (e.g., read, delete) that can be performed on specific resources (indices, documents, fields, clusters). There are built-in roles you can access from Kibana at Stack Management > Security > Roles (see the image below).

You can select one or more of these roles and assign them to the test user we created above.

As you see, we granted four roles to our test user including Kibana Admin. On the next login, the test user will be able to manage Kibana and Elasticsearch but won’t be able to manage other users (because only a superuser can do this).
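When the built-in roles are too broad, you can define a custom role. As a sketch, the function below returns a role body that grants read-only access to a set of index patterns; the pattern logs-* is an assumed example, and the body would be sent with PUT to /_security/role/<role_name> by a user allowed to manage security.

```python
import json


def read_only_role(index_patterns: list) -> dict:
    """Return a role definition granting read-only access to the given indexes."""
    return {
        # Cluster-level privilege: read-only monitoring operations.
        "cluster": ["monitor"],
        "indices": [
            {
                "names": index_patterns,
                # Search and get documents, and see mappings and settings.
                "privileges": ["read", "view_index_metadata"],
            }
        ],
    }


role = read_only_role(["logs-*"])  # hypothetical index pattern
print(json.dumps(role, indent=2))
```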

TLS Encryption

If TLS encryption is disabled, Elasticsearch nodes and clients send all data in plain text. This data may include sensitive information such as passwords and other credentials. Encrypting network communication is therefore very important for preventing sniffing of in-flight data, man-in-the-middle attacks, tampering with data, and attempts to gain access to Elasticsearch nodes.

TLS encryption is also useful for preventing malicious hacker nodes from joining a cluster and getting access to data via replication. If TLS is enabled, Elasticsearch nodes must use certificates issued by a specified certificate authority (CA) to identify themselves when talking with other nodes. The node won’t be able to access the cluster if no valid certificate is provided.

As we’ve mentioned, Elasticsearch 6.8.0 made encrypted communication a part of a free Elasticsearch offering. Currently, Elasticsearch encrypted communications support the following features:

  • TLS on the transport layer by default and optionally TLS on the HTTP layer
  • The elasticsearch-certutil command for generating certificates
  • Strong encryption. It’s possible to use encryption with key lengths greater than 128 bits, such as 256-bit AES encryption.
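On the client side, TLS only helps if the client actually verifies the certificate presented by the cluster. Here is a minimal sketch using Python's standard ssl module; the helper name and the idea of passing your own CA file are assumptions, and the context would be handed to whichever HTTP client you use.

```python
import ssl
from typing import Optional


def strict_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a client context that verifies the server certificate.

    ca_file is the path to the CA certificate that signed the cluster's
    certificates; when omitted, the system trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


ctx = strict_tls_context()
print(ctx.verify_mode, ctx.check_hostname)
```

Avoid disabling certificate or hostname verification to work around certificate errors, since that removes the protection against man-in-the-middle attacks.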

We won’t go into more detail about configuring TLS certificates for your ES cluster because it’s a complex topic worthy of a separate post. You can find a detailed guide on configuring TLS in your ES cluster here.

Enabling IP Filtering

Elasticsearch supports IP filtering that can be applied to application clients, node clients, other nodes, and users attempting to connect to the cluster. The Elasticsearch access control feature can also be set up to reject domains and subnets. Both IPv4 and IPv6 addresses are supported.

ES admins can blacklist certain IPs to deny access to the cluster. In this case, the connection from the blacklisted IP is dropped immediately and no requests are processed.

You configure IP filtering by specifying the xpack.security.transport.filter.allow and xpack.security.transport.filter.deny settings in elasticsearch.yml.

Also, you can use the _all keyword to deny all connections that are not explicitly allowed:

xpack.security.transport.filter.allow: [ "192.168.0.1", "192.168.0.2", "192.168.0.3", "192.168.0.4" ]
xpack.security.transport.filter.deny: _all

In addition, if you are working in a highly dynamic environment where you don’t know the IPs before provisioning the cluster, you can use the cluster update settings API to configure IP filtering rules dynamically. For example:

curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent" : {
    "xpack.security.transport.filter.allow" : "172.16.0.0/24"
  }
}
'
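If you generate these rules from an inventory of hosts, it helps to validate the entries before sending them. The sketch below builds the body for the cluster settings request; the function name is hypothetical, and it accepts only literal IPs and CIDR ranges, not hostnames. The layer argument selects between the transport settings shown above and the corresponding xpack.security.http.filter settings.

```python
import ipaddress
import json


def ip_filter_settings(allowed: list, layer: str = "transport", deny_others: bool = True) -> dict:
    """Build a cluster settings body for IP filtering on one layer."""
    if layer not in ("transport", "http"):
        raise ValueError("layer must be 'transport' or 'http'")
    for entry in allowed:
        # Raises ValueError if the entry is not a valid IP or CIDR range.
        ipaddress.ip_network(entry, strict=False)
    prefix = f"xpack.security.{layer}.filter"
    settings = {f"{prefix}.allow": allowed}
    if deny_others:
        # Deny every connection that is not explicitly allowed.
        settings[f"{prefix}.deny"] = "_all"
    return {"persistent": settings}


body = ip_filter_settings(["172.16.0.0/24", "192.168.0.1"])
print(json.dumps(body, indent=2))
```

Double-check that the allow list covers every node in the cluster before applying a deny-all rule on the transport layer.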

Backing up Elasticsearch Data

Scheduling regular backups of Elasticsearch data is an essential component of a sound disaster recovery strategy. Administrators need to ensure that backups reflect the consistent state of the cluster and are not corrupt. Otherwise, backups will be useless. 

Elasticsearch has a built-in Snapshot and Restore module with which you can take snapshots of critical ES data and restore them. Elasticsearch built-in snapshots are application-consistent and storage-efficient. Application consistency guarantees that the snapshot reflects the actual state of the database at the time the snapshot is taken. This is done by recording all pending in-memory operations along with the on-disk data.

Also, Elasticsearch snapshots are optimized to save storage and disk I/O. They are taken incrementally, so each new snapshot stores only the data that is not already stored in earlier snapshots. This allows for fast and efficient snapshotting with minimal overhead.

The Snapshot and Restore module allows taking snapshots of specific indexes and data streams and storing them in local or remote repositories. Elasticsearch supports such remote repositories as Amazon S3, HDFS, Microsoft Azure, Google Cloud Storage, and others.
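As an illustration, the sketch below builds the two request bodies involved in a manual backup to a shared file system repository. The repository location and index names are assumptions; the location must also be listed under path.repo in elasticsearch.yml for the repository to be accepted.

```python
import json


def fs_repository_body(location: str) -> dict:
    """Body for PUT /_snapshot/<repository>: register a shared file system repository."""
    return {"type": "fs", "settings": {"location": location}}


def snapshot_body(indices: list) -> dict:
    """Body for PUT /_snapshot/<repository>/<snapshot>: snapshot selected indexes."""
    return {
        "indices": ",".join(indices),
        # Leave cluster-wide state out of this snapshot.
        "include_global_state": False,
    }


# Hypothetical location and index names.
repo = fs_repository_body("/mnt/es_backups")
snap = snapshot_body(["orders", "tickets"])
print(json.dumps(repo))
print(json.dumps(snap))
```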

Also, Elasticsearch supports snapshot lifecycle management to automatically take and manage snapshots.
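A snapshot lifecycle policy is itself a small JSON document. The sketch below builds one that would be sent with PUT to /_slm/policy/<policy_id>; the schedule, naming pattern, and retention values are example assumptions, not recommendations.

```python
import json


def nightly_slm_policy(repository: str) -> dict:
    """Return an SLM policy that snapshots all indexes once a night."""
    return {
        # Cron expression: every day at 01:30.
        "schedule": "0 30 1 * * ?",
        # Date math in the name gives each snapshot a dated name.
        "name": "<nightly-snap-{now/d}>",
        "repository": repository,
        "config": {"indices": ["*"], "include_global_state": False},
        # Delete snapshots older than 30 days, keeping between 5 and 50.
        "retention": {"expire_after": "30d", "min_count": 5, "max_count": 50},
    }


policy = nightly_slm_policy("my_backup_repo")  # hypothetical repository name
print(json.dumps(policy, indent=2))
```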

To learn more about using the Snapshot and Restore module to create backups of Elasticsearch data, please consult this article.

Get Built-in Security with Qbox-hosted ES Clusters

Qbox-hosted Elasticsearch clusters provide many of the security features discussed above by default. Our focus on security as a feature of our offering protected our customers from the 2017 ransom attacks and more recent hacks against publicly exposed Elasticsearch clusters. As a result, no Qbox users were affected by these incidents. Qbox security features go beyond basic protection against unauthorized access from the public web. Let’s discuss them in more detail.

Built-in User Authentication for Elasticsearch and Kibana 

All Qbox-hosted Elasticsearch clusters are set up with basic auth (username/password) upon provisioning. Our Elasticsearch installation scripts configure all the built-in users and provide auto-generated user credentials that can be changed at any time. This feature alone is enough to protect against simple attacks on publicly accessible ES clusters. For example, even if your cluster were identified by the “Meow” bot scanning the internet for Elasticsearch clusters, the data stored in it could not be accessed or modified without knowledge of your security credentials.

Default TLS/SSL Communication

Qbox hosted Elasticsearch clusters are deployed with TLS/SSL-enabled communication between nodes and clients. Under the hood, Qbox creates all certificates for ES nodes and configures them to use TLS/SSL encryption using these certificates. Qbox makes sure that only the nodes with the valid certificates can join the cluster. Built-in TLS/SSL encryption protects against network sniffing, spoofing, and malicious nodes joining the ES cluster. 

IP Filtering 

Qbox enables whitelisting for both HTTP and transport traffic so you can limit access to your clusters only to authorized IPs.

Snapshotting Schedule 

Qbox automates backups of ES clusters via a daily snapshot schedule and built-in snapshot lifecycle management. Snapshots are stored in highly available AWS S3 buckets and can be easily accessed by Qbox users.

In addition, Qbox users can ask our support personnel to perform a manual snapshot at any time between the daily snapshots if needed. ES snapshots can be easily restored to any running ES cluster, so you are not locked in to our service.

Run Elasticsearch on Kubernetes

Qbox runs Elasticsearch in containers deployed and managed in Kubernetes clusters on AWS. Running Elasticsearch in properly configured containers and pods that are optimized for performance and high availability provides a lot of benefits. First, containers allow you to save on storage and compute resources because they can be packed tightly on a single server (or virtual server instance). Second, using Kubernetes means that ES clusters can be seamlessly scaled and updated without manual intervention. Third, containers provide isolation that acts as an additional layer of protection against attacks originating from the public web.

Qbox-hosted Elasticsearch is automatically provisioned in optimized container images that run on AWS-based Kubernetes clusters configured using best practices, so you get all the benefits of containerized Elasticsearch out of the box.

Qbox manages much of the complexity involved in running ES on Kubernetes:

  • persistent volumes provisioning
  • automated backups
  • Kubernetes and Elasticsearch upgrades
  • cluster maintenance and monitoring

In sum, Qbox offers a seamless experience of running ES on Kubernetes, hiding the details so that, to users, it looks like a simple Elasticsearch cluster. Behind the scenes, running ES on Kubernetes allows significant savings on compute resources through the orchestration services provided by Kubernetes and configured by Qbox.

Give It a Whirl!

To get built-in security for your Elasticsearch clusters, consider using Qbox’s hosted Elasticsearch service. It’s stable and more affordable, and we offer top-notch free 24/7 support. Sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”