Choosing a size for nodes


The most important factor in choosing node size is the amount of RAM. In-memory data structures are key to Elasticsearch's speed, so users looking for optimal performance should allocate plenty of RAM to their nodes. Popular Elasticsearch features such as aggregations, sorting, and scripting consume large amounts of memory. Additionally, Lucene uses RAM as a disk cache.

To achieve the best query response times, a good guideline is to allocate an amount of RAM that is at least as large as the entire dataset. For example, a cluster with 14GB of data would perform best on nodes having at least 16GB of RAM per node. If the cluster has 3 nodes and 2 replicas, then the total size of the dataset would be 42GB (3 copies of 14GB), and the total RAM available would be at least 48GB. The more RAM, the better the query response times will be.
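The arithmetic behind this kind of estimate can be sketched in a few lines of Python. This is only an illustration of the guideline, not a Qbox tool: the function names are hypothetical, and the input figures in the usage lines are illustrative values.

```python
def total_dataset_gb(primary_gb, replicas):
    """Size on the cluster of the primary data plus every replica copy."""
    return primary_gb * (1 + replicas)


def total_ram_gb(nodes, ram_per_node_gb):
    """Combined RAM across all nodes in the cluster."""
    return nodes * ram_per_node_gb


def ram_covers_dataset(primary_gb, replicas, nodes, ram_per_node_gb):
    """True when total RAM is at least as large as the replicated dataset."""
    return total_ram_gb(nodes, ram_per_node_gb) >= total_dataset_gb(primary_gb, replicas)


# Illustrative inputs: 10GB of primary data, 2 replicas, 3 nodes of 16GB each.
dataset = total_dataset_gb(10, 2)
ram = total_ram_gb(3, 16)
fits = ram_covers_dataset(10, 2, 3, 16)
```

With these inputs the replicated dataset is three times the primary data, and the check passes because the combined RAM is larger than that total.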

NOTE: the above example assumes maximum replication on the cluster, meaning that number_of_replicas is set to [number of nodes - 1] for all indices. This setting is the default in Qbox, but is also entirely customizable through the Elasticsearch API. You can change the setting in the dashboard. For larger datasets, max replication is not always desirable—or feasible—since it drastically increases the physical size of the dataset.
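As a rough sketch of what customizing this through the Elasticsearch API can look like, the snippet below builds (but does not send) a request to the index settings endpoint, which accepts `number_of_replicas` as an index setting. The endpoint URL, the index name `my_index`, and the helper names are assumptions made for illustration; substitute the values for your own cluster.

```python
import json
from urllib.request import Request


def max_replication(node_count):
    """Replica count that places a copy of every shard on every node."""
    return max(node_count - 1, 0)


def replica_settings_request(base_url, index, replicas):
    """Build a PUT request for the index settings endpoint (not sent here)."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical endpoint and index name; a 3-node cluster gives 2 replicas.
req = replica_settings_request("http://localhost:9200", "my_index", max_replication(3))
# To apply the change, send it with urllib.request.urlopen(req)
# against a reachable cluster.
```

Lowering the replica count on a large index is the usual way to trade redundancy for a smaller physical footprint.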

Disk Space

The next important factor for node size is disk space. By default, Elasticsearch stores raw documents, indices, and cluster state on disk. On Qbox, all node sizes provide roughly a 20:1 ratio of disk space to RAM. This design ensures that users don't have to configure both RAM and disk space, since choosing a node size will automatically determine the disk space sizing.

You may wonder why there should be extra disk space if the recommendation is to allocate an amount of RAM comparable to the size of your dataset. Let's think about this a bit more. The example given above is an ideal setup for performance. The performance of Elasticsearch, in both speed and stability, is fully dependent on the availability of RAM. This does not mean, however, that the amount of data within a cluster can't exceed the amount of RAM.

Elasticsearch implements an eviction system for in-memory data, which frees up RAM to accommodate new data. So, yes, dataset size can exceed RAM. But remember that a shortage of RAM has a corresponding performance cost, since data must move in and out of RAM at query time (note that this does not refer to OS-level swap space).

For more price-sensitive users, a high-RAM configuration may be infeasible. A suitable application for a low-RAM allocation could be high-volume log collection with a low volume of searches. If slower or more variable query response times are not a concern, it may not be problematic for a dataset to grow to anywhere from 2 to 5 times the amount of RAM.

Generally, we strongly recommend that you consider moving up a node size when the dataset will exceed 5 times the amount of RAM (in other words, 25% of disk capacity on Qbox, given the 20:1 ratio of disk space to RAM). Elasticsearch can become unstable when datasets grow much beyond what the RAM can support.
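The thresholds discussed in this section can be summarized as a small check. This is a minimal sketch: the function name and the returned labels are hypothetical, while the cut-offs (dataset up to the size of RAM, up to 5 times RAM, and beyond) come from the guidance above.

```python
def ram_pressure(dataset_gb, ram_gb):
    """Classify a deployment by the ratio of dataset size to RAM."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    ratio = dataset_gb / ram_gb
    if ratio <= 1:
        # Dataset fits in RAM: the ideal setup for performance.
        return "ideal"
    if ratio <= 5:
        # Workable when slower or more variable responses are acceptable,
        # for example log collection with few searches.
        return "reduced performance"
    # Beyond 5x RAM: time to consider moving up a node size.
    return "move up a node size"


status = ram_pressure(dataset_gb=40, ram_gb=16)  # ratio of 2.5
```

A check like this could run against periodic measurements of index size so that a growing dataset is noticed before it passes the 5x mark.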

NOTE: all Qbox nodes have either SSD or multiple disks (striped RAID 0) for fast disk access. Price-sensitive users with lower-RAM deployments should consider enabling Doc Values to use the disk for fielddata. More information can be found here:

Elasticsearch Reference: Doc Values

Elasticsearch Reference: Fielddata
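As a hedged illustration of enabling Doc Values, the sketch below builds a mapping fragment with the `doc_values` mapping parameter set on two fields. The field names are hypothetical, and the surrounding mapping structure, as well as whether Doc Values are already on by default, depends on your Elasticsearch version, so check the references above before applying it.

```python
import json


def field_with_doc_values(field_type):
    """Field mapping that keeps its per-document values on disk."""
    return {"type": field_type, "doc_values": True}


# Hypothetical fields for a log-collection index.
mapping = {
    "properties": {
        "response_time_ms": field_with_doc_values("long"),
        "timestamp": field_with_doc_values("date"),
    }
}

# JSON body to include when creating the index or updating its mapping.
mapping_json = json.dumps(mapping, indent=2)
```

Fields used for sorting and aggregations are the usual candidates, since those are the operations that would otherwise load fielddata into RAM.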

Large Datasets

One final scenario to consider when selecting a node size is large datasets: those with footprints of several hundred GB, or even TBs. The largest node size that we recommend for Elasticsearch is one with 64GB of RAM. As you can read about here, the constraint actually comes from how the JVM works with heap space and pointers (above roughly 32GB of heap, the JVM can no longer use compressed object pointers).

However, this limitation does not imply that Qbox can't support large-scale datasets. It simply means that you need to allocate more nodes (and more primary shards, though not quite as many replicas) to fully accommodate the dataset. This is handled automatically when the cluster is provisioned. For instance, if you choose 256 GB of RAM with 2 replicas, a cluster of twelve 64 GB nodes will be provisioned.
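One way to arrive at the figure in that example is sketched below. The formula is inferred from the numbers given (256 GB of RAM, 2 replicas, 64 GB nodes) and is an assumption, not a description of Qbox's actual provisioning logic; the function name is hypothetical.

```python
import math

MAX_NODE_RAM_GB = 64  # largest recommended node size


def nodes_to_provision(total_ram_gb, replicas, node_ram_gb=MAX_NODE_RAM_GB):
    """Estimate node count: nodes needed per copy, times the number of copies."""
    nodes_per_copy = math.ceil(total_ram_gb / node_ram_gb)
    copies = 1 + replicas  # the primary plus each replica
    return nodes_per_copy * copies


node_count = nodes_to_provision(total_ram_gb=256, replicas=2)  # 4 nodes x 3 copies
```

Under this reading, 256 GB spread over 64 GB nodes takes four nodes per copy of the data, and three copies bring the total to twelve.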