From Day One, Qbox has offered a number of monitoring plugins on our Elasticsearch cluster dashboard.
However, Marvel also has the same limitation as the other monitoring plugins: it sends no notifications, so an engineer or tech must watch the metrics manually to catch an event. Qbox closes this gap, and establishes a lead-in to other upcoming features, with our new active cluster alerting feature.
We now have a system component that continuously checks clusters for potential problems, then creates an event and notifies users of issues. Continue reading to learn about the types of cluster alerts that we send.
At present, this new alerting feature supports only email recipients. Alert messages will be sent to the email address of the Qbox account administrator. We’re steadily working to extend this to multiple recipients and to offer other notification channels.
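To make the email channel concrete, here is a minimal sketch of how an alert message could be assembled with Python's standard `email.message.EmailMessage`. This is not Qbox's implementation: the function name `build_alert_email`, the sender address, and the subject format are all hypothetical and chosen only for illustration.

```python
from email.message import EmailMessage


def build_alert_email(admin_address, cluster_name, alert_type, detail):
    """Build (but do not send) an alert email for the account administrator."""
    msg = EmailMessage()
    # Hypothetical sender address, used only for this example.
    msg["From"] = "alerts@example.com"
    msg["To"] = admin_address
    # Assumed subject format: cluster name first so alerts are easy to filter.
    msg["Subject"] = f"[{cluster_name}] Cluster alert: {alert_type}"
    msg.set_content(detail)
    return msg
```

The returned message could then be handed to `smtplib.SMTP.send_message` or to any other mail transport.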
Alert types include:
- Disk capacity – Low remaining disk space is a critical condition, so alerts will be sent when remaining capacity approaches the system threshold.
- Unresponsive nodes – This alert will be sent when our system can’t communicate with a node. It will also trigger a more intensive response from our recovery systems.
- High heap usage – This indicates that the cluster is under-resourced (too little RAM) or that elaborate query executions are exceeding the capabilities of the cluster.
- Unassigned shards – You’ll get these alerts when one or more shards are not allocated to any node, which means that part of the dataset is missing or still attempting to recover.
- Too many shards – This is an indication that more nodes are necessary or that it’s time to scale back the number of shards per index. Too many shards on a node can crash the entire cluster.
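The five alert types above can be read as simple rules over per-node and cluster-wide metrics. The sketch below shows one way such rules might be evaluated. It is illustrative only and does not describe Qbox's actual checker: the `NodeStats` type, the `evaluate_alerts` function, and all three threshold values are assumptions. In practice, comparable metrics are exposed by Elasticsearch's cluster health and node stats APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeStats:
    """Hypothetical snapshot of one node's metrics."""
    name: str
    responsive: bool       # False if the node could not be reached
    disk_free_pct: float   # percent of disk space still available
    heap_used_pct: float   # percent of JVM heap in use
    shard_count: int       # shards currently allocated to this node


# Assumed thresholds, for illustration only; real limits depend on the cluster.
DISK_FREE_MIN_PCT = 15.0
HEAP_USED_MAX_PCT = 85.0
SHARDS_PER_NODE_MAX = 500


def evaluate_alerts(nodes: List[NodeStats], unassigned_shards: int) -> List[str]:
    """Return one 'alert_type:target' string per triggered rule."""
    alerts = []
    for node in nodes:
        if not node.responsive:
            alerts.append(f"unresponsive_node:{node.name}")
            continue  # stats from an unreachable node cannot be trusted
        if node.disk_free_pct < DISK_FREE_MIN_PCT:
            alerts.append(f"disk_capacity:{node.name}")
        if node.heap_used_pct > HEAP_USED_MAX_PCT:
            alerts.append(f"high_heap_usage:{node.name}")
        if node.shard_count > SHARDS_PER_NODE_MAX:
            alerts.append(f"too_many_shards:{node.name}")
    # Unassigned shards are a cluster-wide condition, not a per-node one.
    if unassigned_shards > 0:
        alerts.append("unassigned_shards:cluster")
    return alerts
```

Keeping the rules in a pure function like this separates the decision logic from metric collection and from notification delivery, so each part can be tested on its own.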
Qbox will soon release support for Slack and several other chat app integrations. Now, all of our customers can have much greater visibility and manage production issues with far shorter response times.