In this article, we continue the work from "Elasticsearch in Apache Spark with Python, Machine Learning Series, Part 2." We are building some basic data science tools, with the goal of running machine-learning classification algorithms against large data sets using Apache Spark and Elasticsearch clusters in the cloud.
When we asked our customers what we needed to do to make Qbox the most reliable, enterprise-ready solution, the #1 request was to make it available across multiple data centers so that Elasticsearch indices can be located close to existing data stores. This decreases network latency, enhances reliability, and makes indices more available globally. To that end, the Qbox team is happy to announce that we have just extended our leading enterprise-class Elasticsearch hosting to IBM’s SoftLayer Cloud.