In this tutorial, we'll use Lassie, a Python library for retrieving content from websites, to fetch information regarding a Qbox YouTube video as JSON. We'll then store that data in our Qbox Elasticsearch cluster using elasticsearch-py, Elasticsearch's official low-level Python client. We'll also use elasticsearch-py to query and return the record we indexed.

Although this example is minimal and the choice of a YouTube video to index is somewhat arbitrary, the concept it demonstrates has larger practical applications. For example, a company could build a vertical search engine that collects all the information about it found online. The user-friendliness of Lassie and Python lets a task like this be done in relatively few lines of code, with syntax that is easy to understand even for those new to programming.
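
As a rough sketch of the flow described above, the snippet below separates shaping the fetched metadata from talking to the cluster. It assumes that Lassie's `fetch` returns a dictionary with keys such as `title`, `description`, and `url`, and that the client is a recent elasticsearch-py release that accepts `document=` (older releases use `body=` instead). The index name `videos`, the helper names, and both URL parameters are hypothetical placeholders.

```python
def build_document(fetched):
    """Keep only the fields we want to index from a Lassie-style result dict."""
    return {
        "title": fetched.get("title"),
        "description": fetched.get("description"),
        "url": fetched.get("url"),
    }


def index_and_get(es, doc, index="videos", doc_id=1):
    """Index one document, then read the same record back by id."""
    # `document=` is an assumption about the client version; older clients use `body=`.
    es.index(index=index, id=doc_id, document=doc)
    return es.get(index=index, id=doc_id)["_source"]


def main(video_url, cluster_url):
    # Imported here so the helpers above can be used without these packages installed.
    import lassie
    from elasticsearch import Elasticsearch

    fetched = lassie.fetch(video_url)  # dict of metadata scraped from the page
    es = Elasticsearch(cluster_url)    # placeholder for your cluster endpoint
    print(index_and_get(es, build_document(fetched)))
```

Keeping `build_document` free of network calls makes it easy to check the shape of the record before anything is sent to the cluster.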


The penetration testing world is fast-moving and constantly demands new ideas, tools, and methods for solving problems and breaking things. In recent years, many people have gotten used to the idea of using Elasticsearch in the penetration testing workflow, most notably for hacking web applications.

More and more companies and websites are opening bug bounty programs. If you have new tools in your arsenal that other people don’t use or understand yet, you could make a great deal more money from bug bounty hunting. This tutorial teaches you how to use new tools with Elasticsearch to give you that competitive edge.


Elasticsearch is well suited to indexing data, particularly data that includes dates or timestamps, which is why it is a very good tool for indexing logs. As you progress on your journey with Elasticsearch, Logstash, and Kibana, you will sometimes find that you want to change the mapping of data you have already indexed. This can be done, although you will have to reindex the data.

Most client APIs have a reindex function, and reindexing data is easier than you might think. Let's look at an example of reindexing our data after changing the mapping, using the Python client API for Elasticsearch to do the reindexing for us.

Changing mappings can be a big headache if it causes downtime. The question is: does it have to cause downtime? When you decide to make mapping changes, you will have to reindex your data. 
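
One common way to avoid downtime, which the excerpt itself does not spell out, is to have applications read through an alias, build a new index with the changed mapping, copy the data across, and then repoint the alias in a single request. The sketch below assumes that setup. It uses `helpers.reindex` from elasticsearch-py, which copies documents but not mappings, so the new index is created first. The function names, the alias, and the index names are hypothetical, and the `mappings=` and `actions=` keyword arguments assume a recent client release.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Build the actions that repoint `alias` from old_index to new_index in one request."""
    return [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]


def reindex_with_new_mapping(es, alias, old_index, new_index, mappings):
    """Create new_index with the changed mapping, copy the data, then swap the alias."""
    # Imported lazily so alias_swap_actions can be used without the client installed.
    from elasticsearch import helpers

    es.indices.create(index=new_index, mappings=mappings)
    # Scrolls through old_index and bulk-indexes every document into new_index.
    helpers.reindex(es, old_index, new_index)
    # Both alias actions are sent together, so readers never see a missing alias.
    es.indices.update_aliases(actions=alias_swap_actions(alias, old_index, new_index))
```

Documents written to the old index while the copy is running are not picked up by this sketch; handling them would need an extra step such as pausing writes or running a second pass.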


This guide is about using the Elasticsearch Python client to do useful things with Elasticsearch. The Python client makes use of the Elasticsearch REST interface.

Let's start by installing some dependencies:


In the previous posts in this series, we created a basic Django app and populated a database with automatically generated data. We also added data to the Elasticsearch index in bulk, wrote a basic command, and added a mapping to the Elasticsearch index. In this final article, we will add functional frontend items, write queries, allow the index to update, and discuss a bonus tip.


In the previous posts, we created a basic Django app and populated a database with automatically generated data. In this post, we will add data to the Elasticsearch index in bulk, write a basic command, and add a mapping to the Elasticsearch index.
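
To give a feel for the bulk step, here is a minimal sketch that turns database rows into the action dictionaries expected by `helpers.bulk` in elasticsearch-py. It assumes each row is a plain dict with an `id` key (in Django, something like the output of a queryset's `values()` call would fit). The index name `students` and the function names are hypothetical and not taken from the series.

```python
def bulk_actions(rows, index="students"):
    """Yield one bulk action per row; each row is a dict that includes an 'id' key."""
    for row in rows:
        # Everything except the primary key becomes the document body.
        source = {key: value for key, value in row.items() if key != "id"}
        yield {"_index": index, "_id": row["id"], "_source": source}


def push(es, rows):
    """Send all rows to Elasticsearch in batches using the bulk helper."""
    # Imported lazily so bulk_actions can be used without the client installed.
    from elasticsearch import helpers

    return helpers.bulk(es, bulk_actions(rows))
```

Because `bulk_actions` is a generator, rows are converted as the helper consumes them rather than being built up in memory first.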
