How To Use Elasticsearch With Python and Django (Part 1)
Posted by Alex Alex March 24, 2016
Django, thanks to its “included batteries” and wide ecosystem of packages, is a sound solution for building a web service. Also, since it’s written in Python, you get access to a wide variety of scientific packages (machine learning was there even before it became so trendy) and the other goodies that come with such a popular language.
If you go to the Django packages for search page you’ll encounter many packages. If we don’t filter out database-backed apps, the most popular is Haystack, as it supports lots of backends (Elasticsearch, Solr, Whoosh, Xapian…) and is quite Django-like. However, we won’t use it, because it is difficult for bleeding-edge cases, for custom uses, and is generally much more difficult to debug than your own code. This is due to all of the classes and abstractions of Haystack.
Instead, let us make a Django app with Elasticsearch integrated.
For this post, we will be using hosted Elasticsearch on Qbox.io. You can sign up or launch your cluster here, or click “Get Started” in the header navigation. If you need help setting up, refer to “Provisioning a Qbox Elasticsearch Cluster.”
Plan:
- Create a basic Django application.
- Populate the database so that we have something to work with.
- Add data to the Elasticsearch index in bulk.
- Add some frontend and write some queries.
- Make the index updatable when new data is added, updated or deleted.
Making A Basic Django App
Requirements: Python and virtualenv installed. Also, an empty directory to work in.
First, download the documents needed for this tutorial. Now, initialize an environment and install Django:
$ virtualenv .env
$ source .env/bin/activate
(.env)$ pip install "Django >= 1.9, < 1.10"
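The quoted specifier in the last command accepts any 1.9.x release and rejects 1.10 and later. The following is a minimal sketch of that rule, assuming plain dotted numeric versions only; pip's real matching follows PEP 440 and also covers pre-releases and other cases that this sketch ignores. The function names `parse_version` and `satisfies_pin` are hypothetical and are not part of pip or Django.

```python
def parse_version(text):
    """Turn a dotted version such as '1.9.3' into a tuple (1, 9, 3).

    Comparing tuples of integers avoids the string-comparison trap
    where '1.10' would sort before '1.9'.
    """
    return tuple(int(part) for part in text.split("."))


def satisfies_pin(version, lower="1.9", upper="1.10"):
    """Return True when lower <= version < upper.

    The defaults mirror the specifier "Django >= 1.9, < 1.10".
    """
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    for candidate in ("1.8.19", "1.9", "1.9.3", "1.10"):
        print(candidate, satisfies_pin(candidate))
```

Run as a script, this prints one line per candidate version, showing which of them the range would accept.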
At the moment, Django’s most recent version is 1.9. We’ve pinned the version explicitly so that, even if you’re reading this after August 2016, you will still be able to install a compatible version and no backward-compatibility issues will get in your way. You can check that Django is installed with the following command:
(.env)$ pip freeze | grep Django
You will see something like Django==1.9.3. Next, create a project from a template with the following commands:
(.env)$ django-admin startproject project --template=https://github.com/ambivalento/django-skeleton/archive/master.zip
(.env)$ mkdir log
Now you’ll have the following directory structure:
(.env)$ tree project
project/
├── apps
│   ├── core
│   │   ├── apps.py
│   │   ├── __init__.py
│   │   ├── models.py
│   │   ├── templates
│   │   │   └── index.html
│   │   ├── urls.py
│   │   └── views.py
│   └── __init__.py
├── conf
│   ├── base.py
│   ├── __init__.py
│   └── local.example.py
├── __init__.py
├── manage.py
├── static
│   └── js
│       └── app.js
├── templates
│   └── base.html
├── urls.py
└── wsgi.py
Here is a quick description of the structure:
- apps: a folder to store Django apps.
- apps/core: a folder with the “core” app.
- conf: a settings folder.
- templates: a folder to store HTML templates.
- static: a folder to store static files (JavaScript, pictures, CSS, etc.).
- manage.py: a file to run various Django commands (we will make some).
- urls.py: the URL router, which maps request URLs to the corresponding views.
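If the template download went wrong, some of these files may be missing. As a quick sanity check, the sketch below reports which of the expected paths are absent. It uses only the standard library; the helper name `missing_paths` and the choice of which files to check are assumptions made for illustration, not part of the skeleton template.

```python
from pathlib import Path

# A subset of the files shown in the tree listing above, relative to project/.
EXPECTED = [
    "apps/core/models.py",
    "apps/core/urls.py",
    "apps/core/views.py",
    "conf/base.py",
    "manage.py",
    "static/js/app.js",
    "templates/base.html",
    "urls.py",
    "wsgi.py",
]


def missing_paths(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths("project")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected files are present.")
```

Run it from the directory that contains project/. An empty result means every listed file was found.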
Now, let us edit apps/core/models.py:
from django.db import models
from django.core.validators import MinValueValidator, MaxValueValidator


class University(models.Model):
    name = models.CharField(max_length=255, unique=True)


class Course(models.Model):
    name = models.CharField(max_length=255, unique=True)


class Student(models.Model):
    YEAR_IN_SCHOOL_CHOICES = (
        ('FR', 'Freshman'),
        ('SO', 'Sophomore'),
        ('JR', 'Junior'),
        ('SR', 'Senior'),
    )
    # Note: choices are only checked during validation (forms, full_clean()),
    # so Student.objects.create() will save a record with an invalid choice.
    year_in_school = models.CharField(
        max_length=2, choices=YEAR_IN_SCHOOL_CHOICES)
    age = models.SmallIntegerField(
        validators=[MinValueValidator(1), MaxValueValidator(100)]
    )
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    # Relationships to the other models
    university = models.ForeignKey(University, null=True, blank=True)
    # null has no effect on ManyToManyField, so only blank=True is set
    courses = models.ManyToManyField(Course, blank=True)
Here, we’ve created three models, each of which maps to a database table. Student is the main model; its attributes are actual columns in the table: year_in_school, age, first_name, last_name.
Also, every student studies at a single University and can take multiple courses. You can see the quick schema:
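As a rough textual view of those relationships, the sketch below rebuilds approximately equivalent tables with Python's built-in sqlite3 module. It is an illustration under assumptions, not the schema Django actually generates: the table names (university, course, student, student_courses), the helper names and the sample rows are all hypothetical. The point is only that the foreign key becomes a nullable column on the student table, while the many-to-many relationship becomes a separate join table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE university (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE student (
    id INTEGER PRIMARY KEY,
    year_in_school TEXT NOT NULL,
    age INTEGER NOT NULL,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    -- nullable column, like ForeignKey(University, null=True)
    university_id INTEGER REFERENCES university(id)
);
-- join table standing in for the ManyToManyField to Course
CREATE TABLE student_courses (
    student_id INTEGER NOT NULL REFERENCES student(id),
    course_id INTEGER NOT NULL REFERENCES course(id),
    UNIQUE (student_id, course_id)
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up student on two courses."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO university (id, name) VALUES (1, 'Example University')")
    conn.executemany(
        "INSERT INTO course (id, name) VALUES (?, ?)",
        [(1, "Algebra"), (2, "Physics")],
    )
    conn.execute(
        "INSERT INTO student (id, year_in_school, age, first_name, last_name, university_id) "
        "VALUES (1, 'FR', 18, 'Ada', 'Lovelace', 1)"
    )
    conn.executemany(
        "INSERT INTO student_courses (student_id, course_id) VALUES (?, ?)",
        [(1, 1), (1, 2)],
    )
    return conn


def courses_for(conn, student_id):
    """Return the course names for one student, resolved through the join table."""
    rows = conn.execute(
        "SELECT course.name FROM course "
        "JOIN student_courses ON student_courses.course_id = course.id "
        "WHERE student_courses.student_id = ? ORDER BY course.name",
        (student_id,),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(courses_for(build_demo_db(), 1))
```

In the Django app itself you never write this SQL by hand; the ORM produces the tables from the model definitions when the migrations are applied.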
Now, create the migrations that define the database tables and apply them:
(.env)$ python project/manage.py makemigrations core
(.env)$ mkdir project/db
(.env)$ python project/manage.py migrate
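To confirm that the migrate step really created tables, you can list them straight from the database file. The sketch below assumes the skeleton's settings point at a SQLite file somewhere under project/db; the article does not state this, so check conf/base.py for the actual engine and path. The helper name `list_tables` is hypothetical.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the sorted names of all tables in the SQLite file at db_path."""
    # sqlite3.connect() would silently create an empty file for a wrong path,
    # so fail early instead.
    if not os.path.exists(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        return [name for (name,) in rows]
    finally:
        conn.close()
```

Call `list_tables` with the path to the database file from your settings. By Django's default naming convention (app label, underscore, lowercased model name), the tables for the core app would typically be named along the lines of core_student, core_university and core_course, alongside Django's own tables.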
The makemigrations command creates a new file: project/apps/core/migrations/0001_initial.py. It will allow you to recreate the database state at any time in the future. As a quick way to add or view data, use Django’s admin.
To be able to create, edit, delete and view models in the admin, you have to register them there. To do this, add these lines to apps/core/admin.py (create the file if it does not exist yet):
from django.contrib import admin

from .models import University, Course, Student

admin.site.register(University)
admin.site.register(Course)
admin.site.register(Student)
Now, run these commands to create a superuser and start a development server:
python project/manage.py createsuperuser
python project/manage.py runserver
Now you’re able to visit admin pages at http://127.0.0.1:8000/admin/.
Also, for a better command-line interface, I install django-extensions and ipython. You can install them with:
pip install https://github.com/django-extensions/django-extensions/archive/master.zip ipython
By default, the pip version of django-extensions doesn’t play nicely with Django 1.9, hence the install from the master branch archive. You can see the state of the app after this stage at commit <46e2c4d>.
Conclusion
In this series, we are creating a Django app with Elasticsearch-based search integrated. This article has focused on the creation of a basic Django app. In the next article of this series, we will populate the database so that we may work with something.