
We’re dusting off an old practice of highlighting important and cool projects from the Elasticsearch open source community. Today we are calling attention to the Legal Synonyms Project from the U.S. Open Data Institute (USODI). Headed up by Casey Kuhlman, USODI’s legal process management specialist based in The Hague, the project seeks to establish a corpus for associating legal literature based on context.

Not surprisingly, the existing work in the field either fell into the locked-down proprietary category or came from an academic linguistic framework, a field more complex than Kuhlman (a lawyer and software engineer) was prepared to grapple with.

It’s easy to see the implications of this. Legal research, even in the age of search engines and legal databases, is extraordinarily time-consuming and leads to many dead ends. The time spent pursuing these dead ends, of course, still winds up on the client’s legal bills, constituting a very meaningful percentage of overall legal spending.

This project is in its beginning stages and is far from complete. Synonym files are plain .txt files, and this one can be used with both Solr and Elasticsearch. It is licensed under the permissive MIT License, and the GitHub repository is here.
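For readers who want to try a synonym file like this in Elasticsearch, here is a minimal sketch of the index settings that would wire it into a custom analyzer. It uses the standard `synonym` token filter with its `synonyms_path` option. The file path and the filter and analyzer names (`legal_synonyms`, `legal_text`) are our own placeholders, not names taken from the project, and the file is assumed to have been copied into the Elasticsearch config directory.

```python
import json

# Hypothetical location of the synonym file, relative to the
# Elasticsearch config directory. Adjust to wherever you copy it.
SYNONYMS_PATH = "analysis/legal_synonyms.txt"


def build_index_settings(synonyms_path):
    """Return an index-creation body that applies a synonym file at analysis time."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "legal_synonyms" is an illustrative name for the filter.
                    "legal_synonyms": {
                        "type": "synonym",
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    # "legal_text" is an illustrative name for the analyzer.
                    "legal_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase first so tokens match lowercase synonym entries.
                        "filter": ["lowercase", "legal_synonyms"],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body you would send when creating the index.
    print(json.dumps(build_index_settings(SYNONYMS_PATH), indent=2))
```

The printed JSON can be sent as the body of an index-creation request, after which any text field mapped to the `legal_text` analyzer would have the synonyms applied. Whether lowercasing before the synonym filter is appropriate depends on how the entries in the file are cased, so treat that ordering as an assumption to check.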