Arches and Elasticsearch

Installing and Running Elasticsearch

The quickest way to install Elasticsearch is to download and unzip one of the compressed installers for the proper version.

To start the service from the command line, use

{path-to-elasticsearch}/bin/elasticsearch

To run as a daemon process, add -d to that command.

On Windows, you can double-click the {path-to-elasticsearch}\bin\elasticsearch.bat batch file to run the process in a new console. To properly set up Elasticsearch as a background service on Windows, check out the Elasticsearch documentation.

To make sure Elasticsearch is running correctly, use

curl localhost:9200

You should get a JSON response that includes “You Know, For Search…”. You can also use the Chrome plugin ElasticSearch Head to view your instance in a browser at localhost:9200.
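The same check can be scripted. The sketch below is a rough Python equivalent of the curl call, using only the standard library. The function names and default URL are illustrative, and it assumes an unsecured instance listening on localhost:9200; the tagline is compared case-insensitively because its exact capitalization may differ between versions.

```python
import json
import urllib.request


def parse_tagline(body):
    """Return the 'tagline' field from the JSON body of the root endpoint."""
    return json.loads(body).get("tagline")


def elasticsearch_is_up(url="http://localhost:9200", timeout=5):
    """Return True if the root endpoint answers with the expected tagline."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            tagline = parse_tagline(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return False
    return tagline is not None and "you know, for search" in tagline.lower()
```

Calling elasticsearch_is_up() before a bulk load or reindex is a cheap way to fail early when the service is not running.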

For more information, please visit the official Elasticsearch documentation.


  1. By default, Elasticsearch uses 2GB of memory (RAM). For basic development purposes, we have found it to run well enough on 1GB. To set the memory allotment on startup, use ES_JAVA_OPTS="-Xms1g -Xmx1g" ./bin/elasticsearch -d. The same command, with larger values, gives Elasticsearch more memory in a production setting.
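If Elasticsearch is started from a script rather than a shell, the same heap setting can be passed through the process environment. This is a minimal sketch; heap_env and start_elasticsearch are hypothetical helper names, and es_home stands for wherever Elasticsearch was unzipped.

```python
import os
import subprocess


def heap_env(heap="1g", base_env=None):
    """Return a copy of the environment with ES_JAVA_OPTS set to a fixed heap size."""
    env = dict(os.environ if base_env is None else base_env)
    # Setting -Xms and -Xmx to the same value pins the heap at that size.
    env["ES_JAVA_OPTS"] = f"-Xms{heap} -Xmx{heap}"
    return env


def start_elasticsearch(es_home, heap="1g"):
    """Start Elasticsearch as a daemon from es_home with the given heap size."""
    binary = os.path.join(es_home, "bin", "elasticsearch")
    return subprocess.run([binary, "-d"], env=heap_env(heap), check=True)
```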

Reindexing The Database

You may need to reindex the entire database now and then. This can be helpful if a bulk load failed halfway through, or if you need to point your database at a different Elasticsearch installation. In the command line run:

python manage.py es index_database

Be warned that this process can take a long time if you have a lot of resources in your database. Also, if you are in DEBUG mode it can cause your server to run out of memory (see reindex the database).

If the es index_database operation doesn’t solve your issue, you can try this series of commands:

python manage.py es delete_indexes
python manage.py es setup_indexes
python manage.py es index_database
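The three operations only make sense in that order, so a small wrapper that stops at the first failure can be convenient. The sketch below assumes the commands are run through the project's manage.py from the project directory; the helper names and the project_dir parameter are illustrative.

```python
import subprocess
import sys

# Operations in the order given above; each must succeed before the next runs.
REBUILD_OPERATIONS = ["delete_indexes", "setup_indexes", "index_database"]


def rebuild_commands(python=sys.executable, manage_py="manage.py"):
    """Build one argv list per operation (the manage.py location is an assumption)."""
    return [[python, manage_py, "es", op] for op in REBUILD_OPERATIONS]


def rebuild_indexes(project_dir="."):
    """Run the three operations in sequence, stopping at the first failure."""
    for argv in rebuild_commands():
        # check=True raises CalledProcessError if a command exits non-zero.
        subprocess.run(argv, cwd=project_dir, check=True)
```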

Using Multiple Nodes

In production it’s advisable to have multiple Elasticsearch instances working together as nodes of a single cluster. To do this, you need to install a second Elasticsearch instance, and change the config/elasticsearch.yml file in each instance. Note that the cluster and node names can be whatever you want, as long as cluster.name is the same in both instances and node.name is unique to each one.

Master (Original) Node Config

http.port: 9200
cluster.name: arches-app
node.name: arches-app-node1
node.master: true
node.data: true

Secondary Node Config

http.port: 9201
cluster.name: arches-app
node.name: arches-app-node2
node.master: false
node.data: true

Leave all other parameters untouched.

You’ll need to start/stop each of these instances individually, but you should always have both running. When they are, the secondary node will automatically find the master node and the indices will be replicated between the two.

Nothing about your project’s settings should change; Arches need only connect to the original Elasticsearch instance as before. However, you’ll now see in the console output that the cluster health is [GREEN] when you have two nodes running (it’s [YELLOW] if you only have one).
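Cluster health can also be read programmatically from Elasticsearch's _cluster/health endpoint, which reports a status of green, yellow, or red along with the number of nodes. The following is a minimal sketch using the standard library; the function names are illustrative, and it assumes the original node is reachable without authentication on port 9200.

```python
import json
import urllib.request


def summarize_health(body):
    """Extract (status, number_of_nodes) from a _cluster/health JSON response."""
    health = json.loads(body)
    return health["status"], health["number_of_nodes"]


def cluster_health(base_url="http://localhost:9200", timeout=5):
    """Query the cluster health endpoint of the instance at base_url."""
    url = f"{base_url}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return summarize_health(response.read().decode("utf-8"))
```

With both nodes running, cluster_health() would be expected to report a green status and two nodes.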

See also

Here’s some background and a Stack Overflow question with instructions for adding a node.

Adding a Custom Index

Arches allows you to create a custom index of resource data for your specific use case (for use in Kibana, for example).

To add a new custom index, create a new Python module and add to it a class that inherits from BaseIndex and implements the prepare_index and get_documents_to_index methods.

Example custom index:

from arches.app.search.base_index import BaseIndex

class SampleIndex(BaseIndex):
    def prepare_index(self):
        # Define the mapping before the index is created; both fields are exact-match keywords.
        properties = {"tile_count": {"type": "keyword"}, "graph_id": {"type": "keyword"}}
        self.index_metadata = {"mappings": {"_doc": {"properties": properties}}}
        super(SampleIndex, self).prepare_index()

    def get_documents_to_index(self, resourceinstance, tiles):
        # Return a (document, document id) tuple for one resource instance.
        return ({"tile_count": len(tiles), "graph_id": resourceinstance.graph_id}, str(resourceinstance.resourceinstanceid))
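To make the return value of get_documents_to_index concrete, the sketch below reproduces the same logic outside of Arches. The resource and tile objects are hypothetical stand-ins built for illustration, not real Arches models.

```python
from types import SimpleNamespace


def sample_document(resourceinstance, tiles):
    """Same logic as SampleIndex.get_documents_to_index, without Arches."""
    document = {"tile_count": len(tiles), "graph_id": resourceinstance.graph_id}
    return document, str(resourceinstance.resourceinstanceid)


# Hypothetical stand-ins for a resource instance and its two tiles.
resource = SimpleNamespace(graph_id="graph-1", resourceinstanceid="resource-1")
document, doc_id = sample_document(resource, tiles=[object(), object()])
print(document, doc_id)
```

The first element is the document body that gets indexed, and the second is the string used as its document id.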

Add this to your settings.py file:

    ELASTICSEARCH_CUSTOM_INDEXES = [{
        'module': '{path to file with SampleIndex class}.SampleIndex',
        # Follow the Elasticsearch index naming rules; this name is used to register the index in Elasticsearch.
        'name': 'my_new_custom_index'
    }]
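The index naming rules mentioned above can be checked before registering. The sketch below encodes the restrictions listed in the Elasticsearch documentation (lowercase only, no reserved characters, no leading -, _ or +, not . or .., at most 255 bytes). The function name is illustrative, and it does not account for any prefix Arches may add to the name.

```python
# Characters that Elasticsearch does not allow in index names.
FORBIDDEN_CHARS = set('\\/*?"<>| ,#:')


def is_valid_index_name(name):
    """Check a candidate name against the Elasticsearch index naming restrictions."""
    if not name or name in (".", ".."):
        return False
    if name != name.lower():
        return False  # index names must be lowercase
    if name[0] in "-_+":
        return False  # must not start with -, _ or +
    if any(ch in FORBIDDEN_CHARS for ch in name):
        return False
    return len(name.encode("utf-8")) <= 255
```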

Register your index in Elasticsearch: see register a custom index