June 3, 2013

Installing & Configuring an ElasticSearch Cluster in CentOS 6

We recently needed to install & configure an ElasticSearch cluster across a number of servers, for use within a production project. ElasticSearch (www.elasticsearch.org) is a "flexible and powerful open source, distributed real-time search and analytics engine for the cloud". It is a very simple server to deploy within a cluster, as the majority of configuration is undertaken automatically by the server itself. Indeed, we once installed it onto two machines within the same network, only for them to join together automatically without our knowledge; we only discovered this later.

This post describes the installation & configuration of a basic 2-node cluster built on top of CentOS 6. It assumes that internet access is available from the OS.

Configuration of ElasticSearch (per node)

ElasticSearch requires Java in order to run, so let's install the runtime edition:

yum -y install java-1.7.0-openjdk

Next, download & unpack ElasticSearch (I like to run services from /etc/, so note the move from /home into /etc/elasticsearch):

cd /home
curl https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.1.tar.gz | tar xvz
mkdir /etc/elasticsearch
mv /home/elasticsearch-0.90.1/* /etc/elasticsearch
rm -rf /home/elasticsearch-0.90.1
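
Before going further, it can be worth confirming that the move left everything where the later steps expect it. The following is only a minimal sketch: check_es_home is a hypothetical helper name (it is not part of ElasticSearch), and the three files it looks for are simply the ones this post relies on.

```shell
# Hypothetical helper: confirm the unpacked tree contains the files used later in this post.
# Argument: the ElasticSearch directory (defaults to /etc/elasticsearch).
check_es_home() {
  es_home="${1:-/etc/elasticsearch}"
  for f in bin/elasticsearch bin/plugin config/elasticsearch.yml; do
    if [ ! -f "$es_home/$f" ]; then
      # Report the first missing file on stderr and stop
      echo "missing: $es_home/$f" >&2
      return 1
    fi
  done
  echo "ok: $es_home"
}
```

Running check_es_home with no arguments checks /etc/elasticsearch; a non-zero exit status means one of the files was not found.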

Now that we have it in our target directory, it can either be run on demand, or installed with a Java Service Wrapper (which is my preference, as it will ensure that it runs automatically through restarts). We need to download a ZIP file from GitHub (https://github.com/elasticsearch/elasticsearch-servicewrapper), so as a prerequisite, we need to install 'unzip'.

yum -y install unzip

Now to install the 'service' for ElasticSearch:

cd /home
curl -L https://github.com/elasticsearch/elasticsearch-servicewrapper/archive/master.zip > master.zip
unzip master.zip
mv elasticsearch-servicewrapper-master/service /etc/elasticsearch/bin
rm -rf elasticsearch-servicewrapper-master master.zip
/etc/elasticsearch/bin/service/elasticsearch install

If we want to remove the service at any point:

/etc/elasticsearch/bin/service/elasticsearch remove

Finally, set the service to start on reboot, and start the ElasticSearch service:

chkconfig elasticsearch on
service elasticsearch start

You can see the current status of the ElasticSearch service by typing service elasticsearch status, and if all is well, you should receive something similar to this:

ElasticSearch is running: PID:24374, Wrapper:STARTED, Java:STARTED
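
The wrapper reporting STARTED does not necessarily mean the node is already answering HTTP requests, as that may take a few more seconds. A rough sketch of a polling loop is below; wait_for_es is a hypothetical helper name, and the URL assumes the default HTTP port of 9200 on the local machine.

```shell
# Hypothetical helper: poll an ElasticSearch HTTP endpoint until it responds, or give up.
# Arguments: URL (default http://localhost:9200), attempts (default 30), delay in seconds (default 1).
wait_for_es() {
  url="${1:-http://localhost:9200}"
  attempts="${2:-30}"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s hides the progress meter, -f makes curl return non-zero on HTTP errors
    if curl -sf "$url" > /dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # No successful response within the allowed attempts
  return 1
}
```

For example, wait_for_es http://localhost:9200 10 2 would try up to ten times, two seconds apart, and return 0 as soon as the node responds.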

There is a front-end that I like called elasticsearch-head, which can be used to perform various maintenance tasks against the ElasticSearch server. To install (optional):

/etc/elasticsearch/bin/plugin -install mobz/elasticsearch-head

Once installed, browse to http://[address_of_elasticsearch_node]:9200/_plugin/head/.

Now repeat all of these steps on the second server of the cluster (or 'clone' if you're using virtualisation, and change hostnames accordingly). Once done, you'll notice that both machines are visible through the elasticsearch-head configuration screen.

Cluster Configuration

Once the two nodes are working individually, we need to give each node its own name and set a common name for the cluster. If we don't set a cluster name, all machines will use the default cluster name of 'elasticsearch' and work in unison as a single cluster (this may be desirable, but it is always good practice to rename the cluster from its default).

On each node, edit /etc/elasticsearch/config/elasticsearch.yml, and look for both cluster.name and node.name (they will probably be commented out, so uncomment them). Change these values as you wish, and then restart the ElasticSearch service.
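
If you would rather script this edit than open the file by hand, something along these lines might do. It is a sketch only: set_es_names is a hypothetical helper name, it assumes the cluster.name and node.name lines already exist in the file (commented out or not), and it assumes the names you pass contain no '/' or '&' characters, which would confuse sed.

```shell
# Hypothetical helper: set cluster.name and node.name in an elasticsearch.yml.
# Arguments: path to the config file, cluster name, node name.
# Assumes both settings already appear in the file, with or without a leading '#'.
set_es_names() {
  conf="$1"
  cluster="$2"
  node="$3"
  # Write to a temporary file first, then replace the original
  sed -e "s/^#* *cluster\.name:.*/cluster.name: $cluster/" \
      -e "s/^#* *node\.name:.*/node.name: \"$node\"/" \
      "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

For example, set_es_names /etc/elasticsearch/config/elasticsearch.yml demo-cluster node-1 on the first server, and the same command with node-2 on the second (the names here are placeholders), followed by service elasticsearch restart.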

Now when you check the elasticsearch-head configuration screen, you'll note that both servers are running as the same cluster.
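
The same check can be made from the command line using the cluster health endpoint, which reports how many nodes have joined. The snippet below is a rough sketch: node_count is a hypothetical helper name, and it pulls the number_of_nodes field out of the JSON with sed rather than a proper JSON parser, which is good enough for a quick check.

```shell
# Hypothetical helper: read cluster-health JSON on stdin and print the number_of_nodes value.
node_count() {
  # Strip spaces and newlines so pretty-printed and compact JSON look the same
  tr -d ' \n' | sed -n 's/.*"number_of_nodes":\([0-9][0-9]*\).*/\1/p'
}

# Usage against a running node (address assumed to be localhost):
#   curl -s 'http://localhost:9200/_cluster/health' | node_count
```

On a working 2-node cluster this should print 2 from either server; if it prints 1, the nodes have most likely not found each other, or are using different cluster names.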