Configuring Solr on Tomcat7 in CentOS6

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.

This guide covers installation, configuration and a number of useful tips for working with Solr (v4.4.0) under Tomcat (v7) on CentOS (v6).

Prerequisites

Solr is a Java servlet and can be run from the command line very simply, but in this installation I want it running as a service. Enter Apache Tomcat, a Java Servlet and JavaServer Pages container. We will configure Tomcat first, and then run Solr within it.

First we need to install the Java Development Kit (JDK), as Tomcat relies on it. Download the latest JDK RPM from the Oracle website; we cannot link to it directly because the download requires accepting Oracle's terms and conditions. The commands below use jdk-7u25-linux-x64.rpm, so adjust the file name to match your download. Once downloaded, upload the file to /home using SFTP. After installing the RPM, the 'java -version' command confirms that the JDK was installed successfully.

cd /home
rpm -Uvh jdk-7u25-linux-x64.rpm
java -version
rm -f jdk-7u25-linux-x64.rpm
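
If you want to check the JDK from a script rather than by eye, note that 'java -version' writes its report to standard error, so it has to be redirected before it can be captured. The following is a minimal sketch, assuming the first line of the report looks like: java version "1.7.0_25". The function name extract_java_version is hypothetical and not part of any tool.

```shell
# Hypothetical helper: pull the quoted version string out of the first
# line of the text printed by `java -version`.
# Expects the whole report as its first argument.
extract_java_version() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Usage on the server (java -version writes to stderr, hence 2>&1):
#   extract_java_version "$(java -version 2>&1)"
```

If the first line does not match the assumed format, the function prints nothing, which a calling script can treat as "JDK not found or unrecognised".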

Next we need to install Tomcat. The useradd command creates a dedicated system user for Tomcat to run under; we don't want to run it as 'root', since any flaw in Tomcat or in a deployed web application would then have full control of the server. Once installed, you can test the installation by browsing to http://localhost:8080/ and viewing the Tomcat server homepage.

cd /home
useradd -r tomcat
curl -L http://apache.mirror.anlx.net/tomcat/tomcat-7/v7.0.42/bin/apache-tomcat-7.0.42.tar.gz | tar xz
mkdir -p /etc/tomcat
cd apache-tomcat-7.0.42
mv * /etc/tomcat
chown -R tomcat:tomcat /etc/tomcat
curl -L https://bitbucket.org/ptylr/public-stuff/raw/26e2f7ef6fd397c812660885e4e52a429084e893/init.d/tomcat > /etc/init.d/tomcat
chmod 755 /etc/init.d/tomcat
chkconfig --add tomcat
chkconfig tomcat on
service tomcat start
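
On a server without a browser, the same check can be made with curl. Tomcat can take a few seconds to start listening after 'service tomcat start' returns, so the sketch below retries for a while before giving up. The function name wait_for_http, the default of 30 attempts and the one-second delay are all assumptions chosen for illustration.

```shell
# Hypothetical helper: poll a URL until it answers with a successful
# HTTP status, or give up after a number of attempts.
#   $1 = URL, $2 = max attempts (default 30), $3 = delay in seconds (default 1)
wait_for_http() {
    url=$1
    tries=${2:-30}
    delay=${3:-1}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -f makes curl return non-zero on HTTP errors; output is discarded
        if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage on the server:
#   wait_for_http http://localhost:8080/ && echo "Tomcat is up"
```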

Installing Solr

Once the prerequisites have been successfully installed, we need to download and install Solr. We will configure a multicore instance, initially with the default two cores. I'll include tips on how to add, rename or remove additional cores later in this post. Once installed, you can test the installation by browsing to http://localhost:8080/solr and viewing the Solr admin page.

cd /home
curl -L http://mirror.ox.ac.uk/sites/rsync.apache.org/lucene/solr/4.4.0/solr-4.4.0.tgz | tar xz
cp solr-4.4.0/dist/solr-*.war /etc/tomcat/webapps/solr.war
mkdir -p /etc/solr
cp -r solr-4.4.0/example/multicore/* /etc/solr
cp solr-4.4.0/example/lib/ext/* /etc/tomcat/lib
chown -R tomcat /etc/solr
curl -L https://bitbucket.org/ptylr/public-stuff/raw/262d495665bc83ab6a7fbfe8a4554ba47e0b7fa5/etc/tomcat/conf/Catalina/localhost/solr.xml > /etc/tomcat/conf/Catalina/localhost/solr.xml
service tomcat restart

Finally, we're going to re-organise the directories that contain the Solr cores. This makes things a little simpler to manage should we end up with lots of cores. It is not strictly necessary, but it's probably not a bad idea.

cd /home
mkdir -p /etc/solr/cores
mv /etc/solr/core[0-1] /etc/solr/cores

As we've moved the cores, we need to update /etc/solr/solr.xml with the new locations. Specifically, update the instanceDir attribute of each <core> element and leave the other attributes as they are. See my amended example below.

<cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:8983}"$
  <core name="core0" instanceDir="cores/core0" />
  <core name="core1" instanceDir="cores/core1" />
 </cores>
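
If you would rather script this edit than make it by hand, sed can do it. This is a minimal sketch, assuming the original entries read instanceDir="core0" and instanceDir="core1" as in the stock multicore example; the function name prefix_instance_dirs is hypothetical. It writes to a temporary file and then replaces the original, and running it twice does no harm because already-prefixed values no longer match the pattern.

```shell
# Hypothetical helper: prefix every instanceDir="coreN" value in the
# given solr.xml with "cores/". Values that already start with "cores/"
# do not match the pattern and are left untouched.
prefix_instance_dirs() {
    file=$1
    sed 's|instanceDir="\(core[0-9]*\)"|instanceDir="cores/\1"|' "$file" > "$file.tmp" \
        && mv "$file.tmp" "$file"
}

# Usage on the server (take a backup first):
#   cp /etc/solr/solr.xml /etc/solr/solr.xml.bak
#   prefix_instance_dirs /etc/solr/solr.xml
```

Note that moving the temporary file into place leaves the result owned by whoever ran the command, so re-run 'chown -R tomcat /etc/solr' afterwards if you ran it as root.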

Now simply restart Tomcat for your changes to take effect (service tomcat restart).
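
To confirm that both cores loaded from their new location, you can ask Solr's CoreAdmin handler for the status of each one. The sketch below only builds the request URL; the function name core_status_url is hypothetical, and the /admin/cores path is taken from the adminPath attribute shown in solr.xml. The response should list details for the named core, including its instanceDir.

```shell
# Hypothetical helper: build the CoreAdmin STATUS URL for one core.
#   $1 = base URL of the Solr webapp, $2 = core name
core_status_url() {
    base=$1
    core=$2
    printf '%s/admin/cores?action=STATUS&core=%s\n' "$base" "$core"
}

# Usage on the server, once per core:
#   curl -s "$(core_status_url http://localhost:8080/solr core0)"
#   curl -s "$(core_status_url http://localhost:8080/solr core1)"
```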