What is Solr?
Search Server
Built upon Apache Lucene
Very fast
Scalable in both query load and collection size
Interoperable
Extensible
Lucene power exposed over HTTP
Spell checking, highlighting, faceting, etc.
Caching
Replication
Distributed search
schema.xml
Field types
<fieldType name="text" class="solr.TextField" indexed="true" />
Fields
<field name="technologies" type="text" indexed="true" stored="true" multiValued="true"/>
copy fields
<copyField source="developers" dest="df"/>
dynamic fields
<dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
similarity configuration
Similarity is the routine that scores each document against a query
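As a rough illustration of how a dynamic field pattern such as *_dt picks up field names that were never declared explicitly, here is a minimal Java sketch. It assumes the pattern carries a single wildcard at its start or end. The class and method names are hypothetical, and this is not Solr's own matching code.

```java
// Illustrative only: glob matching in the style of a dynamicField name.
class DynamicFieldSketch {

    // Returns true if fieldName is covered by a pattern like "*_dt" or "attr_*".
    static boolean matches(String pattern, String fieldName) {
        if (pattern.startsWith("*")) {
            // Leading wildcard: compare the suffix
            return fieldName.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            // Trailing wildcard: compare the prefix
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        // No wildcard: exact match only
        return pattern.equals(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(matches("*_dt", "created_dt")); // true
        System.out.println(matches("*_dt", "title"));      // false
    }
}
```

With the *_dt definition from the slide, a document field named created_dt would be indexed and stored as a date without a dedicated <field> entry.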
solrconfig.xml
Lucene indexing parameters
<mergeFactor>10</mergeFactor>
<ramBufferSizeMB>32</ramBufferSizeMB>
Cache settings
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
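To make the size setting more concrete, the following Java sketch shows least-recently-used eviction built on java.util.LinkedHashMap. It is only a conceptual illustration, not the implementation behind solr.LRUCache. The class name LruCacheSketch and the capacity of 2 are assumptions chosen to keep the example small.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a tiny LRU cache, not Solr's own solr.LRUCache class.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruCacheSketch(int maxSize) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the cache exceeds maxSize
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, String> cache = new LruCacheSketch<>(2);
        cache.put("q=solr", "results A");
        cache.put("q=lucene", "results B");
        cache.get("q=solr");                // touch: now the most recently used
        cache.put("q=tika", "results C");   // evicts "q=lucene"
        System.out.println(cache.keySet()); // [q=solr, q=tika]
    }
}
```

A larger size keeps more query results in memory at the cost of heap space; autowarmCount concerns how many entries are carried over when a new searcher is opened, which this sketch does not model.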
Request Handler
<requestHandler name="/itas" class="solr.SearchHandler">
<lst name="defaults">
<str name="v.template">browse</str>
<str name="v.properties">velocity.properties</str>
<str name="title">Solritas</str>
<str name="wt">velocity</str>
<str name="defType">dismax</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
<str name="facet">on</str>
<str name="facet.field">df</str>
<str name="facet.mincount">1</str>
<str name="hl">true</str>
<str name="hl.fl">developers</str>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
</lst>
</requestHandler>
Response Writer
A Response Writer generates the formatted response of a search.
The wt parameter selects the Response Writer to be used.
json, php, phps, python, ruby, xml, xslt, velocity
<queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
<int name="xsltCacheLifetimeSeconds">5</int>
</queryResponseWriter>
Other features
Highlighting
&hl=true&hl.fl=developers
Synonyms
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
Spell check
The spell check component can return a list of alternative spelling
suggestions.
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
Content Streams
Allow the Solr server to fetch local or remote data itself. Remote streaming must be enabled in solrconfig.xml.
Solr Cell
Leverages Tika to extract and index rich documents such as Word, PDF, HTML, and many other types
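// Index one document with SolrJ (Solr 1.4-era client API)
import java.net.URL;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

SolrServer solr =
    new CommonsHttpSolrServer(
        new URL("http://localhost:8983/solr"));
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "EXAMPLEDOC01");
doc.addField("title", "NOVAJUG SolrJ Example");
solr.add(doc);
solr.commit();   // after a batch, not per document
solr.optimize(); // periodically, if/when needed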
Replication
The master is polled by the replicants
A replicant pulls the Lucene index and, optionally, Solr configuration files
Query throughput scaling: replicate and load balance
http://wiki.apache.org/solr/SolrReplication
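The "replicate and load balance" idea can be sketched on the client side as a simple round-robin over replica base URLs. This is a minimal illustration, not part of Solr. The class name and the replica host names are hypothetical, and a real deployment would more likely put a dedicated load balancer in front of the replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: rotates query traffic across a fixed list of replicas.
class RoundRobinSketch {
    private final List<String> replicas;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSketch(List<String> replicas) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("at least one replica is required");
        }
        this.replicas = new ArrayList<>(replicas);
    }

    // Returns the base URL that should serve the next query.
    String next() {
        // floorMod keeps the index non-negative even if the counter overflows
        int i = Math.floorMod(counter.getAndIncrement(), replicas.size());
        return replicas.get(i);
    }

    public static void main(String[] args) {
        // Hypothetical replica hosts, for illustration only
        RoundRobinSketch lb = new RoundRobinSketch(Arrays.asList(
            "http://replica1.example:8983/solr",
            "http://replica2.example:8983/solr"));
        for (int n = 0; n < 3; n++) {
            System.out.println(lb.next() + "/select/?q=binesh");
        }
    }
}
```

Updates still go to the master only; the rotation applies to queries, which is why adding replicas raises query throughput.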
Demo
Download Solr
http://mirrors.ibiblio.org/pub/mirrors/apache/lucene/solr/1.4.0/
Start Solr
cd <solr_home>/example
java -jar start.jar
Post documents
cd <solr_home>/example/exampledocs
java -jar post.jar *.xml
java -jar post.jar cw.xml
Access Solr
http://localhost:8983/solr/admin/
Querying Solr
http://localhost:8983/solr/select/?q=binesh
http://localhost:8983/solr/select/?q=binny
http://localhost:8983/solr/select/?q=binesh&facet=true&facet.field=df&facet.mincount=1
http://localhost:8983/solr/itas/
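The query URLs above can also be assembled programmatically. The sketch below builds a /select/ URL from a parameter map using only the Java standard library and URL-encodes each key and value. The class and method names are hypothetical; the base URL and parameters are the ones used in the demo.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: builds a Solr select URL from request parameters.
class SelectUrlSketch {

    static String build(String baseUrl, Map<String, String> params)
            throws UnsupportedEncodingException {
        StringBuilder sb = new StringBuilder(baseUrl).append("/select/?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            // Encode both sides so characters such as ':' or spaces are safe
            sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap preserves the order in which parameters were added
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "binesh");
        params.put("facet", "true");
        params.put("facet.field", "df");
        params.put("facet.mincount", "1");
        System.out.println(build("http://localhost:8983/solr", params));
    }
}
```

Adding a wt entry to the map (for example wt=json) would select a different Response Writer for the same query.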
Luke
http://www.getopt.org/luke/
solr.xml
Place the file under ${tomcat}/conf/Catalina/localhost/ with the following content
Start Liferay/Tomcat
Solr will be picked up and automatically deployed as "solr" under the ${tomcat}/webapps folder
Final Step
Rebuild the Liferay search indexes
Control Panel > Server Administration