<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to you under the Apache License, Version
2.0 (the "License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0 Unless required by
applicable law or agreed to in writing, software distributed under
the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and
limitations under the License.
-->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<book lang="en">
<title>
Apache UIMA Solrcas documentation
</title>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
href="../../target/docbook-shared/common_book_info.xml"/>
<preface>
<title>Introduction</title>
<para>
The Solr CAS Consumer (Solrcas) writes UIMA CAS
objects to an Apache Solr instance.
</para>
<para>
It uses SolrJ client classes to execute local or remote updates to the specified Solr instance.
</para>
</preface>
<chapter id="sandbox.solrcas.conf">
<title>Configuration</title>
<para>
To use Solrcas, the following parameters must be specified:
<itemizedlist>
<listitem>
<para>
mappingFile : the location of the file that defines which UIMA objects are sent to which Solr fields,
and how.
</para>
</listitem>
<listitem>
<para>
solrInstanceType : the type of Solr instance to write to; the value must be http.
</para>
</listitem>
<listitem>
<para>
solrPath : if the solrInstanceType value is 'http', this is the URL of the remote Solr instance.
</para>
</listitem>
</itemizedlist>
</para>
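<para>
As an illustration, the three parameters might be set in the consumer's UIMA descriptor along the following
lines. This is a minimal sketch, not the descriptor shipped with Solrcas: the mapping file location and the
Solr URL are assumed values, and the use of string values for all three parameters is also an assumption.
<programlisting>
<![CDATA[
<configurationParameterSettings>
  <nameValuePair>
    <name>mappingFile</name>
    <value>
      <!-- assumed location; point this at your own mapping file -->
      <string>solrMapping.xml</string>
    </value>
  </nameValuePair>
  <nameValuePair>
    <name>solrInstanceType</name>
    <value>
      <string>http</string>
    </value>
  </nameValuePair>
  <nameValuePair>
    <name>solrPath</name>
    <value>
      <!-- assumed URL of a Solr instance running locally -->
      <string>http://localhost:8983/solr</string>
    </value>
  </nameValuePair>
</configurationParameterSettings>
]]>
</programlisting>
</para>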
</chapter>
<chapter id="sandbox.solrcas.mapping">
<title>The mapping file</title>
<para>
The mapping file defines how CAS properties, types, and features are mapped to
Solr fields.
</para>
<para>
Here is a solrMapping.xml sample:
<programlisting>
<![CDATA[
<solrMapping>
<documentText>text</documentText>
<documentLanguage>language</documentLanguage>
<fsMapping>
<type name="uima.jcas.tcas.Annotation">
<map feature="coveredText" field="annotation"/>
</type>
</fsMapping>
</solrMapping>
]]>
</programlisting>
</para>
<para>
The <emphasis>documentText</emphasis> element holds the name of the Solr field in which the CAS.getDocumentText()
value will be indexed.
</para>
<para>
The <emphasis>documentLanguage</emphasis> element holds the name of the Solr field in which the CAS.getDocumentLanguage()
value will be indexed.
</para>
<para>
The <emphasis>fsMapping</emphasis> element holds a list of <emphasis>type</emphasis> elements. Each
<emphasis>type</emphasis> defines a <emphasis>map</emphasis> from a <emphasis>feature</emphasis> to a
<emphasis>field</emphasis>. Although Annotation.getCoveredText() is not a UIMA Feature, the coveredText feature
name is automatically associated with its value, so it can be mapped just like an ordinary
feature.
</para>
<para>
In the sample above, the CAS.getDocumentText() value is written to the text field, the CAS.getDocumentLanguage()
value is written to the language field, and the Annotation.getCoveredText() value of each uima.jcas.tcas.Annotation object
is written to the annotation field in Solr.
</para>
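<para>
To make the effect of this mapping concrete, the following sketch shows the kind of Solr document that could
result, written in Solr's XML update format. It is illustrative only: the document text, the language and the
two annotations are assumed values, and since Solrcas sends its updates through SolrJ, the request actually
sent may look different.
<programlisting>
<![CDATA[
<add>
  <doc>
    <!-- documentText: the value of CAS.getDocumentText() (assumed text) -->
    <field name="text">UIMA and Solr</field>
    <!-- documentLanguage: the value of CAS.getDocumentLanguage() (assumed language) -->
    <field name="language">en</field>
    <!-- one value per mapped annotation: its covered text (two annotations assumed) -->
    <field name="annotation">UIMA</field>
    <field name="annotation">Solr</field>
  </doc>
</add>
]]>
</programlisting>
Because every matching annotation contributes a value, the target field would typically need to accept
multiple values in the Solr schema.
</para>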
<para>
Note that documentText and documentLanguage are both optional.
</para>
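<para>
A mapping file can therefore consist of an fsMapping alone. The sketch below also maps a second type: the
type org.example.Person, its name feature and the person Solr field are hypothetical names used only for
illustration and are not part of Solrcas.
<programlisting>
<![CDATA[
<solrMapping>
  <!-- documentText and documentLanguage are omitted -->
  <fsMapping>
    <type name="uima.jcas.tcas.Annotation">
      <map feature="coveredText" field="annotation"/>
    </type>
    <!-- hypothetical type, feature and field -->
    <type name="org.example.Person">
      <map feature="name" field="person"/>
    </type>
  </fsMapping>
</solrMapping>
]]>
</programlisting>
</para>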
</chapter>
</book>