Bundle adding support for Polish Stemming

Stempel is a stemmer for the Polish language. When this bundle is installed in Apache Stanbol, Solr cores managed by Apache Stanbol (the 'org.apache.stanbol.commons.solr.core' module) can use the solr.StempelPolishStemFilterFactory.

For example:

<fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.HyphenatedWordsFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StempelPolishStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_pl.txt" enablePositionIncrements="true" />
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StempelPolishStemFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
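
A field type only takes effect once a field in the schema references it. As a minimal sketch, assuming a hypothetical field named 'description_pl' (the name and the indexed/stored settings are illustrative and not part of this bundle), the declaration in the core's schema.xml might look like this:

```xml
<!-- Hypothetical field that uses the Polish field type defined above. -->
<!-- The field name and the indexed/stored flags are assumptions; adjust them to your schema. -->
<field name="description_pl" type="text_pl" indexed="true" stored="true"/>
```

Values indexed into such a field pass through the index analyzer chain, and queries against it pass through the query analyzer chain, so both sides are stemmed the same way.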

Installing this bundle is required because Solr, when running within OSGi, cannot load classes from JAR files located in the '{instanceDir}/lib' directory.