
Basic Chinese language support based on Lucene Smartcn Analyzer

This BundleList includes three modules that bring basic language support for Chinese to Apache Stanbol.

See comments in the lists.xml for more details.

Solr Field Configuration

If you plan to use the Smartcn Analyzer to process Chinese text, you must also configure the Solr schema.xml used by the Entityhub SolrYard accordingly.

For that you will need to add two things:

  1. A fieldType specification for Chinese

    :::xml

  2. A dynamic field using this field type that matches against Chinese language literals

    :::xml

The smartcn.solrindex.zip is identical to the default configuration, except that it adds the above fieldType and dynamicField specifications.
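
As a rough sketch, the two schema.xml additions could look like the following. The type name `text_zh`, the field pattern `@zh*` and the attribute values are assumptions made for illustration and are not necessarily what smartcn.solrindex.zip ships. The analyzer factories are the ones provided by the Lucene smartcn module for Solr 3.x/4.x; newer Solr releases replace them with `solr.HMMChineseTokenizerFactory`, so check the documentation for your Solr version.

```xml
<!-- 1. Field type for Chinese text (the name "text_zh" is an assumption) -->
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits the input into sentences -->
    <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
    <!-- Segments each sentence into words -->
    <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    <!-- Normalizes embedded Latin-script tokens -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- 2. Dynamic field that applies the type to Chinese language literals
        (the "@zh*" pattern and the attributes are assumptions) -->
<dynamicField name="@zh*" type="text_zh" indexed="true" stored="true" multiValued="true"/>
```

Compare any such fragment with the schema.xml inside smartcn.solrindex.zip before relying on it.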

Usage with the EntityhubIndexing Tool

  1. Extract the smartcn.solrindex.zip to the “indexing/config” directory
  2. Rename the “indexing/config/smartcn” directory to {site-name} (the value of the “name” property in the “indexing/config/indexing.properties” file).

As an alternative to (2), you can keep the directory name and explicitly set the name of the Solr configuration by passing “solrConf:smartcn” as a parameter of the SolrYardIndexingDestination:

:::text
indexingDestination=org.apache.stanbol.entityhub.indexing.destination.solryard.SolrYardIndexingDestination,solrConf:smartcn,boosts:fieldboosts

Usage with the Entityhub SolrYard

If you want to create an empty SolrYard instance using the smartcn.solrindex.zip configuration, you will need to:

  1. Copy the smartcn.solrindex.zip to the datafiles directory of your Stanbol instance ({working-dir}/stanbol/datafiles).
  2. Rename it to match the {name} of the SolrYard you want to create. The file name needs to be {name}.solrindex.zip.
  3. Create the SolrYard instance and set the “Solr Index/Core” (org.apache.stanbol.entityhub.yard.solr.solrUri) to {name}. Make sure that “Use default SolrCore configuration” (org.apache.stanbol.entityhub.yard.solr.useDefaultConfig) is disabled.

If you want to use the smartcn.solrindex.zip as the default, you can rename the file in the datafiles folder to “default.solrindex.zip” and then enable “Use default SolrCore configuration” (org.apache.stanbol.entityhub.yard.solr.useDefaultConfig) when you configure a SolrYard instance.

See also the documentation on how to configure a managed site.