
Welcome to Apache OpenNLP Models!


The Apache OpenNLP library provides binary models for processing natural language text. This repository is intended for the distribution of model files as Maven artifacts.

Useful Links

For additional information, visit the OpenNLP Home Page

You can use OpenNLP with any language. Further demo models are provided here.

The models are fully compatible with the latest release and can be used for testing or getting started.

Please train your own models for all other use cases.

Documentation, including JavaDocs, code usage and command-line interface examples, is available here.

You can also follow our mailing lists for news and updates.

Overview

| Component | Language | Compatibility | Description | README and Reports |
|---|---|---|---|---|
| Language Detector | 103 languages | >= 1.8.3 | Detects 103 languages identified by ISO 639-3 codes. Works best with longer texts of at least two sentences in the same language. | README Effectiveness Misclassified |
| Sentence | fr | >= 1.0.0 | Sentence detection model for French | README Evaluation Logs |
| Sentence | de | >= 1.0.0 | Sentence detection model for German | README Evaluation Logs |
| Sentence | en | >= 1.0.0 | Sentence detection model for English | README Evaluation Logs |
| Sentence | it | >= 1.0.0 | Sentence detection model for Italian | README Evaluation Logs |
| Sentence | nl | >= 1.0.0 | Sentence detection model for Dutch | README Evaluation Logs |
| Parts of Speech | de | >= 1.0.0 | Parts of speech model for German | README Evaluation Logs |
| Parts of Speech | en | >= 1.0.0 | Parts of speech model for English | README Evaluation Logs |
| Parts of Speech | fr | >= 1.0.0 | Parts of speech model for French | README Evaluation Logs |
| Parts of Speech | it | >= 1.0.0 | Parts of speech model for Italian | README Evaluation Logs |
| Parts of Speech | nl | >= 1.0.0 | Parts of speech model for Dutch | README Evaluation Logs |
| Tokens | de | >= 1.0.0 | Tokenizer model for German | README Evaluation Logs |
| Tokens | en | >= 1.0.0 | Tokenizer model for English | README Evaluation Logs |
| Tokens | fr | >= 1.0.0 | Tokenizer model for French | README Evaluation Logs |
| Tokens | it | >= 1.0.0 | Tokenizer model for Italian | README Evaluation Logs |
| Tokens | nl | >= 1.0.0 | Tokenizer model for Dutch | README Evaluation Logs |
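
The Compatibility column gives a minimum version. As a rough illustration, the sketch below checks a dotted version string against such a minimum. It assumes the column refers to the OpenNLP library version and that versions are purely numeric (suffixes such as `-SNAPSHOT` are not handled); the class and method names are hypothetical and not part of OpenNLP.

```java
class VersionCheck {

    // Compares two dotted numeric versions, e.g. "1.8.3" and "1.10.0".
    // Missing components are treated as 0, so "2.0" equals "2.0.0".
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // True if the installed version satisfies a ">= minimum" requirement.
    static boolean isCompatible(String installed, String minimum) {
        return compare(installed, minimum) >= 0;
    }
}
```

For example, `VersionCheck.isCompatible("1.10.0", "1.8.3")` compares components numerically rather than as text, so 1.10.0 counts as newer than 1.8.3.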

Getting Started

You can import a model artifact directly via Maven, SBT or Gradle, for instance:

Maven

<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-models-langdetect</artifactId>
    <version>${opennlp.models.version}</version>
</dependency>

SBT

libraryDependencies += "org.apache.opennlp" % "opennlp-models-langdetect" % "${opennlp.models.version}"

Gradle

implementation group: "org.apache.opennlp", name: "opennlp-models-langdetect", version: "${opennlp.models.version}"

For more details, please check our documentation.

Adding a new Model

When adding a new model, make sure to list it in the expected-models.txt file located in opennlp-models-test.
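
As a quick local sanity check before opening a pull request, a helper along the following lines could confirm that a model name appears in the list. This is only a sketch: it assumes expected-models.txt holds one entry per line, which this README does not state, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ExpectedModelsCheck {

    // Returns true if modelName appears as its own entry among the given lines.
    // Assumption: one entry per line; surrounding whitespace is ignored.
    static boolean isListed(List<String> lines, String modelName) {
        return lines.stream()
                .map(String::trim)
                .anyMatch(line -> line.equals(modelName));
    }

    // Convenience overload that reads the list from a file (UTF-8).
    static boolean isListed(Path expectedModelsFile, String modelName) throws IOException {
        return isListed(Files.readAllLines(expectedModelsFile), modelName);
    }
}
```

The check matches whole lines rather than substrings, so a name that is merely a prefix of another entry is not reported as listed.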

Contributing

The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.

If you would like to get involved, please follow the instructions here.