Apache OpenNLP

Welcome to Apache OpenNLP!

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text.

This toolkit is written completely in Java and provides support for common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, coreference resolution, language detection and more!

These tasks are usually required to build more advanced text processing services.

The goal of the OpenNLP project is to be a mature toolkit for the tasks mentioned above.

An additional goal is to provide a large number of pre-built models for a variety of languages, as well as the annotated text resources that those models are derived from.

Presently, OpenNLP includes common classifiers such as Maximum Entropy, Perceptron and Naive Bayes.
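As a rough illustration of how one of these classifiers might be selected, the sketch below fills in a `TrainingParameters` object. The class name `TrainerConfigExample`, the helper `perceptronParams`, and the iteration and cutoff values are made up for this example. The algorithm identifier `"PERCEPTRON"` is assumed to be the value the trainer lookup recognizes, so check it against the documentation for your release.

```java
import opennlp.tools.util.TrainingParameters;

public class TrainerConfigExample {

    // Builds a parameter set that asks for the Perceptron classifier.
    // The string "PERCEPTRON" is an assumption about the identifier the
    // trainers expect; the numeric values are illustrative only.
    static TrainingParameters perceptronParams() {
        TrainingParameters params = new TrainingParameters();
        params.put(TrainingParameters.ALGORITHM_PARAM, "PERCEPTRON");
        params.put(TrainingParameters.ITERATIONS_PARAM, 100);
        params.put(TrainingParameters.CUTOFF_PARAM, 5);
        return params;
    }

    public static void main(String[] args) {
        TrainingParameters params = perceptronParams();
        // Read the algorithm name back, falling back to "" if it is unset.
        System.out.println(params.getStringParameter(TrainingParameters.ALGORITHM_PARAM, ""));
    }
}
```

A parameter object like this is what the various `train(...)` methods of the model classes accept, so switching classifiers should mostly be a matter of changing the algorithm entry.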

OpenNLP can be used either programmatically through its Java API or from a terminal through its CLI. The OpenNLP API can be easily plugged into distributed streaming data pipelines such as Apache Flink, Apache NiFi and Apache Spark.
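To give a feel for the Java API, here is a minimal sketch that tokenizes a sentence with the rule-based `SimpleTokenizer`, which needs no model file. It assumes `opennlp-tools` is on the classpath. The class name `TokenizeExample` and the helper `tokenize` are invented for this example.

```java
import opennlp.tools.tokenize.SimpleTokenizer;
import opennlp.tools.tokenize.Tokenizer;

public class TokenizeExample {

    // Splits text into tokens using the rule-based SimpleTokenizer,
    // which separates runs of letters, digits and other characters.
    static String[] tokenize(String text) {
        Tokenizer tokenizer = SimpleTokenizer.INSTANCE;
        return tokenizer.tokenize(text);
    }

    public static void main(String[] args) {
        // Print one token per line.
        for (String token : tokenize("OpenNLP is a Java toolkit.")) {
            System.out.println(token);
        }
    }
}
```

Model-based components such as the statistical tokenizer or the sentence detector follow a similar pattern, but they need a trained model to be loaded first.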

Useful Links

For additional information, visit the OpenNLP Home Page.

You can use OpenNLP with any language; demo models are provided here.

The models are fully compatible with the latest release and can be used for testing or getting started.

Please train your own models for all other use cases.

Documentation, including JavaDocs, code usage and command-line interface examples, is available here.

You can also follow our mailing lists for news and updates.


The library currently consists of the following packages:

  • opennlp-tools : The core toolkit.
  • opennlp-uima : A set of Apache UIMA annotators.
  • opennlp-brat-annotator : A set of annotators for BRAT.
  • opennlp-morfologik-addon : An addon for Morfologik.
  • opennlp-sandbox : Other projects in progress are found in the sandbox.

Getting Started

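You can import the core toolkit directly from Maven, SBT or Gradle:

Maven

<dependency>
    <groupId>org.apache.opennlp</groupId>
    <artifactId>opennlp-tools</artifactId>
    <version>${opennlp.version}</version>
</dependency>

SBT

libraryDependencies += "org.apache.opennlp" % "opennlp-tools" % "${opennlp.version}"

Gradle

implementation group: "org.apache.opennlp", name: "opennlp-tools", version: "${opennlp.version}"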

For more details, please check our documentation.

Building OpenNLP

At least JDK 11 and Maven 3.3.9 are required to build the library.

After cloning the repository, go into the destination directory and run:

mvn install


Contributing

The Apache OpenNLP project is developed by volunteers and is always looking for new contributors to work on all parts of the project. Every contribution is welcome and needed to make it better. A contribution can be anything from a small documentation typo fix to a new component.

If you would like to get involved, please follow the instructions here.