| //// |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| //// |
| = Apache OpenNLP 1.8.0 released |
| Apache OpenNLP |
| 2017-05-19 |
| :jbake-type: post |
| :jbake-tags: community |
| :jbake-status: published |
| :category: news |
| :idprefix: |
| |
| The Apache OpenNLP team is pleased to announce the release of version 1.8.0 of Apache OpenNLP. |
| |
| The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. |
| |
| It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. |
| |
| The OpenNLP 1.8.0 binary and source distributions are available for download from our link:/download.html[download page]. |
| |
| The OpenNLP library is distributed by Maven Central as well. See the link:/maven-dependency.html[Maven Dependency] page for more details. |
| |
| Java 1.8 is required to run OpenNLP Maven 3.3.9 is required for building it building from the Source Distribution. |
| |
| To build everything execute the following command in the root folder: `mvn clean install` |
| |
| The results of the build will be placed in: `opennlp-distr/target/apache-opennlp-1.8.0-bin.tar.gz` (or `.zip`) |
| |
| # What's new in Apache OpenNLP 1.8.0 |
| |
| This release introduces many new features, improvements and bug fixes. The API has been improved for a better consistency and many deprecated methods were removed. Java 1.8 is required. |
| |
| Additionally the release contains the following noteworthy changes: |
| |
| - POS Tagger context generator now supports feature generation XML |
| - Add a Name Finder feature generator that adds POS Tag features |
| - Add CONLL-U format support |
| - Improve default Name Finder settings |
| - TokenNameFinderEvaluator CLI now support nameTypes argument |
| - Stupid backoff is now the default in NGramLanguageModel |
| - Language codes now are ISO 639-3 compliant |
| - Add many unit tests |
| - Distribution package now includes example parameters file |
| - Now prefix and suffix feature generators are configurable |
| - Remove API in Document Categorizer for user specified tokenizer |
| - Learnable lemmatizer now returns all possible lemmas for a given word and pos tag |
| - Lemmatizer API backward compatibility break: no need to encode/decode lemmas anymore, now LemmatizerME lemmatize method returns the actual lemma |
| - Add stemmer, detokenizer and sentence detection abbreviations for Irish |
| - Chunker SequenceValidator signature changed to allow access to both token and POS tag |
| |
| A detailed list of the issues related to this release can be found in the release notes. |
| |
| Thanks again to all contributors and committers for their help. |
| |
| --The Apache OpenNLP Team |