enhancement-engines/opennlp/opennlp-ner/src/test/resources/org/apache/stanbol/data/opennlp/README.md

The OpenNLP model was trained on the data provided by the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications.

License

The annotations used to train the model are licensed under the Creative Commons Attribution 3.0 Unported License.

The corpus was built from 2,000 abstracts selected from a full-text search. According to [1], resources published by PubMed (including abstracts) are in the public domain.

The original LICENSE file of the corpus is also included.

More information about the corpus can be found at http://www.nactem.ac.uk/tsujii/GENIA/ERtask/report.html

[1] http://www.ncbi.nlm.nih.gov/About/disclaimer.html