---
active_crumb: Integrations
layout: documentation
id: integrations
---
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<!--suppress CheckImageSize -->
<div id="integrations" class="col-md-8 second-column">
<section>
<span id="overview" class="section-title">Overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></span>
<p>
NLPCraft provides several integration points for the underlying SQL storage, <a href="#gridgain">GridGain Control Center</a> and
<a href="#nlp">NLP functionality</a>.
</p>
<span id="nlp" class="section-title">NLP Functionality <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></span>
<p>
NLPCraft comes with integrations for several 3rd party NLP libraries and projects. External
integrations can be used for two distinct purposes inside of NLPCraft:
</p>
<ul>
<li>
<b>Base NLP Engine</b>
<p>
As a base NLP engine, the external project is responsible for all basic NLP pre-processing
such as tokenization, lemmatization, stemming, PoS tagging, etc. The base NLP engine
has significant performance requirements and therefore cannot be based on APIs that
require a network round trip.
</p>
</li>
<li>
<b>Token Provider</b>
<p>
As a token provider, the external project is used to detect named entities.
</p>
</li>
</ul>
<p>
Note that the same external project can be used for both roles, and projects can be mixed and matched
together through NLPCraft configuration. You can only have one base NLP engine but you can configure
multiple token providers. The following table shows supported 3rd party integrations and
their roles:
</p>
<table class="gradient-table checks">
<thead>
<tr>
<th>Project</th>
<th>Base NLP Engine</th>
<th>Token Provider</th>
</tr>
</thead>
<tbody>
<tr>
<td>NLPCraft</td>
<td style="text-align: center;"><i class="fas fa-times"></i></td>
<td style="text-align: center;"><i class="fas fa-check-double"></i></td>
</tr>
<tr>
<td><a href="#opennlp">OpenNLP</a></td>
<td style="text-align: center;"><i class="fas fa-check-double"></i></td>
<td style="text-align: center;"><i class="fas fa-check"></i></td>
</tr>
<tr>
<td><a href="#google">Google Natural Language</a></td>
<td style="text-align: center;"><i class="fas fa-times"></i></td>
<td style="text-align: center;"><i class="fas fa-check"></i></td>
</tr>
<tr>
<td><a href="#stanford">Stanford CoreNLP</a></td>
<td style="text-align: center;"><i class="fas fa-check"></i></td>
<td style="text-align: center;"><i class="fas fa-check"></i></td>
</tr>
<tr>
<td><a href="#spacy">spaCy</a></td>
<td style="text-align: center;"><i class="fas fa-times"></i></td>
<td style="text-align: center;"><i class="fas fa-check"></i></td>
</tr>
</tbody>
</table>
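<p>
The role matrix above can also be read as a simple validity check. The following Python sketch is illustrative only:
the helper <code>validate_setup</code> and both constant names are hypothetical and not part of NLPCraft, and the
identifiers are the property values used elsewhere on this page.
</p>

```python
# Hypothetical helper, not part of NLPCraft: it mirrors the support table above.
BASE_ENGINES = {"opennlp", "stanford"}
TOKEN_PROVIDERS = {"nlpcraft", "opennlp", "google", "stanford", "spacy"}


def validate_setup(nlp_engine, token_providers):
    """Return a list of problems found in a proposed engine/provider combination."""
    problems = []
    # Exactly one base engine is configured, and it must support that role.
    if nlp_engine not in BASE_ENGINES:
        problems.append(f"'{nlp_engine}' cannot be used as a base NLP engine")
    # Zero or more token providers may be enabled.
    for name in token_providers:
        if name not in TOKEN_PROVIDERS:
            problems.append(f"'{name}' is not a known token provider")
    return problems
```

<p>
For example, <code>validate_setup("opennlp", ["nlpcraft", "spacy"])</code> returns an empty list, while passing
<code>"google"</code> as the engine reports a problem because Google Natural Language is supported only as a token provider.
</p>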
<div class="bq warn">
<b>Configuring Token Providers</b>
<p>
REST server configuration supports zero or more token providers. Data models also have to specify
the specific tokens they expect the REST server and probe to detect. This limits
unnecessary processing, since implicitly enabling all token providers and all tokens can
slow processing down significantly:
</p>
<ul>
<li>
REST server <a href="/server-and-probe.html">configuration property</a> <code>tokenProviders</code> lists the enabled token providers.
</li>
<li>
Data model provides its required tokens via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</div>
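<p>
The interaction between the two settings can be sketched as a simple filter. This is an illustration under the assumption
that a token is detected only when its provider is enabled on the server <em>and</em> the token is listed by the model;
the function name <code>effective_tokens</code> is hypothetical and does not reflect NLPCraft internals.
</p>

```python
def effective_tokens(server_providers, model_tokens):
    """Keep only the model tokens whose provider prefix is enabled on the server.

    Token IDs are assumed to have the form '<provider>:<name>', e.g. 'nlpcraft:date'.
    """
    enabled = set(server_providers)
    return sorted(t for t in model_tokens if t.split(":", 1)[0] in enabled)
```

<p>
With providers <code>nlpcraft</code> and <code>opennlp</code> enabled, a model that asks for
<code>nlpcraft:date</code>, <code>opennlp:money</code> and <code>google:person</code> would, under this assumption,
only receive the first two.
</p>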
</section>
<section>
<img id="nlpcraft" class="img-title" src="/images/nlpcraft_logo_black.gif" height="48px" alt="">
<p>
NLPCraft is an open source library for adding a natural language interface to any application.
</p>
<h2 class="section-title">Base NLP Engine <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
N/A
</p>
<h2 class="section-title">Token Provider <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
NLPCraft provides its own set of built-in elements. NLPCraft token IDs start with <code>nlpcraft</code>. Note
also that all NLPCraft built-in tokens are normalized named entities (NNE), i.e. they provide normalized
information and not just their IDs:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>Token ID</th>
<th>Description</th>
<th>Example</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>nlpcraft:nlp</code></td>
<td>
<p>
This token denotes a word (always a single word) that is not a part of any other token. It is
also called a free word, i.e. a word that is not linked to any other detected model element.
</p>
<p>
<b>NOTE:</b> the metadata from this token defines a common set of NLP properties and
is present in every other token as well.
</p>
</td>
<td>
<ul>
<li>Jamie goes <code>home</code> (assuming that a word 'home' does not belong to any model element).</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:date</code></td>
<td>
This token denotes a date range. It recognizes dates from 1900 up to 2023. Note that it does not
currently recognize the time component.
</td>
<td>
<ul>
<li>Meeting <code>next tuesday</code>.</li>
<li>Report for entire <code>2018 year</code>.</li>
<li>Data <code>from 1/1/2017 to 12/31/2018</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:num</code></td>
<td>
This token denotes a single numeric value or numeric condition.
</td>
<td>
<ul>
<li>Price <code>&gt; 100</code>.</li>
<li>Price is <code>less than $100</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:continent</code></td>
<td>
This token denotes a geographical continent.
</td>
<td>
<ul>
<li>Population of <code>Africa</code>.</li>
<li>Surface area of <code>America</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:subcontinent</code></td>
<td>
This token denotes a geographical subcontinent.
</td>
<td>
<ul>
<li>Population of <code>Alaskan peninsula</code>.</li>
<li>Surface area of <code>South America</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:region</code></td>
<td>
This token denotes a geographical region/state.
</td>
<td>
<ul>
<li>Population of <code>California</code>.</li>
<li>Surface area of <code>South Dakota</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:country</code></td>
<td>
This token denotes a country.
</td>
<td>
<ul>
<li>Population of <code>France</code>.</li>
<li>Surface area of <code>USA</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:city</code></td>
<td>
This token denotes a city.
</td>
<td>
<ul>
<li>Population of <code>Paris</code>.</li>
<li>Surface area of <code>Washington DC</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:metro</code></td>
<td>
This token denotes a metro area.
</td>
<td>
<ul>
<li>Population of <code>Cedar Rapids-Waterloo-Iowa City &amp; Dubuque, IA</code> metro area.</li>
<li>Surface area of <code>Norfolk-Portsmouth-Newport News, VA</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:sort</code></td>
<td>
This token denotes a sorting or ordering.
</td>
<td>
<ul>
<li>Report <code>sorted from top to bottom</code>.</li>
<li>Analysis <code>sorted in descending order</code>.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:limit</code></td>
<td>
This token denotes a numerical limit.
</td>
<td>
<ul>
<li>Show <code>top 5</code> brands.</li>
<li>Show <code>several</code> brands.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:coordinate</code></td>
<td>
This token denotes latitude and longitude coordinates.
</td>
<td>
<ul>
<li>Route the path to <code>55.7558, 37.6173</code> location.</li>
</ul>
</td>
</tr>
<tr>
<td><code>nlpcraft:relation</code></td>
<td>
This token denotes a relation function:
<code>compare</code> or
<code>correlate</code>. Note that this token always needs two other tokens that it references.
</td>
<td>
<ul>
<li>
What is the <code><b>correlation between</b></code> <code>price</code> <code><b>and</b></code> <code>location</code>
(assuming that 'price' and 'location' are also detected tokens).
</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>
<b>NOTES:</b>
</p>
<ul>
<li>
See <a href="data-model.html#meta">token metadata</a> documentation for detailed information
on token metadata properties.
</li>
<li>
Make sure to enable this token provider <code>nlpcraft</code> in REST server configuration
using <code>nlpcraft.server.tokenProviders</code> property.
</li>
<li>
Make sure to also properly configure required tokens in your model configuration via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</section>
<section>
<img id="opennlp" class="img-title" src="/images/opennlp-logo.png" height="48px" alt="">
<p>
<a href="https://opennlp.apache.org">Apache OpenNLP</a> is an open-source library for machine-learning-based
processing of natural language text.
</p>
<h2 class="section-title">Base NLP Engine <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
<a href="https://opennlp.apache.org">Apache OpenNLP</a> is used by NLPCraft as the default base NLP engine. You can also set
it explicitly on the REST server and probe via the configuration property <code>nlpcraft.nlpEngine=opennlp</code>.
</p>
<h2 class="section-title">Token Provider <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
OpenNLP can be used independently as a token provider (even if another library is used as the base NLP engine).
OpenNLP provides its own set of built-in tokens supported by NLPCraft.
OpenNLP token IDs have the form <code>opennlp:xxx</code>, where <code>xxx</code> is the lower case
name of the named entity in OpenNLP.
</p>
<p>
Configuration notes:
</p>
<ul>
<li>
<p>
OpenNLP integration is configured with the following pre-trained English OpenNLP
<a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">models</a> version 1.5:
</p>
<table class="gradient-table">
<thead>
<tr>
<th>Named Entity</th>
<th>OpenNLP Model</th>
<th>Token ID</th>
</tr>
</thead>
<tbody>
<tr>
<td>Location</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-location.bin</a></td>
<td><code>opennlp:location</code></td>
</tr>
<tr>
<td>Money</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-money.bin</a></td>
<td><code>opennlp:money</code></td>
</tr>
<tr>
<td>Person</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-person.bin</a></td>
<td><code>opennlp:person</code></td>
</tr>
<tr>
<td>Organization</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-organization.bin</a></td>
<td><code>opennlp:organization</code></td>
</tr>
<tr>
<td>Date</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-date.bin</a></td>
<td><code>opennlp:date</code></td>
</tr>
<tr>
<td>Time</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-time.bin</a></td>
<td><code>opennlp:time</code></td>
</tr>
<tr>
<td>Percentage</td>
<td><a target="opennlp" href="https://opennlp.sourceforge.net/models-1.5/">en-ner-percentage.bin</a></td>
<td><code>opennlp:percentage</code></td>
</tr>
</tbody>
</table>
</li>
<li>
See <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a>
documentation for token properties.
</li>
<li>
Make sure to enable this token provider <code>opennlp</code> in REST server configuration
using <code>nlpcraft.server.tokenProviders</code> property.
</li>
<li>
Make sure to properly configure required tokens in your model configuration via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</section>
<section>
<img id="google" class="img-title" src="/images/google-cloud-logo-small.png" height="56px" alt="">
<p>
<a href="https://cloud.google.com/natural-language/">Google Natural Language</a> uses machine learning
to reveal the structure and meaning of text.
</p>
<h2 class="section-title">Base NLP Engine <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
N/A
</p>
<h2 class="section-title">Token Provider <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
Google Natural Language provides its own set of built-in elements.
To use the Google token provider, the environment variable <code>GOOGLE_APPLICATION_CREDENTIALS</code>
must point to a valid Google JSON credential file (see
<a href="https://cloud.google.com/docs/authentication/production">Google documentation</a> for more details).
Google Natural Language token IDs have the form <code>google:xxx</code>, where <code>xxx</code> is the lower
case name of the named entity in Google APIs, e.g. <code>google:person</code>, <code>google:location</code>,
etc.
</p>
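<p>
Before starting the REST server it can help to verify that the credential variable is usable. The following Python
sketch is one possible pre-flight check; the function name <code>check_google_credentials</code> is hypothetical, and
the check only confirms that the file exists and parses as JSON, not that Google accepts the credentials.
</p>

```python
import json
import os


def check_google_credentials(env=None):
    """Return the credential file path, or raise if the variable or file is unusable."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"Credential file not found: {path}")
    # json.load raises ValueError (json.JSONDecodeError) if the file is not valid JSON.
    with open(path, encoding="utf-8") as f:
        json.load(f)
    return path
```

<p>
Calling <code>check_google_credentials()</code> without arguments reads the current process environment; passing a
dictionary makes the check easy to exercise in isolation.
</p>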
<p>Configuration notes:</p>
<ul>
<li>
See Google Natural Language
<a target="google" href="https://cloud.google.com/natural-language/docs/reference/rest/v1/Entity#Type">documentation</a>
for more details on supported tokens.
</li>
<li>
See <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a> documentation for token properties.
</li>
<li>
Make sure to enable this token provider <code>google</code> in REST server configuration
using <code>nlpcraft.server.tokenProviders</code> property.
</li>
<li>
Make sure to also properly configure required tokens in your model configuration via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</section>
<section>
<img id="stanford" class="img-title" src="/images/corenlp-logo.png" height="64px" alt="">
<p>
<a href="https://stanfordnlp.github.io/CoreNLP">Stanford CoreNLP</a> is a set of human language technology tools.
</p>
<p>
Note that because Stanford CoreNLP
is licensed under the <a target=_ href="https://www.gnu.org/licenses/gpl-3.0.en.html">GNU General Public License v3</a>, you need to add
both the Stanford CoreNLP
dependencies and the NLPCraft Stanford CoreNLP integration separately and make them available to your project.
</p>
<p>
The default <code>pom.xml</code> shipped with the NLPCraft release contains the Stanford CoreNLP dependency in a separate
<code>stanford-corenlp</code> profile. To use it, enable this profile when building the project
from sources, e.g. <code class="script">mvn clean package -P stanford-corenlp</code>, or enable this profile in your Maven
configuration:
</p>
<nav>
<div class="nav nav-tabs" role="tablist">
<a class="nav-item nav-link active" data-toggle="tab" href="#nav-stanfordnlp-maven" role="tab">Maven <sup>Java</sup></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-stanfordnlp-grape" role="tab" aria-controls="nav-profile" aria-selected="false">Grape <sup>Groovy</sup></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-stanfordnlp-gradle" role="tab" aria-controls="nav-profile" aria-selected="false">Gradle <sup>Kotlin</sup></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-stanfordnlp-sbt" role="tab" aria-controls="nav-contact" aria-selected="false">SBT <sup>Scala</sup></a>
</div>
</nav>
<div class="tab-content">
<div class="tab-pane fade show active" id="nav-stanfordnlp-maven" role="tabpanel">
<pre class="brush: xml, highlight: 4">
&lt;dependency&gt;
    &lt;groupId&gt;edu.stanford.nlp&lt;/groupId&gt;
    &lt;artifactId&gt;stanford-corenlp&lt;/artifactId&gt;
    &lt;version&gt;3.9.2&lt;/version&gt;
&lt;/dependency&gt;
&lt;dependency&gt;
    &lt;groupId&gt;org.apache.nlpcraft&lt;/groupId&gt;
    &lt;artifactId&gt;nlpcraft-stanford&lt;/artifactId&gt;
    &lt;version&gt;{{site.latest_version}}&lt;/version&gt;
&lt;/dependency&gt;
</pre>
</div>
<div class="tab-pane fade" id="nav-stanfordnlp-grape" role="tabpanel">
<pre class="brush: java">
@Grab ('edu.stanford.nlp:stanford-corenlp:3.9.2')
@Grab ('org.apache.nlpcraft:nlpcraft-stanford:{{site.latest_version}}')
</pre>
</div>
<div class="tab-pane fade" id="nav-stanfordnlp-gradle" role="tabpanel">
<pre class="brush: java">
dependencies {
    runtime group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.2'
    runtime group: 'org.apache.nlpcraft', name: 'nlpcraft-stanford', version: '{{site.latest_version}}'
}
</pre>
</div>
<div class="tab-pane fade" id="nav-stanfordnlp-sbt" role="tabpanel">
<pre class="brush: scala">
libraryDependencies += "edu.stanford.nlp" % "stanford-corenlp" % "3.9.2"
libraryDependencies += "org.apache.nlpcraft" % "nlpcraft-stanford" % "{{site.latest_version}}"
</pre>
</div>
</div>
<div class="bq warn">
Make sure to change the Stanford CoreNLP version <code>3.9.2</code> to the latest or required one.
</div>
<p>
Note that you can also <a target=_ href="https://stanfordnlp.github.io/CoreNLP/">download</a>
Stanford CoreNLP as a separate JAR file and add it to your
project classpath instead of using a build tool.
</p>
<h2 class="section-title">Base NLP Engine <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
You can set Stanford CoreNLP as a base NLP engine:
</p>
<ul>
<li>
Set configuration property <code>nlpcraft.nlpEngine=stanford</code>
</li>
<li>
Stanford CoreNLP library must be available <b>on both</b> the REST server and the data probe.
</li>
</ul>
<h2 class="section-title">Token Provider <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
Stanford CoreNLP can be used as a token provider independently from base NLP engine:
</p>
<ul>
<li>
Stanford CoreNLP library should <b>only</b> be available on the data probe.
</li>
</ul>
<p>
Stanford CoreNLP provides its own set of built-in elements.
Stanford CoreNLP token IDs have the form <code>stanford:xxx</code>, where <code>xxx</code> is the lower
case name of the named entity in Stanford CoreNLP, e.g. <code>stanford:person</code>, <code>stanford:location</code>,
etc.
</p>
<p>Configuration notes:</p>
<ul>
<li>
See Stanford CoreNLP Named Entity Recognition
<a target="_blank" href="https://stanfordnlp.github.io/CoreNLP/ner.html">documentation</a>
for more details on supported token types.
</li>
<li>
See <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a>
documentation for token properties.
</li>
<li>
Make sure to enable this token provider <code>stanford</code> in REST server configuration
using <code>nlpcraft.server.tokenProviders</code> property.
</li>
<li>
Make sure to also properly configure required tokens in your model configuration via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</section>
<section>
<img id="spacy" class="img-title" src="/images/spacy-logo.png" height="48px" alt="">
<p>
<a href="https://spacy.io">spaCy</a> is a free open-source library for Natural Language Processing in Python.
</p>
<h2 class="section-title">Base NLP Engine <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
N/A
</p>
<h2 class="section-title">Token Provider <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
<p>
spaCy provides its own set of built-in elements. NLPCraft integrates with spaCy via a local Python-based
REST server, <code>/src/main/python/spacy_proxy.py</code>. It is a very simple Flask-based implementation
that you can freely modify to change the spaCy models or the external attributes that are made available.
</p>
<p>
This is the entire source code for this local REST server:
</p>
<pre class="brush: python, highlight: [11, 29, 30, 58, 59]">
import urllib.parse

import spacy
from flask import Flask, request
from flask_restful import Resource, Api

#
# Add your own or modify spaCy libraries here.
# By default, the English model 'en_core_web_sm' is loaded.
#
nlp = spacy.load("en_core_web_sm")

app = Flask(__name__)
api = Api(app)


class Ner(Resource):
    @staticmethod
    def get():
        doc = nlp(urllib.parse.unquote_plus(request.args.get('text')))
        res = []

        for e in doc.ents:
            meta = {}

            # Change the following two lines to implement your own logic for
            # filling up meta object with custom user attributes. 'meta' should be a dictionary (JSON)
            # with types 'string:string'.
            for key in e._.span_extensions:
                meta[key] = e._.__getattr__(key)

            res.append(
                {
                    "text": e.text,
                    "from": e.start_char,
                    "to": e.end_char,
                    "ner": e.label_,
                    "vector": str(e.vector_norm),
                    "sentiment": str(e.sentiment),
                    "meta": meta
                }
            )

        return res


api.add_resource(Ner, '/spacy')

#
# Default endpoint is 'localhost:5002'.
#
# If the endpoint here is changed make sure to provide
# the same endpoint via configuration property 'nlpcraft.server.spacy.proxy.url',
# i.e. 'nlpcraft.server.spacy.proxy.url=myhost:1234'
#
if __name__ == '__main__':
    app.run(
        host="localhost",
        port='5002'
    )
</pre>
<p>
You need to start this REST server before you can use the spaCy integration in NLPCraft. Note that for
a production environment it is recommended to use a
<a target=_ href="https://en.wikipedia.org/wiki/Web_Server_Gateway_Interface">WSGI-based server</a> instead.
</p>
<p>
Comments:
</p>
<ul>
<li>
On line 11 you can add or change spaCy models to be loaded.
</li>
<li>
On lines 29-30 you can change how spans' external attributes are collected.
</li>
<li>
On lines 58-59 you can change the endpoint on which this REST server starts. Note that you
need to set the same endpoint on the NLPCraft REST server via the configuration property <code>nlpcraft.server.spacy.proxy.url</code>,
e.g. <code>nlpcraft.server.spacy.proxy.url=myhost:1234</code>.
</li>
</ul>
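<p>
To confirm that the proxy is reachable before pointing NLPCraft at it, you can query it directly. The following is a
minimal client sketch using only the Python standard library. It assumes the default endpoint
<code>localhost:5002</code> and the <code>/spacy</code> path from the listing above; the names <code>PROXY_URL</code>,
<code>build_url</code> and <code>fetch_entities</code> are hypothetical.
</p>

```python
import json
import urllib.parse
import urllib.request

# Assumed default endpoint of spacy_proxy.py; adjust if you changed host or port.
PROXY_URL = "http://localhost:5002/spacy"


def build_url(text, base=PROXY_URL):
    """URL-encode the text into the 'text' query parameter read by the proxy."""
    return base + "?" + urllib.parse.urlencode({"text": text})


def fetch_entities(text, base=PROXY_URL, timeout=5):
    """Call the proxy and return its JSON list of entity dictionaries."""
    with urllib.request.urlopen(build_url(text, base), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

<p>
With the proxy running, <code>fetch_entities("Apple opened an office in Paris")</code> should return a list of
dictionaries carrying the <code>text</code>, <code>from</code>, <code>to</code>, <code>ner</code>,
<code>vector</code>, <code>sentiment</code> and <code>meta</code> fields produced by the server code.
</p>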
<p>
spaCy token IDs have the form <code>spacy:xxx</code>, where <code>xxx</code> is the lower case name of the named entity
in spaCy APIs, e.g. <code>spacy:person</code>, <code>spacy:location</code>, etc.
</p>
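<p>
The naming rule can be illustrated with a short Python sketch. It assumes that the ID is formed by lowercasing the
label found in the <code>ner</code> field of a proxy response entry; the actual mapping performed by NLPCraft is not
shown on this page, and both function names are hypothetical.
</p>

```python
def to_token_id(provider, entity_label):
    """Build a token ID such as 'spacy:person' from a provider name and an entity label."""
    return f"{provider}:{entity_label.lower()}"


def token_ids(proxy_entities, provider="spacy"):
    """Map proxy response entries (dicts with a 'ner' field) to token IDs."""
    return [to_token_id(provider, e["ner"]) for e in proxy_entities]
```

<p>
The same rule applies to the other providers described on this page, e.g.
<code>to_token_id("stanford", "PERSON")</code> yields <code>stanford:person</code>.
</p>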
<p>
Configuration notes:
</p>
<ul>
<li>
See spaCy Named Entity Recognition
<a target="spacy" href="https://spacy.io/usage/linguistic-features#named-entities">documentation</a>
for more details on supported token types.
</li>
<li>
See <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCToken.html">NCToken</a>
documentation for token properties.
</li>
<li>
Make sure to enable this token provider <code>spacy</code> in REST server configuration
using <code>nlpcraft.server.tokenProviders</code> property.
</li>
<li>
Make sure to also properly configure required tokens in your model configuration via
<a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModelView.html#getEnabledBuiltInTokens()">NCModelView.getEnabledBuiltInTokens()</a> method.
</li>
</ul>
</section>
<section>
<img id="mysql" class="img-title" src="/images/mysql-logo.png" height="80px" alt="">
<p>
You can install and use MySQL as a system database for the REST server instead of the built-in
distributed SQL storage from Apache Ignite that is used by default. Add the following dependency to your project:
</p>
<nav>
<div class="nav nav-tabs" role="tablist">
<a class="nav-item nav-link active" data-toggle="tab" href="#nav-mysql-maven" role="tab">Maven <img src="/images/java2-h20.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-mysql-grape" role="tab">Grape <img src="/images/groovy-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-mysql-gradle" role="tab">Gradle <img src="/images/kotlin-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-mysql-sbt" role="tab">SBT <img src="/images/scala-logo-h16.png" alt=""></a>
</div>
</nav>
<div class="tab-content">
<div class="tab-pane fade show active" id="nav-mysql-maven" role="tabpanel">
<pre class="brush: xml, highlight: 4">
&lt;dependency&gt;
    &lt;groupId&gt;mysql&lt;/groupId&gt;
    &lt;artifactId&gt;mysql-connector-java&lt;/artifactId&gt;
    &lt;version&gt;8.0.15&lt;/version&gt;
&lt;/dependency&gt;
</pre>
</div>
<div class="tab-pane fade" id="nav-mysql-grape" role="tabpanel">
<pre class="brush: java">
@Grab ('mysql:mysql-connector-java:8.0.15')
</pre>
</div>
<div class="tab-pane fade" id="nav-mysql-gradle" role="tabpanel">
<pre class="brush: java">
dependencies {
    runtime group: 'mysql', name: 'mysql-connector-java', version: '8.0.15'
}
</pre>
</div>
<div class="tab-pane fade" id="nav-mysql-sbt" role="tabpanel">
<pre class="brush: scala">
libraryDependencies += "mysql" % "mysql-connector-java" % "8.0.15"
</pre>
</div>
</div>
<p>
Comments:
</p>
<ul>
<li>
Make sure to change <code>8.0.15</code> version to the latest or required one.
</li>
<li>
Update configuration property <code>nlpcraft.server.database.jdbc</code>
with required JDBC driver class and JDBC URL.
</li>
<li>
Use scripts from <code>sql/mysql</code> folder to create database and initialize DB schema.
</li>
<li>
Note that you can also <a target=_ href="https://dev.mysql.com/downloads/connector/j">download</a> the MySQL
JDBC driver as a separate JAR file and add it to your
project classpath instead of using a build tool.
</li>
</ul>
</section>
<section>
<img id="postgres" class="img-title" src="/images/postgresql-logo.png" height="80px" alt="">
<p>
You can install and use PostgreSQL as a system database for the REST server instead of the built-in
distributed SQL storage from Apache Ignite that is used by default. Add the following dependency to your project:
</p>
<nav>
<div class="nav nav-tabs" role="tablist">
<a class="nav-item nav-link active" data-toggle="tab" href="#nav-postgres-maven" role="tab">Maven <img src="/images/java2-h20.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-postgres-grape" role="tab">Grape <img src="/images/groovy-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-postgres-gradle" role="tab">Gradle <img src="/images/kotlin-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-postgres-sbt" role="tab">SBT <img src="/images/scala-logo-h16.png" alt=""></a>
</div>
</nav>
<div class="tab-content">
<div class="tab-pane fade show active" id="nav-postgres-maven" role="tabpanel">
<pre class="brush: xml, highlight: 4">
&lt;dependency&gt;
    &lt;groupId&gt;org.postgresql&lt;/groupId&gt;
    &lt;artifactId&gt;postgresql&lt;/artifactId&gt;
    &lt;version&gt;42.2.5&lt;/version&gt;
&lt;/dependency&gt;
</pre>
</div>
<div class="tab-pane fade" id="nav-postgres-grape" role="tabpanel">
<pre class="brush: java">
@Grab ('org.postgresql:postgresql:42.2.5')
</pre>
</div>
<div class="tab-pane fade" id="nav-postgres-gradle" role="tabpanel">
<pre class="brush: java">
dependencies {
    runtime group: 'org.postgresql', name: 'postgresql', version: '42.2.5'
}
</pre>
</div>
<div class="tab-pane fade" id="nav-postgres-sbt" role="tabpanel">
<pre class="brush: scala">
libraryDependencies += "org.postgresql" % "postgresql" % "42.2.5"
</pre>
</div>
</div>
<p>
Comments:
</p>
<ul>
<li>
Make sure to change <code>42.2.5</code> version to the latest or required one.
</li>
<li>
Update configuration property <code>nlpcraft.server.database.jdbc</code>
with required JDBC driver class and JDBC URL.
</li>
<li>
Use scripts from <code>sql/postgres</code> folder to create database and initialize DB schema.
</li>
<li>
Note that you can also <a target=_ href="https://jdbc.postgresql.org/">download</a> the PostgreSQL
JDBC driver as a separate JAR file and add it to your
project classpath instead of using a build tool.
</li>
</ul>
</section>
<section>
<img id="oracle" class="img-title" src="/images/oracle-logo.png" width="200px" alt="">
<p>
You can install and use Oracle RDBMS as a system database for the REST server instead of the built-in
distributed SQL storage from Apache Ignite that is used by default. Add the following dependency to your project:
</p>
<nav>
<div class="nav nav-tabs" role="tablist">
<a class="nav-item nav-link active" data-toggle="tab" href="#nav-oracle-maven" role="tab">Maven <img src="/images/java2-h20.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-oracle-grape" role="tab">Grape <img src="/images/groovy-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-oracle-gradle" role="tab">Gradle <img src="/images/kotlin-h18.png" alt=""></a>
<a class="nav-item nav-link" data-toggle="tab" href="#nav-oracle-sbt" role="tab">SBT <img src="/images/scala-logo-h16.png" alt=""></a>
</div>
</nav>
<div class="tab-content">
<div class="tab-pane fade show active" id="nav-oracle-maven" role="tabpanel">
<pre class="brush: xml, highlight: 4">
&lt;dependency&gt;
    &lt;groupId&gt;org.oracle&lt;/groupId&gt;
    &lt;artifactId&gt;ojdbc14&lt;/artifactId&gt;
    &lt;version&gt;10.2.0.4.0&lt;/version&gt;
&lt;/dependency&gt;
</pre>
</div>
<div class="tab-pane fade" id="nav-oracle-grape" role="tabpanel">
<pre class="brush: java">
@Grab ('org.oracle:ojdbc14:10.2.0.4.0')
</pre>
</div>
<div class="tab-pane fade" id="nav-oracle-gradle" role="tabpanel">
<pre class="brush: java">
dependencies {
    runtime group: 'org.oracle', name: 'ojdbc14', version: '10.2.0.4.0'
}
</pre>
</div>
<div class="tab-pane fade" id="nav-oracle-sbt" role="tabpanel">
<pre class="brush: scala">
libraryDependencies += "org.oracle" % "ojdbc14" % "10.2.0.4.0"
</pre>
</div>
</div>
<p>
Comments:
</p>
<ul>
<li>
Make sure to change <code>10.2.0.4.0</code> version to the latest or required one.
</li>
<li>
Update configuration property <code>nlpcraft.server.database.jdbc</code>
with required JDBC driver class and JDBC URL.
</li>
<li>
Use scripts from <code>sql/oracle</code> folder to create database and initialize DB schema.
</li>
</ul>
</section>
<section>
<img id="gridgain" class="img-title" src="/images/gridgain-logo.png" width="200px" alt="">
<p>
The NLPCraft server runs on top of <a target="_" href="https://ignite.apache.org/">Apache Ignite</a>.
<a target="_" href="https://www.gridgain.com/">GridGain Systems</a> develops an enterprise in-memory computing
platform that is based on Apache Ignite. GridGain also develops the <a target="_" href="https://www.gridgain.com/products/software/control-center">GridGain Control Center</a>, which supports Apache Ignite
and is available for free to Apache Ignite users. To use GridGain Control Center to manage and monitor
NLPCraft server internals, you need to have <a target="_" href="https://www.gridgain.com/resources/download#controlcenter">GridGain Web Agent</a> installed and available on the classpath of the NLPCraft server.
</p>
<p>
NLPCraft <code>pom.xml</code> comes with necessary dependencies that are located in a separate
<code>gridgain-agent</code> Maven profile. To enable GridGain Web Agent you need to manually enable this Maven profile
when building NLPCraft from source code.
</p>
<div class="bq warn">
<p><b>GridGain Control Center</b></p>
<p>
Note that GridGain Control Center is commercial software with free access for Apache Ignite. Its integration is not included
in the standard Apache NLPCraft release. You need to manually enable the special <code>gridgain-agent</code>
Maven profile or <a target="_" href="https://www.gridgain.com/resources/download#controlcenter">download</a> and install GridGain Web Agent manually.
</p>
</div>
</section>
</div>
<div class="col-md-2 third-column">
<ul class="side-nav">
<li class="side-nav-title">On This Page</li>
<li><a href="#nlpcraft">NLPCraft</a></li>
<li><a href="#opennlp">OpenNLP</a></li>
<li><a href="#google">Google</a></li>
<li><a href="#stanford">Stanford CoreNLP</a></li>
<li><a href="#spacy">spaCy</a></li>
<li><a href="#mysql">MySQL</a></li>
<li><a href="#postgres">PostgreSQL</a></li>
<li><a href="#oracle">Oracle</a></li>
<li><a href="#gridgain">GridGain</a></li>
{% include quick-links.html %}
</ul>
</div>