| --- |
| active_crumb: SQL Model <code><sub>ex</sub></code> |
| layout: documentation |
| id: sql_model |
| fa_icon: fa-cube |
| --- |
| |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one or more |
| contributor license agreements. See the NOTICE file distributed with |
| this work for additional information regarding copyright ownership. |
| The ASF licenses this file to You under the Apache License, Version 2.0 |
| (the "License"); you may not use this file except in compliance with |
| the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, software |
| distributed under the License is distributed on an "AS IS" BASIS, |
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
| See the License for the specific language governing permissions and |
| limitations under the License. |
| --> |
| |
| <div class="col-md-8 second-column example"> |
| <section id="overview"> |
| <h2 class="section-title">Overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| This example demonstrates a relatively complete implementation of natural language interface to a |
| SQL database. This is a non-trivial example, spanning over a thousand lines of code, and it can |
| act as a basis for production-ready implementation. |
| </p> |
| <p> |
| Note that a significant part of the implementation is not directly related to NLPCraft but rather deals |
| with SQL statement construction. We specifically decided not to use any 3rd party libraries for it to |
| show what can be done "from scratch". |
| </p> |
| <p> |
| <b>Complexity:</b> <span class="complexity-three-star"><i class="fas fa-gem"></i> <i class="fas fa-gem"></i> <i class="fas fa-gem"></i></span><br/> |
| <span class="ex-src">Source code: <a target="github" href="https://github.com/apache/incubator-nlpcraft/tree/master/nlpcraft-examples/sql">GitHub <i class="fab fa-fw fa-github"></i></a><br/></span> |
| <span class="ex-review-all">Review: <a target="github" href="https://github.com/apache/incubator-nlpcraft/tree/master/nlpcraft-examples">All Examples at GitHub <i class="fab fa-fw fa-github"></i></a></span> |
| </p> |
| </section> |
| <section id="background"> |
| <h2 class="section-title">Background <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Many of modern existing natural language-to-SQL implementations use variations of |
| deep learning approach where you first train the neural network on a pre-created training set and |
| then get to use the trained network to provide (infer) probabilistic answers for the new input sentences. Although the latest |
| natural language-to-SQL attempts to implement this approach are getting into 90% percentile of accuracy - they remain largely |
| unusable for the vast swath of enterprise applications where non-deterministic nature of such systems |
| renders them practically useless. The apparent problem with non-deterministic systems like that is that the user |
| never knows whether a given answer is correct or incorrect. In these use cases users can't tolerate the fact that |
| in 5-10% of the cases the answer will be invalid. Try that for your HR or revenue reporting system, |
| prescriptive analytics systems, and so on... |
| </p> |
| <p> |
| It's important to note, however, that for many non-mission-critical systems such non-determinism does |
| not pose a significant problem. We happily accept such imprecision when asking for direction on our |
| mobile devices, unlock our phones using face or fingerprint recognition, when performing sentiment |
| analysis or trying to detect faces of our friends across thousands of photographs. Cost of retries, |
| as well as the cost of initial errors, is insignificant in these cases. The same cost, however, in many business |
| applications can be too significant to tolerate. |
| </p> |
| <p> |
| As you may have learned by now, NLPCraft uses a <i>fully deterministic approach</i> in an attempt to |
| match the user input against a defined set of intents. If it finds the matching intent - it guarantees that |
| match for a given intent. If no matching intent can be found - it returns the negative |
| result without any ambiguity. In other words, if the answer is given, it is deterministically guaranteed to be correct. |
| Another positive side-effect of this approach is the fact that such matching logic is traceable, i.e. |
| the user can see why given user input was matched against a certain intent (and not any other). Such |
| traceability of the comprehension logic (or explainability vs "black box" approach from deep |
| learning techniques) is often critical for many real-life business applications. |
| </p> |
| <p> |
| On the flip side, one of the downsides of such an approach in general is the need to have a detailed, |
| domain-specific model |
| for each SQL database (when dealing with SQL databases). Building such a |
| model can be a non-trivial and time consuming experience. That is where NLPCraft brings a lot of built-in tooling |
| to simplify this task dramatically. |
| </p> |
| <div class="bq info"> |
| <b>Source Code</b> |
| <p> |
| Due to size of this example the entire source code for it can be found on <a target="github" href="https://github.com/apache/incubator-nlpcraft/tree/master/nlpcraft-examples/sql">GitHub</a>. |
| </p> |
| </div> |
| </section> |
| <section id="sql"> |
| <h2 class="section-title">Sample Database <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| We are going to be building natural language interface against slightly modified "Northwind" sample database |
| from <a target="_" href="https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/sql/linq/downloading-sample-databases#get-the-northwind-sample-database-for-sql-server">Microsoft SQL Server</a>: |
| </p> |
| <figure> |
| <img class="img-fluid" src="/images/sql_example_model.png" alt=""> |
| <figcaption><b>Fig 1.</b> SQL Schema</figcaption> |
| </figure> |
| <p> |
| You can find SQL script creating this database and populating it with the sample data at |
| <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/resources/northwind.sql" target="github"><code>db/northwind.sql</code></a> |
| file in the root of the example. |
| </p> |
| </section> |
| <section id="h2"> |
| <h2 class="section-title">H2 Database <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| In our example we are using H2 database instance. For convenience, example provides a command line application |
| <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/java/org/apache/nlpcraft/examples/sql/db/SqlServer.scala" target="github"><code>db/SqlServer.scala</code></a> |
| that automatically starts local H2 database instance with default configuration (localhost on port 9092) and initializes it |
| using <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/resources/northwind.sql" target="github"><code>db/northwind.sql</code></a> script. |
| </p> |
| </section> |
| <section id="model"> |
| <h2 class="section-title">Data Model <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Creating data model manually for the existing SQL schema can be a daunting task. NLPCraft provides the tool |
| that can scan SQL schema and create working stub of such model automatically. All you have to do then is to |
| add any necessary modifications to it. |
| </p> |
| <p> |
| <a href="/tools/sql_model_gen.html">SQL Model Generation</a> is a Java-based utility that takes |
| JDBC configuration, reads database schema using it and creates initial JSON or YAML stub for the |
| data model. This stub then can be used as is or be extended further. |
| </p> |
| <p> |
| Examples specifically comes with two pre-generated files: |
| </p> |
| <ul> |
| <li> |
| <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/resources/sql_model_init.yaml" target="github"><code>sql_model_init.yaml</code></a> - the initial file that was generated straight out of |
| SQL Model Generator (see above for the instructions). |
| </li> |
| </ul> |
| </section> |
| <section id="impl"> |
| <h2 class="section-title">Implementation <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Implementation mainly consists of these files: |
| </p> |
| <ul> |
| <li> |
| <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/java/org/apache/nlpcraft/examples/sql/SqlModel.scala" target="github"><code>SqlModel.scala</code></a> - the code behind the data model that loads YAML-defined model and |
| defines all intents. |
| </li> |
| <li> |
| <a href="https://github.com/apache/incubator-nlpcraft/blob/master/nlpcraft-examples/sql/src/main/java/org/apache/nlpcraft/examples/sql/db/SqlBuilder.scala" target="github"><code>db/SqlBuilder.scala</code></a> - the main utility that takes object model provided by |
| <a href="/tools/sql_model_gen.html">SQL Model Generation</a> |
| and builds a SQL query to execute. Note that we elected to build this functionality |
| from scratch to illustrate how it can be done. You are free, of course, to use many |
| of the existing libraries to help achieve this goal. |
| </li> |
| </ul> |
| </section> |
| <section id="build_project"> |
| <h2 class="section-title">Build Project <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Assuming that project is located in <code>~/SqlModel</code> folder - let's go to into this directory and |
| run the Maven build: |
| </p> |
| <pre class="brush: bash"> |
| $ cd ~/SqlModel |
| $ mvn clean package |
| </pre> |
| <p> |
| At this stage we have our project built and we are ready to start testing. |
| </p> |
| </section> |
| <section id="start_server"> |
| <h2 class="section-title">Start Server <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Run the following command to start local REST server, if it hasn't been started already, from the NLPCraft installation directory: |
| </p> |
| <nav> |
| <div class="nav nav-tabs" role="tablist"> |
| <a class="nav-item nav-link active" data-toggle="tab" href="#nav-srv-cmd" role="tab">Command</a> |
| <a class="nav-item nav-link" data-toggle="tab" href="#nav-srv-out" role="tab">Output <i class="fa fa-desktop output"></i></a> |
| </div> |
| </nav> |
| <div class="tab-content"> |
| <div class="tab-pane fade show active" id="nav-srv-cmd" role="tabpanel"> |
| <pre class="brush: bash"> |
| $ bin/nlpcraft.sh start-server |
| </pre> |
| </div> |
| <div class="tab-pane fade show" id="nav-srv-out" role="tabpanel"> |
| <p></p> |
| <p> |
| <img class="img-fluid" alt="" src="/images/server-fig1.png"> |
| </p> |
| </div> |
| </div> |
| <p> |
| <b>NOTES:</b> |
| </p> |
| <ul> |
| <li> |
| <i style="color: #F39C12" class="fa fa-exclamation-triangle"></i> REST server is a "fire-and-forget" component |
| that you generally need to start it only once for this and any other examples. |
| </li> |
| <li> |
| Run <code class="script">bin/nlpcraft.sh help --cmd=start-server</code> to get a full help on this command. |
| </li> |
| <li> |
| <a href="/tools/script.html">NLPCraft CLI</a> is available as <code>nlpcraft.sh</code> for |
| <i class="fab fa-fw fa-linux"></i> and <code>nlpcraft.cmd</code> |
| for <i class="fab fa-fw fa-windows"></i>. |
| </li> |
| </ul> |
| </section> |
| <section id="testing"> |
| <h2 class="section-title">Testing <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Part of the <a href="/tools/test_framework.html">test framework</a>, the auto-validator class <a |
| target="javadoc" |
| href="/apis/latest/org/apache/nlpcraft/model/tools/test/NCTestAutoModelValidator.html">NCTestAutoModelValidator</a> takes one or more model IDs |
| (or class names) and performs validation. Validation consists of starting an <a href="/tools/embedded_probe.html">embedded probe</a> with a given model, |
| scanning for <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentSample.html">@NCIntentSample</a> and |
| <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCIntentSampleRef.html">@NCIntentSampleRef</a> annotations |
| and their corresponding callback methods, submitting each sample input |
| sentences from these annotations and checking that resulting intent matches the intent the sample was attached to. |
| Note that auto-testing does not require any additional code to be written - the class gathers all required information from the model |
| itself. |
| </p> |
| <p> |
| As always, you can launch model auto-validator as any other Java class but we'll use NLPCraft CLI |
| to do it more conveniently: |
| </p> |
| <pre class="brush: bash"> |
| $ bin/nlpcraft.sh test-model --cp=~/sql/target/classes --mdls=demo.SqlModel |
| </pre> |
| <p> |
| <b>NOTES:</b> |
| </p> |
| <ul> |
| <li> |
| Run <code class="script">bin/nlpcraft.sh help --cmd=test-model</code> to get a full help on this command. |
| </li> |
| <li> |
| Note that you can use <code>retest-model</code> command in REPL mode to re-run the last model test |
| avoiding the retyping of all required parameters. |
| </li> |
| <li> |
| <a href="/tools/script.html">NLPCraft CLI</a> is available as <code>nlpcraft.sh</code> for |
| <i class="fab fa-fw fa-linux"></i> and <code>nlpcraft.cmd</code> |
| for <i class="fab fa-fw fa-windows"></i>. |
| </li> |
| </ul> |
| </section> |
| <section id="rinse"> |
| <h2 class="section-title">Rinse <span class="amp">&</span> Repeat <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| Typical development cycle consists of: |
| </p> |
| <ul> |
| <li> |
| <a href="#model">Modifying the model</a> |
| </li> |
| <li> |
| <a href="#build_project">Re-building the project</a> |
| </li> |
| <li> |
| <a href="#testing">Re-running the test</a> |
| </li> |
| </ul> |
| <p> |
| All of these operations can be performed from <a href="/tools/script.html">NLPCraft CLI</a> in REPL mode or from any IDE. |
| </p> |
| <p> |
| NOTE: you don't need to restart REST server every time - it only needs to be started once. |
| </p> |
| </section> |
| <section> |
| <h2 class="section-title">Done! 👌 <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2> |
| <p> |
| You've created SQL model, started the REST server and tested this model using the built-in test framework. |
| </p> |
| </section> |
| </div> |
| <div class="col-md-2 third-column"> |
| <ul class="side-nav"> |
| <li class="side-nav-title">On This Page</li> |
| <li><a href="#overview">Overview</a></li> |
| <li><a href="#background">Background</a></li> |
| <li><a href="#sql">Sample Database</a></li> |
| <li><a href="#model">Data Model</a></li> |
| <li><a href="#impl">Implementation</a></li> |
| <li><a href="#build_project">Build Project</a></li> |
| <li><a href="#start_server">Start Server</a></li> |
| <li><a href="#testing">Testing</a></li> |
| <li><a href="#rinse">Rinse <span class="amp">&</span> Repeat</a></li> |
| {% include quick-links.html %} |
| </ul> |
| </div> |
| |
| |
| |
| |
| |
| |