Posted to commits@nlpcraft.apache.org by se...@apache.org on 2022/10/22 05:53:58 UTC

[incubator-nlpcraft-website] branch NLPCRAFT-513 updated (a75ab20 -> 5cd43f3)

This is an automated email from the ASF dual-hosted git repository.

sergeykamov pushed a change to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git


    from a75ab20  WIP.
     add 2542576  Update .gitignore
     new c743b27  Merge branch 'master' into NLPCRAFT-513
     new 5cd43f3  WIP.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .gitignore                         |   1 +
 _includes/left-side-menu.html      |  14 +++
 docs.html => api-review.html       |  31 +++---
 built-components.html              |  31 +++++-
 404.html => custom-components.html |  24 ++++-
 docs.html                          | 188 +++++++++++++++++--------------------
 6 files changed, 166 insertions(+), 123 deletions(-)
 copy docs.html => api-review.html (86%)
 copy 404.html => custom-components.html (61%)


[incubator-nlpcraft-website] 01/02: Merge branch 'master' into NLPCRAFT-513

Posted by se...@apache.org.

sergeykamov pushed a commit to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git

commit c743b2726fcca4359fe544b872a0afcf662d1505
Merge: a75ab20 2542576
Author: skhdl <sk...@gmail.com>
AuthorDate: Sat Oct 22 09:14:43 2022 +0400

    Merge branch 'master' into NLPCRAFT-513

 .gitignore | 1 +
 1 file changed, 1 insertion(+)


[incubator-nlpcraft-website] 02/02: WIP.

Posted by se...@apache.org.

sergeykamov pushed a commit to branch NLPCRAFT-513
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git

commit 5cd43f38d7a15316788b10c919b906559c9603a8
Author: skhdl <sk...@gmail.com>
AuthorDate: Sat Oct 22 09:53:49 2022 +0400

    WIP.
---
 _includes/left-side-menu.html |  14 ++++
 docs.html => api-review.html  |  31 +++----
 built-components.html         |  31 ++++++-
 custom-components.html        |  40 +++++++++
 docs.html                     | 188 +++++++++++++++++++-----------------------
 5 files changed, 186 insertions(+), 118 deletions(-)

diff --git a/_includes/left-side-menu.html b/_includes/left-side-menu.html
index 42e18e4..93907e6 100644
--- a/_includes/left-side-menu.html
+++ b/_includes/left-side-menu.html
@@ -24,6 +24,13 @@
         <a href="/docs.html">Overview</a>
         {% endif %}
     </li>
+    <li>
+        {% if page.id == "api-review" %}
+        <a class="active" href="/api-review.html">API review</a>
+        {% else %}
+        <a href="/api-review.html">API review</a>
+        {% endif %}
+    </li>
     <li>
         {% if page.id == "built-components" %}
         <a class="active" href="/built-components.html">Built components</a>
@@ -31,6 +38,13 @@
         <a href="/built-components.html">Built components</a>
         {% endif %}
     </li>
+    <li>
+        {% if page.id == "custom-components" %}
+        <a class="active" href="/custom-components.html">Custom components</a>
+        {% else %}
+        <a href="/custom-components.html">Custom components</a>
+        {% endif %}
+    </li>
     <li>
         {% if page.id == "installation" %}
         <a class="active" href="/installation.html">Installation</a>
diff --git a/docs.html b/api-review.html
similarity index 86%
copy from docs.html
copy to api-review.html
index fb19d6a..5532f56 100644
--- a/docs.html
+++ b/api-review.html
@@ -23,24 +23,15 @@ id: overview
 
 <div class="col-md-8 second-column">
     <section id="overview">
-        <h2 class="section-title">Overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
-        <p>
-            Apache NLPCraft is an <a target=_blank href="https://www.apache.org/licenses/">open source</a> Scala library for adding a natural language interface to modern applications.
-            It enables people to interact with your products using voice or text.
-            Its design is based on advanced <a href="/intent-matching.html">Intent Definition Language</a> (IDL) for defining non-trivial intents and
-            a fully deterministic intent matching algorithm for the input utterances.
-        </p>
-        <p>
-            One of the key features of NLPCraft is its use of <a href="/intent-matching.html">IDL</a> coupled with deterministic intent matching that are tailor made for
-            <em>domain-specific</em> natural language interface. This design doesn't force developers to use direct deep learning
-            approach with time consuming corpora development and model training - resulting in much a
-            <em>simpler <span class="amp">&</span> faster</em> implementation.
-        </p>
+        <h2 class="section-title">Library API review <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
 
         <p>
             The NlpCraft library contains two base elements: <code>Model</code> and <code>Client</code>.
         </p>
+    </section>
 
+    <section id="model-client">
+        <h2 class="section-title">Model and client <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
         <ul>
             <li>
                 <code>Model</code> is a domain-specific object responsible for interpreting user input. A model contains intents, defined via NlpCraft IDL, with related code callbacks. An intent is a user-defined callback together with a rule that determines when this callback should be called. The rule is most often a template based on the expected set of entities in user input, but it can be more flexible.
@@ -79,7 +70,12 @@ id: overview
                 <code>Pipeline</code> can be based on standard and custom user defined components.
             </li>
         </ul>
-
+    </section>
+    <section id="model-configuration">
+        <h2 class="section-title">Model configuration <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+    </section>
+    <section id="model-pipeline">
+        <h2 class="section-title">Model pipeline <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
         <p>
              Before looking at pipeline elements more thoroughly, let's start with terminology.
         </p>
@@ -148,11 +144,18 @@ id: overview
             This flexible system lets you create pipelines for any language. You can combine NlpCraft predefined components, write your own, and easily reuse custom components.
         </p>
     </section>
+    <section id="model-intents">
+        <h2 class="section-title">Model intents and callbacks <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+    </section>
 </div>
 <div class="col-md-2 third-column">
     <ul class="side-nav">
         <li class="side-nav-title">On This Page</li>
         <li><a href="#overview">Overview</a></li>
+        <li><a href="#model-client">Model and client</a></li>
+        <li><a href="#model-configuration">Model configuration</a></li>
+        <li><a href="#model-pipeline">Model pipeline</a></li>
+        <li><a href="#model-intents">Model intents and callbacks</a></li>
         {% include quick-links.html %}
     </ul>
 </div>
diff --git a/built-components.html b/built-components.html
index a29c329..a98a9cb 100644
--- a/built-components.html
+++ b/built-components.html
@@ -154,25 +154,27 @@ id: overview
             <li><code>NCEnBracketsTokenEnricher</code></li>
         </ul>
     </section>
+
     <section id="semantic">
         <h2 class="section-title">Semantic enrichers <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
     </section>
+
     <section id="examples">
         <h2 class="section-title">Examples <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
 
-        <p>Typical usage example:</p>
+        <p><b>Simple example</b>:</p>
 
         <pre class="brush: scala, highlight: []">
             val pipeline = new NCPipelineBuilder().withSemantic("en", "lightswitch_model.yaml").build
         </pre>
         <ul>
             <li>
-                It defines pipeline with all default English language components and one semantic entity parser with \
+                It defines a pipeline with all default English language components and one semantic entity parser with the
                 model defined in <code>lightswitch_model.yaml</code>.
             </li>
         </ul>
 
-        <p>Another example:</p>
+        <p><b>Example with pipeline configured by built components:</b></p>
 
         <pre class="brush: scala, highlight: [2, 6, 7, 12, 13, 14, 15]">
             val pipeline =
@@ -220,6 +222,29 @@ id: overview
                 <code>Line 15</code> defines pipeline building.
             </li>
         </ul>
+
+        <p><b>Example with pipeline configured by custom components:</b></p>
+
+        <pre class="brush: scala, highlight: []">
+            val pipeline =
+                new NCPipelineBuilder().
+                    withTokenParser(new NCFrTokenParser()).
+                    withTokenEnricher(new NCFrLemmaPosTokenEnricher()).
+                    withTokenEnricher(new NCFrStopWordsTokenEnricher()).
+                    withEntityParser(new NCFrSemanticEntityParser("lightswitch_model_fr.yaml")).
+                    build
+        </pre>
+
+        <ul>
+            <li>
+                This pipeline is built for the French language. All of its components are custom components.
+                You can find more information in the example descriptions:
+                <a href="examples/light_switch_fr.html">Light Switch FR</a> and
+                <a href="examples/light_switch_ru.html">Light Switch RU</a>.
+                Note that these custom components are mostly wrappers around existing solutions and
+                need to be prepared only once, when you start working with a new language.
+            </li>
+        </ul>
     </section>
 </div>
 <div class="col-md-2 third-column">
diff --git a/custom-components.html b/custom-components.html
new file mode 100644
index 0000000..71cced3
--- /dev/null
+++ b/custom-components.html
@@ -0,0 +1,40 @@
+---
+active_crumb: Docs
+layout: documentation
+id: overview
+---
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<div class="col-md-8 second-column">
+    <section id="overview">
+        <h2 class="section-title">Custom components <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+    </section>
+</div>
+<div class="col-md-2 third-column">
+    <ul class="side-nav">
+        <li class="side-nav-title">On This Page</li>
+        <li><a href="#overview">Overview</a></li>
+
+        {% include quick-links.html %}
+    </ul>
+</div>
+
+
+
+
diff --git a/docs.html b/docs.html
index fb19d6a..78e2e38 100644
--- a/docs.html
+++ b/docs.html
@@ -25,10 +25,12 @@ id: overview
     <section id="overview">
         <h2 class="section-title">Overview <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
         <p>
-            Apache NLPCraft is an <a target=_blank href="https://www.apache.org/licenses/">open source</a> Scala library for adding a natural language interface to modern applications.
-            It enables people to interact with your products using voice or text.
-            Its design is based on advanced <a href="/intent-matching.html">Intent Definition Language</a> (IDL) for defining non-trivial intents and
-            a fully deterministic intent matching algorithm for the input utterances.
+            Apache NLPCraft is a JVM-based <a target=_blank href="https://www.apache.org/licenses/">open source</a> library
+            for adding a natural language interface to modern applications. It enables people to interact with your products using voice or text. NLPCraft can connect with
+            any private or public data source, and has no hardware or software lock-ins. Its design is based on an advanced
+            <a href="/intent-matching.html">Intent Definition Language</a> (IDL) for defining non-trivial intents and a fully deterministic intent matching
+            algorithm for the input utterances. You can build intents for NLPCraft using any JVM-based language such as Java, Scala, Kotlin, or Groovy. NLPCraft
+            exposes REST APIs for integration with end-user applications.
         </p>
         <p>
             One of the key features of NLPCraft is its use of <a href="/intent-matching.html">IDL</a> coupled with deterministic intent matching that are tailor made for
@@ -36,123 +38,107 @@ id: overview
             approach with time-consuming corpora development and model training, resulting in a much
             <em>simpler <span class="amp">&</span> faster</em> implementation.
         </p>
-
         <p>
-            NlpCraft library contains two base elements: <code>Model</code> and <code>Client</code>.
+            Another key aspect of NLPCraft is its initial focus on processing the English language. Although it may sound
+            counterintuitive, this narrower initial focus enables NLPCraft to deliver unprecedented ease of use combined with
+            unparalleled comprehension capabilities for English input out of the box. It avoids academic, watered-down functionality and overly
+            complicated configuration and usage, following the project's <em>"built for engineers by engineers"</em> ethos.
+            The English language is spoken by more
+            than a billion people on this planet and is the de facto standard global language of business and commerce.
         </p>
-
-        <ul>
-            <li>
-                <code>Model</code> is domain specific object which responsible for user input interpretation. Model contains intents, defined via NlpCraft IDL with related code callbacks. Intent is user defined callback and rule, according to which this callback should be called. Rule is most often some template, based on expected set of entities in user input, but it can be more flexible.
-            </li>
-
-            <li>
-                <code>Client</code> is object, which allows to communicate with given model. Main methods are user input processing and control of communication session.
-            </li>
-        </ul>
-
-        <p>Typical part of code:</p>
-
-        <pre class="brush: scala, highlight: []">
-              // Prepares domain model.
-              val mdl = new CustomNlpModel()
-
-              // Prepares client for given model.
-              val client = new NCModelClient(mdl)
-
-              // Sends text request to model by user ID "userId".
-              val result = client.ask("Some user command", "userId")
-
-              // Clears dialog session for user with ID "userId".
-              client.clearDialog("userId")
-        </pre>
-
         <p>
-            Model definition includes two parts:
+            So, how does it work in a nutshell?
         </p>
-        <ul>
-            <li>
-                <code>Configuration</code>. Static configuration parameters including name, version, etc.
-            </li>
-            <li>
-                <code>Pipeline</code>. Most important component, which defines user input processing chain.
-                <code>Pipeline</code> can be based on standard and custom user defined components.
-            </li>
-        </ul>
-
         <p>
-             Before looking at pipeline elements more throughly, let's start with terminology.
+            When using NLPCraft you will be dealing with three main components:
         </p>
-
         <ul>
-            <li>
-                <code>Token</code>. It is simple string, part of user input, which split according to some rules, for instance by spaces and some additional conditions, which depends on language and some expectations.
-                So user input "<b>Where is it?</b>" contains four tokens: "<b>Where</b>", "<b>is</b>", "<b>it</b>", "<b>?</b>".
-            </li>
-            <li>
-                <code>Entity</code>. According to wikipedia, named entity is a real-world object, such as a person, location, organization, product, etc., that can be denoted with a proper name. It can be abstract or have a physical existence. Each entity can contain one or more tokens.
-            </li>
-            <li>
-                <code>Variant</code>. List of entities. Potentially, each token can be recognized as different entities, so user input can be processed as set of variants. For example user input "Mercedes" can be processed as 2 variants, both of them contains single element list of entities: car brand or Spanish family name.
-            </li>
+            <li><a href="#data-model">Data model</a></li>
+            <li><a href="#data-probe">Data probe</a></li>
+            <li><a href="#server">REST Server</a></li>
         </ul>
-
+        <figure>
+            <img class="img-fluid" src="/images/homepage-fig1.1.png" alt="">
+            <figcaption><b>Fig 1.</b> NLPCraft Architecture</figcaption>
+        </figure>
+    </section>
+    <section id="data-model">
+        <h2 class="section-title">Data Model <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
         <p>
-            Back to pipeline. Pipeline should be created based in following components:
+            NLPCraft employs a <em>model-as-a-code</em> approach where everything you do in NLPCraft is part of your source code. A data model is simply an implementation of the
+            <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/NCModel.html">NCModel</a> Java interface that
+            can be developed using any JVM programming language such as Java, Scala, Kotlin, or Groovy.
+            A data model defines named entities, various configuration properties, and intents to interpret user input. Model-as-a-code natively supports
+            any software lifecycle tools and frameworks in the Java ecosystem.
         </p>
-        <ul>
-            <li>
-                <code>Token parser</code>. Mandatory NLP component, it is required for parsing plain text, user input, and split this text into tokens  list. NlpCraft provides default EN implementation of token parser. Also, project contain various examples for FR and RU languages.
-            </li>
-            <li>
-                <code>Tokens enrichers</code> optional list. Tokens enricher is component which allows to add additional properties to prepared tokens, like part of speech, quote, stop-words flags or any other. NlpCraft provides default set of EN tokens enrichers implementations.
-            </li>
-            <li>
-                <code>Tokens validators</code> optional list. Tokens validator is user defined component, where tokens are inspected and exception can be thrown from user code to break user input processing.
-            </li>
-            <li>
-                <code>Entity parsers</code> mandatory list. At least one entity parser must be defined. Having prepared tokens as input, each entity parser tries to find user defined named entities. NlpCraft provides wrappers for named-entity recognition components of OpenNLP and Stanford libraries.
-            </li>
-            <li>
-                <code>Entity enrichers</code> optional list. Entity enricher is component which allows to add additional properties to prepared entities. Can be useful for extending existing entity enrichers functionality.
-            </li>
-            <li>
-                <code>Entity mappers</code> optional list. Entity mapper is component which allows to map one set of entities into another after the entities were parsed and enriched. Can be useful for building complex parsers based on existed.
-            </li>
-            <li>
-                <code>Entity validators</code> optional list. Entities validator is user defined component, where prepared entities are inspected and  exceptions can be thrown from user code to break user input processing.
-            </li>
-            <li>
-                <code>Variant filter</code>. Optional component which allows filtering detected variants, rejecting undesirable.
-            </li>
-        </ul>
-
         <p>
-            Below example if <code>Model</code> creation. <code>Pipeline</code> is prepared using <code>NCPipelineBuilder</code> class helper.
+            The declarative portion of the model can be stored in a separate JSON or YAML file
+            for simpler maintenance. There is no practical limit on how complex or simple a model
+            can be, or what other tools it can use. Data models use <a href="/intent-matching.html">intents</a> to match the user input.
         </p>
-
-        <pre class="brush: scala, highlight: []">
-            val pipeline =
-                new NCPipelineBuilder().
-                    withTokenParser(new NCFrTokenParser()).
-                    withTokenEnricher(new NCFrLemmaPosTokenEnricher()).
-                    withTokenEnricher(new NCFrStopWordsTokenEnricher()).
-                    withEntityParser(new NCFrSemanticEntityParser("lightswitch_model_fr.yaml")).
-                    build
-            val cfg = NCModelConfig("nlpcraft.lightswitch.fr.ex", "LightSwitch Example Model FR", "1.0")
-
-            val mdl = new NCModelAdapter(cfg, pipeline)
-        </pre>
-
         <p>
-            This flexible system allows to create any pipelines on any language. You can collect NlpCraft predefined components, write your own and easy reuse custom components.
+            To use a data model, it has to be deployed into a data probe.
+        </p>
+    </section>
+    <section id="data-probe">
+        <h2 class="section-title">Data Probe <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+        <p>
+            A data probe is a lightweight container designed to securely deploy and manage user data models.
+            Each probe can deploy and manage multiple models, and many probes can be connected to the REST server (or a cluster of REST servers).
+            The main purpose of the data probe is to separate data model hosting from managing REST calls from the clients.
+            While you would typically have just one REST server, you may have multiple data probes deployed
+            in different geo-locations and configured differently.
+        </p>
+        <p>
+            Data probes can be deployed and run anywhere as long as there is ingress connectivity from the REST server, and are
+            typically deployed in a DMZ or close to your target data sources: on-premise, in the cloud, etc. A data
+            probe uses strong 256-bit encryption and ingress-only connectivity for communicating with the REST server.
+        </p>
+    </section>
+    <section id="server">
+        <h2 class="section-title">REST Server <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+        <p>
+            The REST server (or a cluster of REST servers behind a load balancer) provides a URL endpoint for end-user applications
+            to securely query data sources using natural language via data models deployed in data probes. Its main purpose is to
+            accept REST-over-HTTP calls from end-user applications and route these requests to and from requested data probes.
+        </p>
+        <p>
+            Unlike a data probe, which is restarted every time the model is changed (i.e. during development), the
+            REST server is a "fire-and-forget" component that can be launched once while various data probes can
+            continuously reconnect to it. It can typically run as a Docker image locally on premise or in the cloud.
+        </p>
+        <p>
+            Learn more about <a href="data-model.html">data model</a>,
+            <a href="server-and-probe.html#probe">data probe</a> and <a href="server-and-probe.html#server">REST server</a>.
+        </p>
+    </section>
+    <section id="in-depth">
+        <h2 class="section-title">In-Depth Look <a href="#"><i class="top-link fas fa-fw fa-angle-double-up"></i></a></h2>
+        <p>
+            Watch this full video (34:42) of the presentation from the
+            <a target=_ href="https://www.apachecon.com/acasia2021/">ApacheCon Asia 2021</a> conference to get an in-depth understanding of
+            why the NLPCraft project was developed and the key principles underlying it:
         </p>
+        <div>
+            <iframe
+                    width="514"
+                    height="289"
+                    src="https://www.youtube.com/embed/O7iK0AXvcJ8?modestbranding=1"
+                    title="NLPCraft - Breaking Years Of Dogma In NLP"
+                    frameborder="0"
+                    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+                    allowfullscreen>
+            </iframe>
+        </div>
     </section>
 </div>
 <div class="col-md-2 third-column">
     <ul class="side-nav">
         <li class="side-nav-title">On This Page</li>
         <li><a href="#overview">Overview</a></li>
+        <li><a href="#data-model">Data Model</a></li>
+        <li><a href="#data-probe">Data Probe</a></li>
+        <li><a href="#server">REST Server</a></li>
         {% include quick-links.html %}
     </ul>
 </div>