Posted to commits@nlpcraft.apache.org by ar...@apache.org on 2021/01/19 02:52:21 UTC

[incubator-nlpcraft-website] branch master updated (8c60b58 -> b77e70f)

This is an automated email from the ASF dual-hosted git repository.

aradzinski pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git.


    from 8c60b58  WIP.
     new 4df26f6  Finished article refactoring.
     new b77e70f  Added breaking change notice to 0.7.3 release notes.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 _data/bin-releases.yml                             |   2 +-
 _data/blogs.yaml                                   |   4 +-
 _data/news.yml                                     |   4 +-
 _data/src-releases.yml                             |   2 +-
 _layouts/release-notes.html                        |   2 +-
 _scss/{installation.scss => relnotes.scss}         |  12 +--
 assets/css/style.scss                              |   1 +
 ...he_text.html => composable_named_entities.html} | 113 ++++++++++++++++++++-
 relnotes/release-notes-0.7.3.html                  |   6 ++
 9 files changed, 128 insertions(+), 18 deletions(-)
 copy _scss/{installation.scss => relnotes.scss} (89%)
 rename blogs/{how_to_find_something_in_the_text.html => composable_named_entities.html} (64%)


[incubator-nlpcraft-website] 01/02: Finished article refactoring.

Posted by ar...@apache.org.

aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git

commit 4df26f6cb6a1d0d728df802d98e77ebffcf87d27
Author: Aaron Radzinski <ar...@datalingvo.com>
AuthorDate: Mon Jan 18 18:43:31 2021 -0800

    Finished article refactoring.
---
 _data/blogs.yaml                                   |   4 +-
 _data/news.yml                                     |   4 +-
 ...he_text.html => composable_named_entities.html} | 113 ++++++++++++++++++++-
 3 files changed, 112 insertions(+), 9 deletions(-)

diff --git a/_data/blogs.yaml b/_data/blogs.yaml
index 25dee25..6d246a8 100644
--- a/_data/blogs.yaml
+++ b/_data/blogs.yaml
@@ -15,8 +15,8 @@
 # limitations under the License.
 #
 
-- title: How To Find Something In The Text
-  url: /blogs/how_to_find_something_in_the_text.html
+- title: Composable Named Entities
+  url: /blogs/composable_named_entities.html
   excerpt: Most of the NLP tasks start with the basic challenge - how to find or detect something in the text. Whether you are designing a search engine, conversational interface or some sort of classificator you will likely start with a problem of how to detect named entities in the input text. These named entities can be universal such as dates, countries, cities as well as domain specific for your model. It is also important to note that we are talking about a class of NLP tasks where [...]
   author: Aaron Radzinski
   publish_date: January 20, 2021
diff --git a/_data/news.yml b/_data/news.yml
index 639a02f..a1adb83 100644
--- a/_data/news.yml
+++ b/_data/news.yml
@@ -15,8 +15,8 @@
 # limitations under the License.
 #
 
-- title: How To Find Something In The Text
-  url: /blogs/how_to_find_something_in_the_text.html
+- title: Composable Named Entities
+  url: /blogs/composable_named_entities.html
   excerpt: Most of the NLP tasks start with the basic challenge - how to find or detect something in the text...
   author: Aaron Radzinski
   publish_date: January 20, 2021
diff --git a/blogs/how_to_find_something_in_the_text.html b/blogs/composable_named_entities.html
similarity index 64%
rename from blogs/how_to_find_something_in_the_text.html
rename to blogs/composable_named_entities.html
index 04db768..d011d37 100644
--- a/blogs/how_to_find_something_in_the_text.html
+++ b/blogs/composable_named_entities.html
@@ -1,7 +1,7 @@
 ---
-active_crumb: How To Find Something In The Text
+active_crumb: Composable Named Entities
 layout: blog
-blog_title: How To Find Something In The Text
+blog_title: Composable Named Entities
 author_name: Aaron Radzinski
 author_avatar: images/lion.jpg
 author_twitter_id: aaron_radzinski
@@ -163,7 +163,7 @@ publish_date: January 20, 2021
 <section>
     <h2 class="section-title">Additional Capabilities of Apache NLPCraft</h2>
     <p>
-        Let’s take a look at what Apache NLPCraft brings different or additionally to the table.
+        Let’s take a look at what Apache NLPCraft brings to the table that is different or new.
     </p>
     <p>
         When it comes to NER components, Apache NLPCraft provides the following:
@@ -171,14 +171,117 @@ publish_date: January 20, 2021
     <ul>
         <li>Built-in NER components for date, geographical locations, numerics, sorting, limiting, and few others with all of them supporting the extraction of the normalized values and extensive metadata.</li>
         <li>Integration with external NER components from Apache OpenNLP, Stanford NLP, Google Language API and spacy.</li>
-        <li>Support for “composable entities” where users can create new detectable named entities out of existing ones.</li>
+        <li>Support for “composable <span class="amp">&amp;</span> reusable named entities” where users can create new detectable named entities out of existing ones.</li>
     </ul>
     <p>
         While built-in NER components and integration with 3rd party ones is rather a “pedestrian”
-        capabilities (and you can read about them <a href="/integrations.html">here</a>) - the “composable entities” is something that is unique for Apache NLPCraft.
+        capabilities (and you can read about them <a href="/integrations.html">here</a>) - the “composable <span class="amp">&amp;</span> reusable named entities” is something that is unique to Apache NLPCraft.
         Let’s look at it in more detail.
     </p>
 </section>
+<section>
+    <h2 class="section-title">Reusable <span class="amp">&amp;</span> Composable Named Entities</h2>
+    <p>
+        Apache NLPCraft is the first project that provides direct support for composable named entities - named entities
+        that are defined in terms of other (constituent or part) entities.
+        Let’s illustrate this with an example.
+    </p>
+    <p>
+        Let’s imagine you are building an NLP-based answering application utilizing intent-based matching (Alexa,
+        Google DialogFlow, Apache NLPCraft, etc.). In this application we want to answer questions about geographical
+        locations but <b>only in the USA</b>.
+    </p>
+    <p>
+        One way to accomplish this task is to use any NER provider, for example, <code>nlpcraft:city</code> from
+        Apache NLPCraft, and build your intents using it. Then, when a particular intent is selected and its callback is called, you can check the <code>country</code>
+        metadata field of the detected named entity. If it does not equal <code>USA</code>, you need to exit (break) from
+        the intent's callback and continue trying other intents, if any were matched as well.
+    </p>
+    <p>
+        Well, that’s not so easy in real life:
+    </p>
+    <ul>
+        <li>
+            First of all, your intent-based NLP library must support such a back-and-forth between intent’s callback
+            and intent matching logic. And very few indeed do…
+        </li>
+        <li>
+            You are spreading the matching logic between a declarative intent definition (YAML file) and a
+            programmable intent’s callback (Java code), which generally leads to a very hard-to-maintain implementation.
+        </li>
+    </ul>
+    <p>
+        Okay... you can create your own brand new NER component from scratch that would detect only geographical
+        locations in the US. However, this will surely take more than a few minutes.
+    </p>
+    <p>
+        Yet another approach, if supported by your intent-based NLP library, is to enhance the intent definition itself
+        to match only USA geographical locations. At this time, however, I’m not aware of any other NLP libraries
+        supporting this other than Apache NLPCraft. Furthermore, you are complicating your intents, which generally should be
+        as simple and maintainable as possible.
+    </p>
+    <p>
+        That’s where <b>composable named entities</b> come to the rescue. Apache NLPCraft allows you to define a new named entity
+        in terms of existing named entities - user-defined, built-in or external (more documentation on this can be found
+        <a href="/data-model.html#dsl">here</a>). Following up on our example application:
+    </p>
+    <pre class="brush: js, highlight: [3, 6]">
+"elements": [
+  {
+    "id": "custom:city:usa",
+    "description": "Wrapper for USA cities",
+    "synonyms": [
+      "^^id == 'nlpcraft:city' && lowercase(~city:country) == 'usa'^^"
+    ]
+  }
+]
+    </pre>
+    <p>
+        In this model snippet, we are defining a new named entity <code>custom:city:usa</code> (line 3) that is based on
+        the existing <code>nlpcraft:city</code> entity (line 6), filtered to cities whose country is the USA. Once you have this new named entity
+        defined, you can use it to define an intent that will only match cities in the USA.
+    </p>
+    <p>
+    Another example:
+    </p>
+    <pre class="brush: js, highlight: [9, 12]">
+"macros": [
+  {
+    "name": "&lt;AIRPORT&gt;",
+    "macro": "{airport|aerodrome|airdrome|air station}"
+  }
+],
+"elements": [
+  {
+    "id": "custom:airport:usa",
+    "description": "Wrapper for USA airports",
+    "synonyms": [
+      "&lt;AIRPORT&gt; {of|for|*} ^^id == 'nlpcraft:city' && lowercase(~city:country) == 'usa')^^"
+    ]
+  }
+]
+    </pre>
+    <p>
+        In this example, we defined a new named entity <code>custom:airport:usa</code>. In its definition we not only
+        filtered cities for the USA but also added a prefix indicating that this is an airport (learn more about
+        the token DSL syntax <a href="https://nlpcraft.apache.org/data-model.html#dsl">here</a>).
+    </p>
+    <p>
+        Composable named entities can be nested but not recursive. All the normalized metadata of the constituent
+        (part) entities - at any nesting depth - is accessible to the named entity, e.g. metadata
+        from <code>nlpcraft:city</code> is accessible in the <code>custom:airport:usa</code> entity.
+        You can also define a new composed named entity based on your own named entities. This way you are
+        essentially <b>mixing in</b> new entities instead of creating something from scratch every time.
+    </p>
+    <p>
+    In the end, composable entities allow you to:
+    </p>
+    <ul>
+        <li>Simplify intents by concentrating matching logic in reusable <span class="amp">&amp;</span> composable named entities.</li>
+        <li>Create new named entities without any coding or expensive model training.</li>
+        <li>Reuse existing named entities to build new ones.</li>
+    </ul>
+</section>
 
 
 

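Note on the article added in commit 01/02: the idea behind the custom:city:usa element can be sketched outside of NLPCraft as a predicate over an already-detected entity plus a re-labeling step. The sketch below is only an illustration under stated assumptions. Token, is_usa_city and compose are hypothetical names and are not part of the NLPCraft API; the metadata key city:country and the IDs are taken from the snippet in the diff.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Token:
    """Hypothetical stand-in for a detected named entity (not an NLPCraft class)."""
    id: str
    text: str
    meta: Dict[str, str] = field(default_factory=dict)


Predicate = Callable[[Token], bool]


def is_usa_city(tok: Token) -> bool:
    # Mirrors the DSL condition from the article:
    #   id == 'nlpcraft:city' && lowercase(~city:country) == 'usa'
    return tok.id == "nlpcraft:city" and tok.meta.get("city:country", "").lower() == "usa"


def compose(new_id: str, pred: Predicate, tokens: List[Token]) -> List[Token]:
    """Re-label every token that satisfies `pred` with `new_id`, keeping a copy of
    the constituent's metadata; all other tokens pass through unchanged."""
    return [Token(new_id, t.text, dict(t.meta)) if pred(t) else t for t in tokens]


# Two entities as a built-in city NER might report them (made-up sample data).
detected = [
    Token("nlpcraft:city", "Boston", {"city:country": "USA"}),
    Token("nlpcraft:city", "Paris", {"city:country": "France"}),
]

composed = compose("custom:city:usa", is_usa_city, detected)
usa_cities = [t.text for t in composed if t.id == "custom:city:usa"]
print(usa_cities)
```

An intent written against custom:city:usa then needs no country check in its callback, which is the simplification the article argues for. The constituent metadata stays reachable from the composed entity, as the article describes for nested entities.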

[incubator-nlpcraft-website] 02/02: Added breaking change notice to 0.7.3 release notes.

Posted by ar...@apache.org.

aradzinski pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-nlpcraft-website.git

commit b77e70fb9c8374cf789dc254c349075d56448848
Author: Aaron Radzinski <ar...@datalingvo.com>
AuthorDate: Mon Jan 18 18:52:06 2021 -0800

    Added breaking change notice to 0.7.3 release notes.
---
 _data/bin-releases.yml            |  2 +-
 _data/src-releases.yml            |  2 +-
 _layouts/release-notes.html       |  2 +-
 _scss/relnotes.scss               | 23 +++++++++++++++++++++++
 assets/css/style.scss             |  1 +
 relnotes/release-notes-0.7.3.html |  6 ++++++
 6 files changed, 33 insertions(+), 3 deletions(-)

diff --git a/_data/bin-releases.yml b/_data/bin-releases.yml
index f594bce..8d7c757 100644
--- a/_data/bin-releases.yml
+++ b/_data/bin-releases.yml
@@ -23,7 +23,7 @@
   pgp_url: https://github.com/aradzinski/binstore/releases/download/v0.7.3/apache-nlpcraft-incubating-bin-0.7.3.zip.asc
   github_link: https://github.com/apache/incubator-nlpcraft/tree/v0.7.3
   #  dockerhub_link: https://hub.docker.com/r/nlpcraftserver/server
-  backward_compatible: yes
+  backward_compatible: no
 
 - version: 0.7.2
   date: Nov 19, 2020
diff --git a/_data/src-releases.yml b/_data/src-releases.yml
index 83b2692..a7ecf54 100644
--- a/_data/src-releases.yml
+++ b/_data/src-releases.yml
@@ -22,7 +22,7 @@
   sha256_url: https://downloads.apache.org/incubator/nlpcraft/nlpcraft/apache-nlpcraft-incubating-0.7.3.zip.sha256
   pgp_url: https://downloads.apache.org/incubator/nlpcraft/nlpcraft/apache-nlpcraft-incubating-0.7.3.zip.asc
   github_link: https://github.com/apache/incubator-nlpcraft/tree/v0.7.3
-  backward_compatible: yes
+  backward_compatible: no
 
 - version: 0.7.2
   date: Nov 19, 2020
diff --git a/_layouts/release-notes.html b/_layouts/release-notes.html
index fec5ec7..96bebda 100644
--- a/_layouts/release-notes.html
+++ b/_layouts/release-notes.html
@@ -19,7 +19,7 @@ layout: default
  limitations under the License.
 -->
 
-<div class="container-fluid">
+<div id="relnotes" class="container-fluid">
     <div class="navbar-aligned">
         <ol class="breadcrumb">
             <li class="mr-1"><a href="index.html">Home</a></li>
diff --git a/_scss/relnotes.scss b/_scss/relnotes.scss
new file mode 100644
index 0000000..e6b9272
--- /dev/null
+++ b/_scss/relnotes.scss
@@ -0,0 +1,23 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#relnotes {
+    i.fa-bomb {
+        color: $color-pomegranate;
+        font-size: 90%;
+    }
+}
\ No newline at end of file
diff --git a/assets/css/style.scss b/assets/css/style.scss
index e7399c9..6106770 100644
--- a/assets/css/style.scss
+++ b/assets/css/style.scss
@@ -44,6 +44,7 @@ $default-font: "Helvetica Neue";
 @import 'installation';
 @import 'community';
 @import 'blogs';
+@import 'relnotes';
 
 html {
     position: relative;
diff --git a/relnotes/release-notes-0.7.3.html b/relnotes/release-notes-0.7.3.html
index 45f5289..0284321 100644
--- a/relnotes/release-notes-0.7.3.html
+++ b/relnotes/release-notes-0.7.3.html
@@ -25,6 +25,12 @@ layout: release-notes
     <p>
         <a href="/download.html">NLPCraft 0.7.3</a> brings about several important bug fixes, improvements and enhancements.
     </p>
+    <div class="bq warn">
+        <b><i class="fas fa-fw fa-bomb"></i> Breaking Changes</b>
+        <p>
+            Class <a target="javadoc" href="/apis/latest/org/apache/nlpcraft/model/tools/embedded/NCEmbeddedProbe.html">NCEmbeddedProbe</a> has changed its API in an incompatible way.
+        </p>
+    </div>
 </section>
 <section id="new">
     <h2 class="section-title">🙌 New</h2>