Posted to dev@nlpcraft.apache.org by "Sergey Kamov (Jira)" <ji...@apache.org> on 2021/06/17 08:26:00 UTC

[jira] [Updated] (NLPCRAFT-50) Function enricher

     [ https://issues.apache.org/jira/browse/NLPCRAFT-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Kamov updated NLPCRAFT-50:
---------------------------------
    Fix Version/s:     (was: 0.9.0)
                   0.9.1

> Function enricher
> -----------------
>
>                 Key: NLPCRAFT-50
>                 URL: https://issues.apache.org/jira/browse/NLPCRAFT-50
>             Project: NLPCraft
>          Issue Type: Improvement
>          Components: probe
>            Reporter: Sergey Kamov
>            Assignee: Sergey Kamov
>            Priority: Major
>             Fix For: 0.9.1
>
>
> The basic idea of the function enricher is to detect as many functions as
>  possible, together with their references to other elements.
>  For example:
>  Suppose a model has a user element 'x:temp' with the synonym 'temperature'.
>  Then, for the sentence 'show me average temperature', the token 'average'
>  should be detected as an 'nlpcraft:function' element with type 'avg' and a
>  reference to the 'x:temp' element.
> 1. In general, a function can have
>  - no arguments
>  - one argument
>  - many arguments
> 2. Functions with arguments should have references to other elements.
>  A referenced element can be a user element or a predefined element such
>  as 'nlp:geo'.
>  See the supported elements in the documentation.
>  (I suggest defining a table that describes which functions can be
>  related to which elements.)
> 3. How to detect
>  Examples:
>  - average temperature - ok (element 'x:temp' follows the word 'average')
>  - temperature average - ok (same as above, with the element before the word)
>  - average - skipped (it makes no sense without any references)
>  - average <some free words> temperature - skipped (the reference must be
>  adjacent to the function word, without such gaps)
>  - average the temperature - ok (gaps that contain only stopwords are
>  allowed)
> 4. This enricher should create a token with
>  - name 'nlpcraft:function',
>  - mandatory String property 'type' (function name),
>  - optional java.util.List<Integer> property 'indexes', which
>    - is omitted for a function without arguments
>    - is a single-element list of indexes for a function with one argument
>  (The 'indexes' field name is hardcoded for internal enrichers and is used
>  in some related components.)
>  Some additional optional parameters may also be passed.
> 5. Supported kinds of functions can be
>  - basic math
>  - SQL
>  - etc.
>  (We can start with math and SQL functions and extend the supported kinds
>  in later steps.)
> 6. Look at the Limit, Relation and Sort enrichers as examples:
>  - they hold such references to other elements via 'indexes' fields.
>  - note also that this enricher should be called in the loop
>  (as mentioned above), because it can have references to nested elements.
>  - look also at how stopwords are processed in these enrichers.
>  (If functions with multi-word names exist, stopwords are allowed
>  inside these names: 'X the Y' is accepted as 'X Y', where 'X Y' is
>  a valid multi-word function name.)
> 7. Function names and all their synonyms (I guess these will mostly be
>  abbreviations) can be hardcoded.
>  Look at
>  org.apache.nlpcraft.server.nlp.enrichers.coordinate.NCCoordinatesEnricher
>  as an example for numeric measures.
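
The detection rules in item 3, combined with the hardcoded synonyms from item 7, could be sketched roughly as follows. This is only an illustration under stated assumptions, not NLPCraft code: the Tok and Fn classes, the SYNONYMS table and the detect/findRef methods are all hypothetical names, and the sketch handles single-word function names with one argument only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class FunctionDetectionSketch {
    /** Hypothetical input token: word, ID of the element already detected on it (or null), stopword flag. */
    static final class Tok {
        final String word;
        final String elementId;
        final boolean stopword;

        Tok(String word, String elementId, boolean stopword) {
            this.word = word;
            this.elementId = elementId;
            this.stopword = stopword;
        }
    }

    /** Hypothetical result: function type, index of the function word, indexes of referenced tokens. */
    static final class Fn {
        final String type;
        final int wordIndex;
        final List<Integer> indexes;

        Fn(String type, int wordIndex, List<Integer> indexes) {
            this.type = type;
            this.wordIndex = wordIndex;
            this.indexes = indexes;
        }
    }

    // Hardcoded synonym table (item 7); the entries are illustrative only.
    static final Map<String, String> SYNONYMS = Map.of(
        "average", "avg",
        "avg", "avg",
        "maximum", "max",
        "max", "max"
    );

    /** Detects function words that have an adjacent element reference (item 3). */
    static List<Fn> detect(List<Tok> toks) {
        List<Fn> res = new ArrayList<>();
        for (int i = 0; i < toks.size(); i++) {
            String type = SYNONYMS.get(toks.get(i).word.toLowerCase(Locale.ROOT));
            if (type == null)
                continue;
            // Prefer a reference after the function word, then try before it.
            int ref = findRef(toks, i, 1);
            if (ref < 0)
                ref = findRef(toks, i, -1);
            // A function word without any reference is skipped.
            if (ref >= 0)
                res.add(new Fn(type, i, List.of(ref)));
        }
        return res;
    }

    /** Walks from the function word in one direction; only stopwords may be skipped. */
    static int findRef(List<Tok> toks, int from, int step) {
        for (int j = from + step; j >= 0 && j < toks.size(); j += step) {
            Tok t = toks.get(j);
            if (t.elementId != null)
                return j;
            if (!t.stopword)
                return -1; // A free word breaks the link.
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Tok> sentence = List.of(
            new Tok("show", null, false),
            new Tok("me", null, true),
            new Tok("average", null, false),
            new Tok("the", null, true),
            new Tok("temperature", "x:temp", false)
        );
        for (Fn fn : detect(sentence))
            System.out.println(fn.type + " -> " + fn.indexes);
    }
}
```

In this sketch the search stops at the first non-stopword that carries no element, which is what rejects 'average <some free words> temperature' while still accepting 'average the temperature'.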

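The token shape described in item 4 of the description could look roughly like this when expressed as a plain property map. This is a sketch under assumptions: the class and method names are hypothetical, a java.util.Map stands in for whatever token type the probe really uses, and the function name "now" is invented here purely as a stand-in for a function without arguments.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class FunctionTokenSketch {
    // Token name taken from the issue description.
    static final String TOKEN_ID = "nlpcraft:function";

    /**
     * Builds the properties of a hypothetical function token:
     * 'type' is mandatory, 'indexes' is present only when there are references.
     */
    static Map<String, Object> functionProps(String type, List<Integer> indexes) {
        Objects.requireNonNull(type, "'type' is mandatory");
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("type", type);
        // 'indexes' is omitted for a function without arguments.
        if (indexes != null && !indexes.isEmpty())
            props.put("indexes", List.copyOf(indexes));
        return props;
    }

    public static void main(String[] args) {
        // One argument: a single-element list of indexes.
        System.out.println(TOKEN_ID + " " + functionProps("avg", List.of(3)));
        // No arguments ("now" is a made-up example): no 'indexes' property at all.
        System.out.println(TOKEN_ID + " " + functionProps("now", List.of()));
    }
}
```

Leaving the 'indexes' key out entirely, rather than storing an empty list, follows the wording of the description that the property is omitted for functions without arguments.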


--
This message was sent by Atlassian Jira
(v8.3.4#803005)