Posted to dev@nlpcraft.apache.org by "Sergey Kamov (Jira)" <ji...@apache.org> on 2020/05/19 11:44:00 UTC

[jira] [Created] (NLPCRAFT-50) Function enricher

Sergey Kamov created NLPCRAFT-50:
------------------------------------

             Summary: Function enricher
                 Key: NLPCRAFT-50
                 URL: https://issues.apache.org/jira/browse/NLPCRAFT-50
             Project: NLPCraft
          Issue Type: Improvement
          Components: probe
            Reporter: Sergey Kamov
            Assignee: Aaron Radzinski
             Fix For: 0.6.0


The basic idea of the function enricher is to detect as many different
functions as possible, together with their references to other elements.
For example:
Suppose a model has a user element 'x:temp' with the synonym 'temperature'.
Then, for the sentence 'show me average temperature', the token 'average'
should be detected as an 'nlpcraft:function' element with type 'avg' and
a reference to the 'x:temp' element.

1. In general, a function can be

- without arguments
- with one argument
- with many arguments


2. Functions with arguments should have references to other elements.
 A reference can point to a user element or to a predefined element such
 as 'nlp:geo'.
 See the supported elements in the documentation.
 (I suggest defining a table that describes which functions can be
 related to which elements.)
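
 The suggested table could be sketched roughly as follows. This is only an
 illustration in Python: the function types other than 'avg', the pairings,
 and the helper name can_reference are assumptions, not part of this issue
 or of the NLPCraft API.

```python
# Hypothetical compatibility table: function type -> element IDs it may reference.
# The entries are illustrative assumptions; the real table is still to be defined.
FUNCTION_TARGETS = {
    "avg": {"x:temp"},
    "max": {"x:temp"},
    "count": {"x:temp", "nlp:geo"},
}


def can_reference(func_type: str, element_id: str) -> bool:
    """Return True if the given function type is allowed to reference the element."""
    return element_id in FUNCTION_TARGETS.get(func_type, set())
```

 With such a table, a candidate function whose neighbouring element is not
 listed for its type would simply not be enriched.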

3. How to detect
 Examples:
- average temperature - ok (element 'x:temp' follows the word 'average')
- temperature average - skipped (such functions cannot come after their
 references)

- average - skipped (it makes no sense without a reference)
- average <some free words> temperature - skipped (the reference should
 follow the function word without such gaps)

- average the temperature - ok (gaps that contain only stopwords are
 allowed)
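
 The rules above can be sketched for the one-argument case as follows. This
 is a minimal illustration in Python, not the enricher itself: the Tok
 class, the FUNCTIONS table and the detect function are hypothetical names,
 and tokens are assumed to be already marked as model elements or stopwords
 by earlier enrichers.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tok:
    text: str
    element_id: Optional[str] = None  # set when the token is already a model element
    stopword: bool = False


# Hypothetical table: surface word -> function type.
FUNCTIONS = {"average": "avg"}


def detect(tokens: List[Tok]) -> List[Tuple[int, str, int]]:
    """Return (function index, function type, referenced element index) triples."""
    found = []
    for i, tok in enumerate(tokens):
        func_type = FUNCTIONS.get(tok.text.lower())
        if func_type is None or tok.element_id is not None:
            continue
        j = i + 1
        # Only stopwords may sit between the function word and its reference.
        while j < len(tokens) and tokens[j].stopword and tokens[j].element_id is None:
            j += 1
        # The reference must come after the function word; otherwise skip it.
        if j < len(tokens) and tokens[j].element_id is not None:
            found.append((i, func_type, j))
    return found
```

 For example, detect([Tok("average"), Tok("the", stopword=True),
 Tok("temperature", "x:temp")]) yields [(0, "avg", 2)], while the reversed
 order 'temperature average' yields an empty list.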


4. This enricher should create a token with

- name 'nlpcraft:function',
- mandatory String property 'type' (function name),
- optional java.util.List<Integer> property 'indexes', which
 - is omitted for a function without arguments
 - is a one-element list of indexes for a function with one argument
 (the "indexes" field name is hardcoded for internal enrichers and is
 used in some related components)
 Some additional optional parameters may also be passed.
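
 As a rough illustration of this token shape, the sketch below builds the
 properties as a plain Python dictionary. The helper make_function_token,
 the dictionary layout and the 'now' function type are assumptions made for
 the example; the real token would be created through the probe's own
 token classes.

```python
from typing import Any, Dict, List, Optional


def make_function_token(func_type: str, indexes: Optional[List[int]] = None) -> Dict[str, Any]:
    """Build a hypothetical 'nlpcraft:function' token as a plain dictionary."""
    token: Dict[str, Any] = {"name": "nlpcraft:function", "type": func_type}
    # 'indexes' is omitted entirely for functions without arguments.
    if indexes:
        token["indexes"] = list(indexes)
    return token


# 'show me average temperature': 'average' refers to the token at index 3.
one_arg = make_function_token("avg", [3])
# A function without arguments ('now' is an invented example type).
no_arg = make_function_token("now")
```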


5. Supported function kinds can be

- basic math
- SQL
- etc.
 (We can start with math and SQL functions and extend the supported kinds
 in later steps.)


6. Look at the Limit, Relation and Sort enrichers as examples:

- they have such references to other elements via their 'indexes' fields.
- also, please note that this enricher should also be called in the loop
 (as mentioned above), because it can have references to nested elements.

- please also look at how stopwords are processed in these enrichers.
 (if functions with multi-word names exist, stopwords are acceptable
 inside these names: 'X the Y' is as valid as 'X Y', where 'X Y' is
 a valid multi-word function name)
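
 The stopword handling for multi-word names could look roughly like the
 sketch below. It is written in Python for illustration only and does not
 reproduce how the Limit, Relation or Sort enrichers do it; the stopword
 set, the 'standard deviation' entry and the function match_function_name
 are all assumptions.

```python
from typing import List, Optional, Tuple

STOPWORDS = {"the", "a", "of"}  # hypothetical stopword set
MULTI_WORD_FUNCTIONS = {("standard", "deviation"): "stddev"}  # hypothetical names


def match_function_name(words: List[str], start: int) -> Optional[Tuple[str, int]]:
    """Match a multi-word function name starting at `start`.

    Returns (function type, index just past the last matched word) or None.
    """
    for name, func_type in MULTI_WORD_FUNCTIONS.items():
        pos = start
        matched = True
        for k, part in enumerate(name):
            # Stopwords are skipped only between name parts, never before the first.
            while k > 0 and pos < len(words) and words[pos].lower() in STOPWORDS:
                pos += 1
            if pos < len(words) and words[pos].lower() == part:
                pos += 1
            else:
                matched = False
                break
        if matched:
            return func_type, pos
    return None
```

 Here 'standard the deviation' matches the same function as 'standard
 deviation', while 'standard daily deviation' does not match because
 'daily' is not a stopword.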


7. Function names and all their synonyms (I guess mostly shortcuts) can
 be hardcoded.
 Look at
 org.apache.nlpcraft.server.nlp.enrichers.coordinate.NCCoordinatesEnricher
 as an example for numeric measures.
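
 A hardcoded synonym table of this kind might be sketched as below. The
 example is in Python and does not mirror NCCoordinatesEnricher; every
 function type, synonym and helper name in it is an assumption chosen for
 illustration.

```python
from typing import Dict, List, Optional

# Hypothetical hardcoded table: function type -> names and shortcuts.
SYNONYMS: Dict[str, List[str]] = {
    "avg": ["average", "avg", "mean"],
    "max": ["maximum", "max"],
    "min": ["minimum", "min"],
}

# Inverted once at start-up so that lookup by surface word is a single dict access.
LOOKUP: Dict[str, str] = {
    synonym: func_type
    for func_type, synonyms in SYNONYMS.items()
    for synonym in synonyms
}


def function_type(word: str) -> Optional[str]:
    """Return the function type for a word, or None if it is not a known function."""
    return LOOKUP.get(word.lower())
```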


