Posted to dev@nlpcraft.apache.org by "Gleb (Jira)" <ji...@apache.org> on 2020/05/29 07:26:00 UTC

[jira] [Assigned] (NLPCRAFT-50) Function enricher

     [ https://issues.apache.org/jira/browse/NLPCRAFT-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gleb reassigned NLPCRAFT-50:
----------------------------

    Assignee: Gleb  (was: Aaron Radzinski)

> Function enricher
> -----------------
>
>                 Key: NLPCRAFT-50
>                 URL: https://issues.apache.org/jira/browse/NLPCRAFT-50
>             Project: NLPCraft
>          Issue Type: Improvement
>          Components: probe
>            Reporter: Sergey Kamov
>            Assignee: Gleb
>            Priority: Major
>             Fix For: 0.7.0
>
>
> The basic idea of the function enricher is to detect as many different
> functions as possible, together with their references to other elements.
> For example:
> Suppose a model has a user element 'x:temp' with the synonym
> 'temperature'. Then, for the sentence 'show me average temperature', the
> token 'average' should be detected as an 'nlpcraft:function' element with
> type 'avg' and a relation to the 'x:temp' element.
> 1. In general, a function can have
> - no arguments
> - one argument
> - many arguments
> 2. Functions with arguments should have references to other elements.
>  A referenced element can be a user element or a predefined element
>  such as 'nlp:geo'. See the supported elements in the documentation.
>  (I suggest defining a table that describes which functions can be
>  related to which elements.)
> 3. How to detect
>  Examples:
> - average temperature - ok (element 'x:temp' follows the word 'average')
> - temperature average - skipped (such functions cannot come after their
>  references)
> - average - skipped (it makes no sense without any references)
> - average <some free words> temperature - skipped (the reference must
>  follow the function word without such gaps)
> - average the temperature - ok (gaps that contain only stopwords are
>  allowed)
> 4. This enricher should create a token with
> - name 'nlpcraft:function',
> - mandatory String property 'type' (function name),
> - optional java.util.List<Integer> property 'indexes', which
>  - is omitted for a function without arguments
>  - is a one-element list of indexes for a function with one argument
>  (the "indexes" field name is hardcoded for internal enrichers and is
>  used in some related components)
>  Some additional optional parameters may also be passed.
> 5. Supported kinds of functions can be
> - basic math
> - sql
> - etc.
>  (We can start with math and sql functions and extend the supported
>  kinds in later steps)
> 6. Look at the Limit, Relation and Sort enrichers as examples:
> - they have such references to other elements via 'indexes' fields.
> - note also that this enricher should be called in the loop
>  (as mentioned above) because it can have references to nested elements.
> - look also at how stopwords are processed in these enrichers.
>  (if functions with multi-word names exist, stopwords are acceptable
>  inside these names: 'X the Y' is ok as 'X Y', where 'X Y' is
>  a valid multi-word function name)
> 7. Function names and all their synonyms (I guess these will mostly be
>  shortcuts) can be hardcoded.
>  Look at
>  org.apache.nlpcraft.server.nlp.enrichers.coordinate.NCCoordinatesEnricher
>  as an example for numeric measures.
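The detection rules from item 3 can be sketched roughly as follows. This is a minimal Python illustration, not NLPCraft code: the `Token` class, the `FUNCTIONS` and `STOPWORDS` tables and the `find_function_reference` helper are hypothetical names, and a real enricher would work on the probe's own token types.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical hardcoded tables, for illustration only.
FUNCTIONS = {"average": "avg"}          # function word -> function type
STOPWORDS = {"the", "a", "an", "of"}    # words allowed in a gap


@dataclass
class Token:
    text: str
    element_id: Optional[str] = None    # e.g. 'x:temp'; None for a free word


def find_function_reference(tokens: List[Token], i: int) -> Optional[int]:
    """Return the index of the element referenced by the function word at
    position i, or None if the function should be skipped."""
    if tokens[i].text.lower() not in FUNCTIONS:
        return None
    # The reference must come AFTER the function word.
    for j in range(i + 1, len(tokens)):
        if tokens[j].element_id is not None:
            return j                    # first element after the function
        if tokens[j].text.lower() not in STOPWORDS:
            return None                 # a free word in the gap: skip
    return None                         # no reference at all: skip


temp = Token("temperature", "x:temp")
ok = find_function_reference([Token("average"), Token("the"), temp], 0)
skipped = find_function_reference([temp, Token("average")], 1)
```

Here `ok` is the index of the 'x:temp' token, and `skipped` is `None` because the reference precedes the function word.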

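The token shape from item 4 might look like the sketch below. It uses a plain Python dict as a stand-in for the real token class; `make_function_token` is a hypothetical helper, and treating 'indexes' as positions of the referenced tokens in the sentence is an assumption. The description only specifies the zero-argument and one-argument cases, so nothing here should be read as defining the multi-argument layout.

```python
from typing import Dict, List, Optional


def make_function_token(func_type: str,
                        indexes: Optional[List[int]] = None) -> Dict[str, object]:
    """Build a dict standing in for an 'nlpcraft:function' token."""
    token: Dict[str, object] = {
        "name": "nlpcraft:function",   # fixed token name
        "type": func_type,             # mandatory function name, e.g. 'avg'
    }
    # 'indexes' is optional: it is left out for functions without arguments.
    if indexes:
        token["indexes"] = list(indexes)
    return token


# 'show me average temperature': 'temperature' is assumed to be token 3.
avg_token = make_function_token("avg", [3])
# A hypothetical function without arguments carries no 'indexes' property.
no_arg_token = make_function_token("now")
```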

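The stopword handling for multi-word function names mentioned in item 6 ('X the Y' accepted as 'X Y') could be approximated as below. This is only a sketch under assumptions: `match_function_name` is a hypothetical helper, names are given as tuples of lowercase words, and the Limit, Relation and Sort enrichers may handle stopwords differently.

```python
from typing import Iterable, Optional, Sequence, Set, Tuple


def match_function_name(
    words: Sequence[str],
    start: int,
    names: Iterable[Tuple[str, ...]],
    stopwords: Set[str],
) -> Optional[Tuple[Tuple[str, ...], int]]:
    """Try to match a function name beginning at words[start].

    Stopwords between the words of a name are ignored. Returns the matched
    name and the index just past the match, or None. Longer names are
    tried first.
    """
    for name in sorted(names, key=len, reverse=True):
        pos = start
        matched = True
        for k, expected in enumerate(name):
            if k > 0:
                # Skip stopwords inside the name, but never the expected word.
                while (pos < len(words) and words[pos] != expected
                       and words[pos] in stopwords):
                    pos += 1
            if pos >= len(words) or words[pos] != expected:
                matched = False
                break
            pos += 1
        if matched:
            return name, pos
    return None


names = [("x", "y")]
with_stopword = match_function_name("x the y z".split(), 0, names, {"the"})
with_free_word = match_function_name("x foo y".split(), 0, names, {"the"})
```

With these inputs, `with_stopword` matches the name 'x y' across the stopword, while `with_free_word` is `None` because 'foo' is not a stopword.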

--
This message was sent by Atlassian Jira
(v8.3.4#803005)