You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2017/12/12 20:18:02 UTC

[jira] [Updated] (MADLIB-1187) Modify knn help function for easier use

     [ https://issues.apache.org/jira/browse/MADLIB-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan updated MADLIB-1187:
------------------------------------
    Priority: Minor  (was: Major)

> Modify knn help function for easier use
> ---------------------------------------
>
>                 Key: MADLIB-1187
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1187
>             Project: Apache MADlib
>          Issue Type: Improvement
>          Components: k-NN
>            Reporter: Jingyi Mei
>            Assignee: Jingyi Mei
>            Priority: Minor
>             Fix For: v2.0
>
>
> Follow on to 
> https://issues.apache.org/jira/browse/MADLIB-927
> Currently, the madlib knn help function only supports the following input arguments with lower case: ' ',  'help', 'usage', '?'.
> At the same time, the help message was implemented as a psql NOTICE, which means if the user set the message level in db higher than NOTICE (e.g. WARNING), the notice won't get printed out.
> An example is like:
> ```
> ALTER DATABASE madlib SET client_min_messages TO WARNING;
> madlib=# select madlib.knn();
>  knn
> -----
> (1 row)
> ```
> while the normal case should look like this:
> ```
> madlib=# select madlib.knn();
> NOTICE:
> k-Nearest Neighbors is a method for finding k closest points to a given data
> DETAIL:
> point in terms of a given metric. Its input consist of data points as features
> from testing examples. For a given k, it looks for k closest points in
> training set for each of the data points in test set. Algorithm generates one
> output per testing example. The output of KNN depends on the type of task:
> For Classification, the output is majority vote of the classes of the k
> nearest data points. The testing example gets assigned the most popular class
> among nearest neighbors. For Regression, the output is average of the values
> of k nearest neighbors of the given testing example.
>  knn
> -----
> (1 row)
> ```
> In this case, we decide to do the following improvement to the help funtion:
> 1. instead of a `RAISE NOTICE` in udf definition, we are going to implement it in a plpython so the help message can always get printed out.
> 2. add 'example' also as a valid input argument, and make all the input arguments ignore upper and lower cases
> 3. Add examples in help funtion while there was no examples before



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)