Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2017/10/17 00:51:00 UTC
[jira] [Commented] (MADLIB-1129) Additional output information for k-NN
[ https://issues.apache.org/jira/browse/MADLIB-1129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206846#comment-16206846 ]
Frank McQuillan commented on MADLIB-1129:
-----------------------------------------
[~hpandey][~okislal]
Looking at
https://github.com/apache/madlib/commit/a32c01c0b827f23c3955d30210a152ac21773c87
&
https://github.com/apache/madlib/commit/0a7efca73bc7d38a60d92b2d5c196d7c449d9525
If I run the classification example I get
{code}
madlib=# SELECT * from madlib_knn_result_classification ORDER BY id;
 id |  data   | prediction | k_nearest_neighbours
----+---------+------------+----------------------
  1 | {2,1}   |          1 | {3,1,2}
  2 | {2,6}   |          1 | {3,4,5}
  3 | {15,40} |          0 | {5,6,7}
  4 | {12,1}  |          1 | {3,5,4}
  5 | {2,90}  |          0 | {9,6,7}
  6 | {50,45} |          0 | {6,7,8}
(6 rows)
{code}
but I think the order of the nearest neighbors may be incorrect, e.g., in the first row.
The user docs in knn.sql_in differ; they show:
{code}
 id |  data   | prediction | k_nearest_neighbours
----+---------+------------+----------------------
  1 | {2,1}   |          1 | {1,2,3}
  2 | {2,6}   |          1 | {5,4,3}
  3 | {15,40} |          0 | {7,6,5}
  4 | {12,1}  |          1 | {4,5,3}
  5 | {2,90}  |          0 | {9,6,7}
  6 | {50,45} |          0 | {6,7,8}
(6 rows)
{code}
which looks better, but I have not checked every row.
In simple tests the sort order looks OK, but not for the example above. I am testing on Greenplum.
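One way to check a row independently of the database is to recompute the distances and see whether the reported ids are in ascending distance order. The sketch below is only an illustration: the training points, the helper names, and the choice of squared Euclidean distance are all assumptions for the sake of the example, not the contents of the actual training table or the MADlib implementation.

```python
# Hypothetical training points (id -> coordinates). These are made up for
# illustration and are NOT read from the real training table.
train = {1: (1, 1), 2: (2, 2), 3: (3, 3), 4: (4, 4)}


def squared_l2(a, b):
    """Squared Euclidean distance; assumed here as the distance function."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def expected_neighbours(point, train, k):
    """Ids of the k closest training points, closest first (ties broken by id)."""
    ranked = sorted(train, key=lambda i: (squared_l2(point, train[i]), i))
    return ranked[:k]


def is_sorted_by_distance(point, neighbour_ids, train):
    """True if the reported ids are in non-decreasing distance order."""
    dists = [squared_l2(point, train[i]) for i in neighbour_ids]
    return all(d1 <= d2 for d1, d2 in zip(dists, dists[1:]))


if __name__ == "__main__":
    test_point = (2, 1)
    print(expected_neighbours(test_point, train, 3))
    print(is_sorted_by_distance(test_point, [3, 1, 2], train))
    print(is_sorted_by_distance(test_point, [1, 2, 3], train))
```

With these made-up points, a list such as {3,1,2} for the point {2,1} would fail the check while {1,2,3} would pass. Running the same check against the real training table would be needed to confirm which of the two outputs above is correct.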
> Additional output information for k-NN
> --------------------------------------
>
> Key: MADLIB-1129
> URL: https://issues.apache.org/jira/browse/MADLIB-1129
> Project: Apache MADlib
> Issue Type: Improvement
> Components: k-NN
> Reporter: Frank McQuillan
> Assignee: Himanshu Pandey
> Priority: Minor
> Labels: starter
> Fix For: v1.13
>
>
> Follow on to
> https://issues.apache.org/jira/browse/MADLIB-927
> List the k-nearest neighbors that were used in the voting/averaging, sorted in ASC order according to the distance function used. This could be added to the current output table.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)