Posted to issues@spark.apache.org by "longbao wang (JIRA)" <ji...@apache.org> on 2015/05/01 16:05:09 UTC
[jira] [Issue Comment Deleted] (SPARK-2336) Approximate k-NN Models for MLLib
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
longbao wang updated SPARK-2336:
--------------------------------
Comment: was deleted
(was: I really agree with you, and I'm already implementing it, but I have run into a problem. After the tree is built successfully, you search for the target points' k-NN, so you parallelize the input target points and then search. I think this raises some questions, because one point's k-NN may lie in two or more partitions.)
> Approximate k-NN Models for MLLib
> ---------------------------------
>
> Key: SPARK-2336
> URL: https://issues.apache.org/jira/browse/SPARK-2336
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Brian Gawalt
> Priority: Minor
> Labels: clustering, features
>
> After tackling the general k-Nearest Neighbor model as per https://issues.apache.org/jira/browse/SPARK-2335 , there's an opportunity to also offer approximate k-Nearest Neighbor. A promising approach would involve building a kd-tree variant within each partition, a la
> http://www.autonlab.org/autonweb/14714.html?branch=1&language=2
> This could offer a simple non-linear ML model that can label new data with much lower latency than the plain-vanilla kNN versions.
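For illustration only, here is a minimal single-machine sketch of the per-partition idea described above, written in plain Python rather than against any Spark API. It is not the design proposed in the issue or the method from the linked page. The names `KDNode`, `build_kdtree`, `knn_in_tree`, and `partition_knn` are hypothetical, and the "partitions" are ordinary lists standing in for RDD partitions. Each partition gets its own kd-tree, each tree returns its k best candidates, and the candidates are merged, which is one way to handle a point whose neighbors lie in two or more partitions. The per-tree search here is exact; an approximate variant would presumably prune more aggressively or consult fewer partitions.

```python
import heapq
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


class KDNode:
    """One node of a kd-tree: a point, its split axis, and two subtrees."""

    def __init__(self, point: Point, axis: int,
                 left: "Optional[KDNode]", right: "Optional[KDNode]") -> None:
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return KDNode(pts[mid], axis,
                  build_kdtree(pts[:mid], depth + 1),
                  build_kdtree(pts[mid + 1:], depth + 1))


def knn_in_tree(node: Optional[KDNode], target: Point, k: int,
                heap: Optional[list] = None) -> list:
    """Collect up to k nearest points as (-distance, point) in a bounded heap."""
    if heap is None:
        heap = []
    if node is None:
        return heap
    d = math.dist(node.point, target)
    if len(heap) < k:
        heapq.heappush(heap, (-d, node.point))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, node.point))
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    knn_in_tree(near, target, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th best distance (or fewer than k points were found so far).
    if len(heap) < k or abs(diff) < -heap[0][0]:
        knn_in_tree(far, target, k, heap)
    return heap


def partition_knn(partitions: Sequence[Sequence[Point]], target: Point,
                  k: int) -> List[Tuple[float, Point]]:
    """Query one kd-tree per partition, then merge candidates into a global top k."""
    candidates: List[Tuple[float, Point]] = []
    for part in partitions:
        tree = build_kdtree(part)  # in Spark this would happen once per partition
        candidates.extend((-neg_d, pt) for neg_d, pt in knn_in_tree(tree, target, k))
    return heapq.nsmallest(k, candidates)


# Two toy partitions; the two nearest neighbors of the target sit in different ones.
partitions = [[(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)], [(0.5, 0.5), (9.0, 9.0)]]
for dist, point in partition_knn(partitions, (0.4, 0.4), 2):
    print(point, round(dist, 4))
```

Because every partition contributes its own k candidates before the merge, the cost of the merge step grows with the number of partitions, which is a trade-off any distributed version would need to weigh.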
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org