Posted to dev@mahout.apache.org by "Saikat Kanjilal (JIRA)" <ji...@apache.org> on 2015/03/31 06:47:52 UTC
[jira] [Comment Edited] (MAHOUT-1539) Implement affinity matrix computation in Mahout DSL
[ https://issues.apache.org/jira/browse/MAHOUT-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387984#comment-14387984 ]
Saikat Kanjilal edited comment on MAHOUT-1539 at 3/31/15 4:47 AM:
------------------------------------------------------------------
I did some more research and have a few questions:
1) Are we going to start with image data or text data?
2) What do we really mean by a data point? In my mind it is represented by an (x, y) pair.
3) I think the similarity measure used for locality-sensitive hashing should be configurable; namely, we should be able to plug in Jaccard, Euclidean, or cosine similarity as the function to be computed.
I have a sample locality-sensitive hashing scheme coded up in Scala, but I would like clarification on the points above before I proceed further.
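To make point 3 concrete, here is a minimal sketch of what "pluggable" similarity functions could look like in plain Scala. It is not the sample scheme mentioned above and it does not use any Mahout API. The names SimilaritySketch, Similarity, and pairwise are hypothetical, and two choices are assumptions made only for illustration: Jaccard is computed over the sets of non-zero indices, and Euclidean distance d is mapped to a similarity as 1 / (1 + d).

```scala
object SimilaritySketch {
  // Assumption: a similarity measure is a plain function over two dense vectors.
  type Similarity = (Array[Double], Array[Double]) => Double

  // Cosine similarity: dot(a, b) / (|a| * |b|); 0.0 if either vector is all zeros.
  val cosine: Similarity = (a, b) => {
    require(a.length == b.length, "vectors must have the same length")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norms = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Euclidean distance d mapped to a similarity in (0, 1] via 1 / (1 + d).
  // The mapping is an illustrative choice, not something fixed by the issue.
  val euclidean: Similarity = (a, b) => {
    require(a.length == b.length, "vectors must have the same length")
    val d = math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
    1.0 / (1.0 + d)
  }

  // Jaccard similarity over the sets of non-zero indices of each vector.
  val jaccard: Similarity = (a, b) => {
    require(a.length == b.length, "vectors must have the same length")
    val sa = a.indices.filter(i => a(i) != 0.0).toSet
    val sb = b.indices.filter(i => b(i) != 0.0).toSet
    val unionSize = (sa union sb).size
    if (unionSize == 0) 0.0 else (sa intersect sb).size.toDouble / unionSize
  }

  // Brute-force pairwise similarities for a small in-memory set of points.
  // The caller chooses the measure by passing one of the functions above.
  def pairwise(points: IndexedSeq[Array[Double]], sim: Similarity): Array[Array[Double]] =
    Array.tabulate(points.length, points.length)((i, j) => sim(points(i), points(j)))
}
```

With this shape, switching measures is a one-argument change, for example SimilaritySketch.pairwise(points, SimilaritySketch.cosine) versus SimilaritySketch.pairwise(points, SimilaritySketch.jaccard). A hashing scheme could accept the same function type as a parameter.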
Thanks for your help
was (Author: kanjilal):
I did some more research and have a few questions; I have added them to JIRA as well:
1) Are we going to start with image data or text data?
2) What do we really mean by a data point? In my mind it is represented by an (x, y) pair.
3) I think the similarity measure used for locality-sensitive hashing should be configurable; namely, we should be able to plug in Jaccard, Euclidean, or cosine similarity as the function to be computed.
I have a sample locality-sensitive hashing scheme coded up in Scala, but I would like clarification on the points above before I proceed further.
Thanks for your help
> Implement affinity matrix computation in Mahout DSL
> ---------------------------------------------------
>
> Key: MAHOUT-1539
> URL: https://issues.apache.org/jira/browse/MAHOUT-1539
> Project: Mahout
> Issue Type: Improvement
> Components: Clustering
> Affects Versions: 0.9
> Reporter: Shannon Quinn
> Assignee: Shannon Quinn
> Labels: DSL, scala, spark
> Fix For: 0.10.1
>
> Attachments: ComputeAffinities.scala
>
>
> This has the same goal as MAHOUT-1506, but rather than code the pairwise computations in MapReduce, this will be done in the Mahout DSL.
> An orthogonal issue is the format of the raw input (vectors, text, images, SequenceFiles), and how the user specifies the distance equation and any associated parameters.