Posted to dev@mahout.apache.org by "Saikat Kanjilal (JIRA)" <ji...@apache.org> on 2015/03/31 06:47:52 UTC

[jira] [Comment Edited] (MAHOUT-1539) Implement affinity matrix computation in Mahout DSL

    [ https://issues.apache.org/jira/browse/MAHOUT-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387984#comment-14387984 ] 

Saikat Kanjilal edited comment on MAHOUT-1539 at 3/31/15 4:47 AM:
------------------------------------------------------------------

So I did some more research and have some questions:

1) Are we going to deal with images or text data to start?
2) What do we really mean by a data point? In my mind it's represented by an (x, y) pair.
3) I think the similarity measure used for locality-sensitive hashing should be configurable; namely, we should be able to plug in Jaccard, Euclidean, or cosine similarities as functions to be computed.

I have a sample locality-sensitive hashing scheme coded up in Scala, but I want further clarification on the above before I proceed.
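
To make point 3 concrete, here is a rough sketch of what a pluggable similarity measure could look like. It is plain Scala over dense arrays and does not use the Mahout DSL or any Mahout API; every name in it (SimilaritySketch, Similarity, affinities) is hypothetical, the mapping from Euclidean distance to a similarity via 1 / (1 + d) is just one possible choice, and the hashing step itself is left out.

```scala
// Hypothetical sketch only: none of these names are Mahout API.
object SimilaritySketch {
  // A similarity measure is simply a function over two dense vectors.
  type Similarity = (Array[Double], Array[Double]) => Double

  private def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  // Cosine similarity; defined here as 0.0 when either vector has zero norm.
  val cosine: Similarity = (a, b) => {
    val denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    if (denom == 0.0) 0.0 else dot(a, b) / denom
  }

  // Euclidean distance d mapped to a similarity in (0, 1] via 1 / (1 + d).
  // This mapping is an assumption; other conversions are equally valid.
  val euclidean: Similarity = (a, b) => {
    val d = math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
    1.0 / (1.0 + d)
  }

  // Jaccard similarity over the sets of indices that hold non-zero values.
  val jaccard: Similarity = (a, b) => {
    val sa = a.indices.filter(i => a(i) != 0.0).toSet
    val sb = b.indices.filter(i => b(i) != 0.0).toSet
    val unionSize = (sa | sb).size
    if (unionSize == 0) 0.0 else (sa & sb).size.toDouble / unionSize
  }

  // Brute-force pairwise affinity matrix; the measure is passed in as a parameter.
  def affinities(points: Seq[Array[Double]], sim: Similarity): Array[Array[Double]] =
    Array.tabulate(points.length, points.length)((i, j) => sim(points(i), points(j)))
}
```

With this shape, switching measures is a one-argument change, e.g. SimilaritySketch.affinities(points, SimilaritySketch.jaccard), and the same function type could be handed to whatever hashing scheme we settle on.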

Thanks for your help


was (Author: kanjilal):
So I did some more research and have some questions; I have added the questions to JIRA as well:

1) Are we going to deal with images or text data to start?
2) What do we really mean by a data point? In my mind it's represented by an (x, y) pair.
3) I think the similarity measure used for locality-sensitive hashing should be configurable; namely, we should be able to plug in Jaccard, Euclidean, or cosine similarities as functions to be computed.

I have a sample locality-sensitive hashing scheme coded up in Scala, but I want further clarification on the above before I proceed.

Thanks for your help

> Implement affinity matrix computation in Mahout DSL
> ---------------------------------------------------
>
>                 Key: MAHOUT-1539
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1539
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Clustering
>    Affects Versions: 0.9
>            Reporter: Shannon Quinn
>            Assignee: Shannon Quinn
>              Labels: DSL, scala, spark
>             Fix For: 0.10.1
>
>         Attachments: ComputeAffinities.scala
>
>
> This has the same goal as MAHOUT-1506, but rather than code the pairwise computations in MapReduce, this will be done in the Mahout DSL.
> An orthogonal issue is the format of the raw input (vectors, text, images, SequenceFiles), and how the user specifies the distance equation and any associated parameters.


