Posted to issues@madlib.apache.org by "Frank McQuillan (JIRA)" <ji...@apache.org> on 2017/09/15 17:33:01 UTC

[jira] [Comment Edited] (MADLIB-1159) Provide examples for common sparse matrix cases

    [ https://issues.apache.org/jira/browse/MADLIB-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168233#comment-16168233 ] 

Frank McQuillan edited comment on MADLIB-1159 at 9/15/17 5:32 PM:
------------------------------------------------------------------

Brian,

Could you please write down exactly what you want the final sparse matrix to look like?  Depending on that, I could see one or more of the following modules being used:

encoding 
http://madlib.apache.org/docs/latest/group__grp__encode__categorical.html

pivot
http://madlib.apache.org/docs/latest/group__grp__pivot.html

sparse vectors
http://madlib.apache.org/docs/latest/group__grp__svec.html

Frank







> Provide examples for common sparse matrix cases
> -----------------------------------------------
>
>                 Key: MADLIB-1159
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1159
>             Project: Apache MADlib
>          Issue Type: Documentation
>            Reporter: Brian Dolan
>
> A fairly common table structure is of the form `key1, key2, value`, like triples in a graph.  These are often not normalized.
> It would be useful to provide an example of transforming this class of tables into a sparse matrix.  Perhaps an example dataset could be a term-document matrix.
> TABLE doc_term;
> document, term, freq
> "do androids dream of electric sheep", "rachel", 75
> "do androids dream of electric sheep", "andy", 56
> "do androids dream of electric sheep", "hands", 128
> "da vinci code book review", "vapid", 1326
> "da vinci code book review", "uninspired", 265
> "da vinci code book review", "nauseating", 879293
> "da vinci code book review", "inane", 471
> The goal is to transform this into a sparse matrix table of documents by features.
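
For illustration only, here is a minimal sketch in plain Python of the transformation the issue asks for. It is not MADlib code and does not use any MADlib function. The helper name triples_to_sparse and the coordinate-list output layout (row index, column index, value) are assumptions made for this example, as is the reading of "not normalized" as "keys are raw strings and a (document, term) pair may appear more than once".

```python
from collections import defaultdict

# Triples copied from the issue description: (document, term, freq).
triples = [
    ("do androids dream of electric sheep", "rachel", 75),
    ("do androids dream of electric sheep", "andy", 56),
    ("do androids dream of electric sheep", "hands", 128),
    ("da vinci code book review", "vapid", 1326),
    ("da vinci code book review", "uninspired", 265),
    ("da vinci code book review", "nauseating", 879293),
    ("da vinci code book review", "inane", 471),
]


def triples_to_sparse(rows):
    """Hypothetical helper: turn (row_key, col_key, value) triples into
    integer-indexed coordinate-list entries (row, col, value)."""
    row_index = {}  # document -> row number, in order of first appearance
    col_index = {}  # term -> column number, in order of first appearance
    cells = defaultdict(int)
    for row_key, col_key, value in rows:
        i = row_index.setdefault(row_key, len(row_index))
        j = col_index.setdefault(col_key, len(col_index))
        # Assumption: repeated (document, term) pairs are summed.
        cells[(i, j)] += value
    entries = sorted((i, j, v) for (i, j), v in cells.items())
    return row_index, col_index, entries


row_index, col_index, entries = triples_to_sparse(triples)
for i, j, v in entries:
    print(i, j, v)
```

On the sample data this yields a matrix with 2 rows (documents), 7 columns (terms), and 7 nonzero entries; every cell not listed is implicitly zero. The two dictionaries play the role of lookup tables that map the original string keys to matrix coordinates.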


