Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2011/03/09 14:48:59 UTC

[jira] Commented: (MAHOUT-621) Support more data import mechanisms

    [ https://issues.apache.org/jira/browse/MAHOUT-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004543#comment-13004543 ] 

Sean Owen commented on MAHOUT-621:
----------------------------------

FWIW, I envision this as a set of import utilities, perhaps in mahout-utils, that make it very easy to import data into Vectors. Is that about right? It'd be good to have one theory of what incoming data looks like, and to provide a means of ingesting data from m sources into that format for use in n algorithms, rather than supporting m*n source/algorithm combinations.
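
To make the m + n idea concrete, here is a minimal sketch of one common row format that several importers feed and that an algorithm is written against once. Everything in it is an assumption for illustration: VectorSource, CsvSource, InMemorySource and ColumnMeans are hypothetical names, not existing Mahout classes, and plain double[] rows stand in for Mahout's Vector so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical common format: every importer yields dense double[] rows.
// Mahout's Vector would play this role in practice; it is not used here.
interface VectorSource {
    List<double[]> read();
}

// Hypothetical importer for comma-separated numeric lines.
final class CsvSource implements VectorSource {
    private final List<String> lines;

    CsvSource(List<String> lines) {
        this.lines = lines;
    }

    @Override
    public List<double[]> read() {
        List<double[]> out = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            double[] row = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                row[i] = Double.parseDouble(fields[i].trim());
            }
            out.add(row);
        }
        return out;
    }
}

// Hypothetical importer for rows already in memory, standing in for
// something like a database result set.
final class InMemorySource implements VectorSource {
    private final double[][] rows;

    InMemorySource(double[][] rows) {
        this.rows = rows;
    }

    @Override
    public List<double[]> read() {
        return Arrays.asList(rows);
    }
}

// Hypothetical algorithm written once against the common format.
// Assumes every row has the same length as the first row.
final class ColumnMeans {
    static double[] compute(VectorSource source) {
        List<double[]> rows = source.read();
        if (rows.isEmpty()) {
            return new double[0];
        }
        double[] means = new double[rows.get(0).length];
        for (double[] row : rows) {
            for (int i = 0; i < means.length; i++) {
                means[i] += row[i];
            }
        }
        for (int i = 0; i < means.length; i++) {
            means[i] /= rows.size();
        }
        return means;
    }
}

class ImportSketch {
    public static void main(String[] args) {
        VectorSource csv = new CsvSource(Arrays.asList("1, 2", "3, 4"));
        VectorSource mem = new InMemorySource(new double[][] {{1, 2}, {3, 4}});
        // The same algorithm runs unchanged on either source.
        System.out.println(Arrays.toString(ColumnMeans.compute(csv)));
        System.out.println(Arrays.toString(ColumnMeans.compute(mem)));
    }
}
```

Under this arrangement, adding a new source (say ARFF) means writing one more VectorSource, and adding a new algorithm means writing one more consumer of it; neither needs to know about the other.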

> Support more data import mechanisms
> -----------------------------------
>
>                 Key: MAHOUT-621
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-621
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>              Labels: gsoc2011, mahout-gsoc-11
>
> We should have more ways of getting data in:
> 1. ARFF (MAHOUT-155)
> 2. CSV (MAHOUT-548)
> 3. Databases
> 4. Behemoth (Tika, Map-Reduce)
> 5. Other

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira