Posted to dev@mahout.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2014/07/04 02:10:35 UTC
[jira] [Commented] (MAHOUT-1568) Build an I/O model that can replace sequence files for import/export

    [ https://issues.apache.org/jira/browse/MAHOUT-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052031#comment-14052031 ] 

Hudson commented on MAHOUT-1568:
--------------------------------

FAILURE: Integrated in Mahout-Quality #2682 (See [https://builds.apache.org/job/Mahout-Quality/2682/])
MAHOUT-1561, MAHOUT-1568, MAHOUT-1569 text-delimited Spark readers and writers with drivers and a CLI for 'spark-itemsimilarity' closes apache/mahout#22 (pat: rev 2b65475c3ab682ebd47cffdc6b502698799cd2c8)
* spark/src/main/scala/org/apache/mahout/drivers/MahoutOptionParser.scala
* spark/src/main/scala/org/apache/mahout/drivers/FileSysUtils.scala
* spark/src/main/scala/org/apache/mahout/cf/CooccurrenceAnalysis.scala
* spark/pom.xml
* spark/src/main/scala/org/apache/mahout/sparkbindings/io/MahoutKryoRegistrator.scala
* spark/src/main/scala/org/apache/mahout/drivers/ItemSimilarityDriver.scala
* spark/src/test/scala/org/apache/mahout/sparkbindings/test/MahoutLocalContext.scala
* bin/mahout
* spark/src/main/scala/org/apache/mahout/drivers/TextDelimitedReaderWriter.scala
* spark/src/main/assembly/job.xml
* spark/src/main/scala/org/apache/mahout/drivers/IndexedDataset.scala
* spark/src/main/scala/org/apache/mahout/drivers/MahoutDriver.scala
* spark/src/test/scala/org/apache/mahout/drivers/ItemSimilarityDriverSuite.scala
* CHANGELOG
* spark/src/test/scala/org/apache/mahout/cf/CooccurrenceAnalysisSuite.scala
* spark/src/main/scala/org/apache/mahout/drivers/ReaderWriter.scala
* spark/src/main/scala/org/apache/mahout/drivers/Schema.scala


> Build an I/O model that can replace sequence files for import/export
> --------------------------------------------------------------------
>
>                 Key: MAHOUT-1568
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1568
>             Project: Mahout
>          Issue Type: New Feature
>          Components: CLI
>         Environment: Scala, Spark
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>
> Implement mechanisms to read and write data from/to flexible stores. These will support tuple streams and DRMs (distributed row matrices), with extensions that preserve user-defined values for IDs. In some sense the mechanism can replace Sequence Files for import/export, and it will make the operation much easier for users, in many cases by directly consuming their input files.
> Start with text-delimited files for input/output in the Spark version of ItemSimilarity.
> A proposal is running with ItemSimilarity on Spark and is documented on the GitHub wiki here: https://github.com/pferrel/harness/wiki
> Comments are appreciated.
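
To make the idea in the description concrete, here is a minimal sketch of reading text-delimited tuples while keeping user-defined IDs. It is plain Python for illustration only, not the Scala/Spark code from the commit above. Every function name in it (read_delimited, index_ids, to_indexed_rows, write_delimited) and the two-column "row ID, column ID" layout are assumptions made for this example, not Mahout's API.

```python
# Illustrative sketch only: hypothetical helper names, not Mahout's API.

def read_delimited(lines, delimiter=","):
    """Parse (row_id, column_id) string pairs from delimited text lines."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(delimiter)
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


def index_ids(ids):
    """Map each distinct external ID to a contiguous int, in first-seen order."""
    mapping = {}
    for external_id in ids:
        if external_id not in mapping:
            mapping[external_id] = len(mapping)
    return mapping


def to_indexed_rows(pairs):
    """Build sparse rows keyed by integer index, plus both ID dictionaries."""
    row_ids = index_ids(r for r, _ in pairs)
    col_ids = index_ids(c for _, c in pairs)
    rows = {}
    for r, c in pairs:
        rows.setdefault(row_ids[r], set()).add(col_ids[c])
    return rows, row_ids, col_ids


def write_delimited(rows, row_ids, col_ids, delimiter=","):
    """Write rows back as text, translating indices to the original IDs."""
    row_names = {index: name for name, index in row_ids.items()}
    col_names = {index: name for name, index in col_ids.items()}
    return [
        delimiter.join([row_names[r], col_names[c]])
        for r in sorted(rows)
        for c in sorted(rows[r])
    ]


if __name__ == "__main__":
    text = ["u1,iphone", "u2,ipad", "u1,ipad"]
    rows, row_ids, col_ids = to_indexed_rows(read_delimited(text))
    for line in write_delimited(rows, row_ids, col_ids):
        print(line)
```

The design point is that the computation only ever sees integer indices, while the two dictionaries travel with the data so that output can be written with the user's original IDs.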



--
This message was sent by Atlassian JIRA
(v6.2#6252)