Posted to user@mahout.apache.org by prasenjit mukherjee <pr...@gmail.com> on 2012/01/03 16:02:24 UTC

creating sequence file from a single file

It seems that seqdirectory takes a directory as input, where each
file is supposed to be a vector (according to build-reuters.sh). My
input is a single file in which each line represents a vector. Is
there a Mahout driver that creates a sequence file from a
line-delimited record format?

-Thanks,
Prasen

Re: creating sequence file from a single file

Posted by Yue Guan <pi...@gmail.com>.
Hi,

I'm kinda new to Mahout, so I don't know whether there is a method in
Mahout that fits your need. But it should be easy: you can read your
file line by line, as you would in a regular Java program, convert
each line to an <id, Mahout vector> pair, and write the pairs to a
SequenceFile. You can use VectorWritable for the writing, with a dense
or sparse vector as the Mahout vector implementation.
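
A rough sketch of that approach, in case it helps. It is not a Mahout
driver, just a small standalone program. The class name
LinesToSequenceFile and the helper parseLine are made up for
illustration, and it assumes that each line holds whitespace-separated
numbers, that the line number is good enough as the id, and that a
Text key with a dense vector suits whatever job reads the file next.
It uses the older SequenceFile.createWriter(fs, conf, path, keyClass,
valueClass) overload, which newer Hadoop releases mark as deprecated.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.VectorWritable;

public class LinesToSequenceFile {

    // Assumed line format: numbers separated by whitespace, e.g. "1.0 2.5 0.0".
    public static double[] parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    // Usage: LinesToSequenceFile <input text file> <output sequence file>
    public static void main(String[] args) throws IOException {
        Path input = new Path(args[0]);
        Path output = new Path(args[1]);
        Configuration conf = new Configuration();
        FileSystem inFs = input.getFileSystem(conf);
        FileSystem outFs = output.getFileSystem(conf);

        try (BufferedReader reader = new BufferedReader(
                 new InputStreamReader(inFs.open(input), StandardCharsets.UTF_8));
             SequenceFile.Writer writer = SequenceFile.createWriter(
                 outFs, conf, output, Text.class, VectorWritable.class)) {
            long lineNumber = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                // Assumption: the line number serves as the record id.
                Text key = new Text(Long.toString(lineNumber));
                VectorWritable value = new VectorWritable(new DenseVector(parseLine(line)));
                writer.append(key, value);
            }
        }
    }
}
```

It needs the Hadoop and Mahout math jars on the classpath. If most
entries in your vectors are zero, a sparse implementation such as
RandomAccessSparseVector may be a better fit than DenseVector.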
