Posted to dev@mahout.apache.org by rahul raghavendhra <ra...@gmail.com> on 2011/12/28 13:17:44 UTC
Mahout sequence file format
I am new to Mahout. I just want to know how a text file is converted into
a seqfile and then into sparse vectors.
Can any kind of text file be converted into a seq file using ./mahout
seqdirectory?
Thanks in advance.
./rahul
Re: Mahout sequence file format
Posted by Isabel Drost <is...@apache.org>.
On 28.12.2011 rahul raghavendhra wrote:
> I am new to Mahout. I just want to know how a text file is converted into
> a seqfile and then into sparse vectors.
For more detailed pointers on where to start, see also
<https://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors+from+Text>
Isabel
Re: Mahout sequence file format
Posted by Grant Ingersoll <gs...@apache.org>.
On Dec 28, 2011, at 7:17 AM, rahul raghavendhra wrote:
> I am new to Mahout. I just want to know how a text file is converted into
> a seqfile and then into sparse vectors.
There are quite a few steps. I would recommend checking out the code and walking through it. See the SparseVectorsFromSequenceFiles class as well as SequenceFilesFromDirectory.
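As a rough picture of what those steps amount to, here is a small Python sketch. It is not Mahout's code and does not call Mahout: the names tokenize, build_dictionary and to_sparse_tf are made up for illustration, the regex tokenizer is only a stand-in for a real analyzer, and plain term counts are used as the weights. The idea is that a sequence file holds (key, value) pairs, here document name and document text, and vectorization maps each term to an integer index and stores only the non-zero counts.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumerics (stand-in for a real analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_dictionary(docs):
    """Assign each distinct term an integer index, in sorted term order."""
    terms = sorted({t for text in docs.values() for t in tokenize(text)})
    return {term: i for i, term in enumerate(terms)}


def to_sparse_tf(text, dictionary):
    """Return {term_index: count}, keeping only terms that occur in the text."""
    counts = Counter(tokenize(text))
    return {dictionary[t]: c for t, c in counts.items() if t in dictionary}


# Stand-in for sequence file contents: key = document name, value = text.
docs = {
    "/doc1.txt": "Mahout builds vectors from text",
    "/doc2.txt": "Text becomes sparse vectors",
}
dictionary = build_dictionary(docs)
vectors = {key: to_sparse_tf(text, dictionary) for key, text in docs.items()}
```

Each entry of vectors maps a document key to a dict of term index to count, so terms that do not appear in a document take no space.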
> Can any kind of text file be converted into a seq file using ./mahout
> seqdirectory?
It works with plain text files. I believe you can pass in the encoding of the file. Is that what you are looking for?
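To show why the encoding matters, here is a minimal Python sketch of the idea of reading a directory of plain text files into (key, text) pairs with an explicit charset. It is only an illustration, not what seqdirectory actually runs: read_text_dir and its charset parameter are hypothetical names, and the key format (path relative to the input directory) is an assumption.

```python
import os
import tempfile


def read_text_dir(root, charset="utf-8"):
    """Yield (key, text) for every file under root, decoded with charset."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Key is the path relative to root, with forward slashes.
            key = "/" + os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, encoding=charset) as f:
                yield key, f.read()


# Demo: write one Latin-1 encoded file, then read it back with that charset.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "a.txt"), "w", encoding="latin-1") as f:
        f.write("café")
    pairs = list(read_text_dir(root, charset="latin-1"))
```

Reading the same file with the wrong charset would either fail to decode or produce garbled text, which is why the encoding has to be passed in for files that are not in the default one.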
-Grant