Posted to dev@mahout.apache.org by Dirk Weissenborn <di...@googlemail.com> on 2012/01/13 16:58:52 UTC

converting idx files to mahout vector files

Hello,

I'd like to know whether Mahout offers a way to convert a binary file,
such as the idx files of the MNIST corpus (
http://yann.lecun.com/exdb/mnist/), into files containing Mahout vectors.
I'd like to use these for classification with the RBMs I am currently writing.
I'd also like to ask what the best way would be to split these corpora into
smaller batches and consume them in Hadoop's map/reduce during the training
phase, since I am pretty new to Hadoop.
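
For reference, the idx layout itself is simple: a 4-byte big-endian magic
number (two zero bytes, a type code, the number of dimensions), one 4-byte
size per dimension, then the raw data. Below is a minimal sketch of reading
such a file in plain Java, assuming unsigned-byte data (type code 0x08, as
in the MNIST files) and an already decompressed stream. The names IdxReader
and readUnsignedByteIdx are hypothetical and are not part of Mahout.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not Mahout API: parses an idx file of unsigned bytes.
class IdxReader {

    /** Returns one flattened row of pixel (or label) values per item. */
    static double[][] readUnsignedByteIdx(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);

        // DataInputStream reads big-endian ints, which matches the idx format.
        int magic = data.readInt();
        int typeCode = (magic >> 8) & 0xFF; // 0x08 means unsigned byte
        int dims = magic & 0xFF;            // number of dimensions
        if ((magic >>> 16) != 0 || typeCode != 0x08 || dims < 1) {
            throw new IOException("Not an unsigned-byte idx file: magic=" + magic);
        }

        // The first dimension is the item count; the rest are flattened per item.
        int items = data.readInt();
        int rowLength = 1;
        for (int d = 1; d < dims; d++) {
            rowLength *= data.readInt();
        }

        double[][] rows = new double[items][rowLength];
        byte[] buffer = new byte[rowLength];
        for (int i = 0; i < items; i++) {
            data.readFully(buffer);
            for (int j = 0; j < rowLength; j++) {
                rows[i][j] = buffer[j] & 0xFF; // convert signed Java byte to 0..255
            }
        }
        return rows;
    }
}
```

If the files are still gzipped as downloaded, the input stream would need to
be wrapped in a java.util.zip.GZIPInputStream first. Each returned row could
then presumably be wrapped in a Mahout vector and written out in batches,
which is the part I am unsure about.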

Thanks for the help

Regards
Dirk