Posted to user@mahout.apache.org by Steven Cullens <sr...@gmail.com> on 2014/03/13 21:12:00 UTC
local file input for seqdirectory
Hi,
I have a large number of files, each on the order of kilobytes, on my local
machine that I want to convert to a sequence file on HDFS. Whenever I try
to copy the local files to HDFS, Hadoop complains about bad blocks,
presumably because each block is 64 MB and there are more files than blocks.
In Mahout 0.7, I would tell it that the input files are local, like:
mahout seqdirectory -i file://<input directory> -o <HDFS directory>
But I can't use the same command in Mahout 0.9, where it expects the file
system to be HDFS. Is there a workaround for generating the sequence file
using Mahout 0.9? Thanks.
Steven
Re: local file input for seqdirectory
Posted by Steven Cullens <sr...@gmail.com>.
Thanks, Suneel.
On Thu, Mar 13, 2014 at 4:17 PM, Suneel Marthi <su...@yahoo.com> wrote:
> The workaround is to add -xm sequential. An MR version of seqdirectory was
> introduced in 0.8, and hence the default execution mode is MR if none is
> specified.
Re: local file input for seqdirectory
Posted by Suneel Marthi <su...@yahoo.com>.
The workaround is to add -xm sequential. An MR version of seqdirectory was introduced in 0.8, and hence the default execution mode is MR if none is specified.
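For example, the failing 0.9 invocation from the original question could be retried with the execution-method flag forcing local, sequential mode (the input and output paths below are placeholders, not from the original thread):

```shell
# -xm sequential selects the non-MapReduce implementation of seqdirectory,
# so a file:// input URI is read from the local file system instead of
# being handed to an MR job that expects HDFS input.
# /local/input and /user/steven/output are hypothetical placeholder paths.
mahout seqdirectory \
  -i file:///local/input \
  -o /user/steven/output \
  -xm sequential
```

With the 0.8+ default of -xm mapreduce, the input path must be resolvable by the cluster's task nodes, which is why a file:// URI pointing at one local machine fails.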