Posted to mapreduce-user@hadoop.apache.org by rakesh kothari <rk...@hotmail.com> on 2011/01/19 00:36:37 UTC

Mapper processing gzipped file

Hi,

There is a gzipped file that needs to be processed by a map-only Hadoop job. If the file is larger than the space reserved for non-DFS use on the TaskTracker host processing it, and the map task is not data-local, would the job eventually fail? Is the Hadoop JobTracker smart enough to avoid scheduling the task on such nodes?
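
For reference, by "map-only" I just mean a job configured with zero reduce tasks. A minimal sketch of the setup (paths and class names are made up; TextInputFormat picks the gzip codec from the .gz suffix):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class GzipMapOnly {

  // A gzipped file is not splittable, so the whole file is handed to a
  // single map task; the default TextInputFormat decompresses it on the fly.
  public static class PassThroughMapper
      extends Mapper<LongWritable, Text, LongWritable, Text> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws java.io.IOException, InterruptedException {
      context.write(offset, line); // real per-record processing goes here
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "gzip-map-only");
    job.setJarByClass(GzipMapOnly.class);
    job.setMapperClass(PassThroughMapper.class);
    job.setNumReduceTasks(0); // map-only: mapper output goes straight to HDFS
    FileInputFormat.addInputPath(job, new Path("/data/input.gz")); // hypothetical
    FileOutputFormat.setOutputPath(job, new Path("/data/out"));    // hypothetical
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}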

Thanks,
-Rakesh

Re: Mapper processing gzipped file

Posted by Harsh J <qw...@gmail.com>.
I don't think it would fail, whether the map task is data-local or not,
for a map-only job. AFAIK, input streams are buffered and read directly
(from a local block, or over the network), so the gzipped input is
streamed rather than staged on the task's local disk.
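
To illustrate, this is roughly what the record reader does under the hood (a sketch, assuming the standard FileSystem and CompressionCodecFactory APIs; the path is made up):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class StreamGzipFromHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path file = new Path("/data/input.gz"); // hypothetical path
    FileSystem fs = file.getFileSystem(conf);

    // Open a buffered stream over HDFS (local block or network) and
    // decompress as we read; the file is never copied to local disk.
    CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);
    InputStream raw = fs.open(file);
    InputStream in = (codec != null) ? codec.createInputStream(raw) : raw;

    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    long lines = 0;
    while (reader.readLine() != null) {
      lines++; // handle one decompressed record at a time
    }
    reader.close();
    System.out.println("read " + lines + " lines");
  }
}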




-- 
Harsh J
www.harshj.com