Posted to mapreduce-user@hadoop.apache.org by je...@lewi.us on 2011/05/06 21:27:40 UTC

Re: Problems with LinuxTaskController, LocalJobRunner, and localRunner directory

Thanks Todd.

Unfortunately, I'm using Cascading on top of Hadoop, so I'm not sure
whether there's an easy way to force the local jobs it fires off to use
a different configuration. I'll talk to the Cascading folks and find out.

J


Quoting Todd Lipcon <to...@cloudera.com>:

> Hi Jeremy,
>
> That's a good point - we don't currently do a good job of segregating the
> configurations used for the LJR from the configs used for the TaskTracker.
> In particular I think both mapred.local.dir and mapred.system.dir are used
> by both.
>
> You run into the same issue when trying to use LJR on a system with a
> configured cluster, even if not using the LinuxTaskController features.
>
> I'd recommend making a separate hadoop conf/ directory with a different
> setting for mapred.local.dir.
>
> -Todd
>
> On Fri, May 6, 2011 at 11:45 AM, <je...@lewi.us> wrote:
>
>> Hi,
>>
>> I'm running Hadoop (Cloudera release 3) in pseudo-distributed mode, with
>> the LinuxTaskController so that jobs will run as the user who submitted
>> them.
>>
>> My program (which uses Cascading on Hadoop) fires off a job using
>> LocalJobRunner (I think to read data from the local filesystem). So far so
>> good.
>> The job creates the directory
>> /var/lib/hadoop-0.20/cache/pseudo/localRunner
>> (/var/lib/hadoop-0.20/cache/pseudo being the value of mapred.local.dir)
>>
>> The problem is that localRunner isn't owned by the user mapred. Instead it's
>> owned by the user who submitted the job. The next time I restart the
>> daemons, the task tracker will fail because it can't rename
>> /var/lib/hadoop-0.20/cache/pseudo/localRunner.
>>
>> Does anybody have suggestions on how to fix this?
>>
>> Thanks
>> Jeremy
>>
>>
>>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
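
A minimal sketch of the separate conf/ directory Todd suggests, not a
tested recipe: copy the existing client configuration into a new
directory (the name ~/hadoop-local-conf below is hypothetical) and
override only mapred.local.dir in its mapred-site.xml. The path shown is
an assumption; it mirrors the stock default of
${hadoop.tmp.dir}/mapred/local, and any directory that is writable by
the submitting user and not shared with the TaskTracker should serve.

```xml
<?xml version="1.0"?>
<!-- ~/hadoop-local-conf/mapred-site.xml (hypothetical location) -->
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <!-- Assumed path: per-user scratch space, kept outside the
         TaskTracker's /var/lib/hadoop-0.20/cache/pseudo -->
    <value>/tmp/hadoop-${user.name}/mapred/local</value>
  </property>
</configuration>
```

The client would then be pointed at that directory, for example by
setting HADOOP_CONF_DIR or by passing --config to the hadoop script.
Whether jobs launched through Cascading pick this up is not confirmed
in this thread.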