Posted to user@pig.apache.org by Stan Rosenberg <st...@gmail.com> on 2012/06/08 23:08:36 UTC

running pig on remote cluster

Hi,

I am trying to submit a pig job to a remote cluster by setting
mapred.job.tracker and fs.default.name accordingly.
The job does get executed on the remote cluster; however, all
intermediate output is stored on the local cluster from which
pig is run. From the job configuration I can see that
pig.reduce.output.dirs and pig.streaming.log.dir are referencing the
local cluster.
Am I supposed to set these manually, or is there an alternative?

pig -version
Apache Pig version 0.10.0 (r1328203)
compiled Apr 19 2012, 22:54:12

Thanks,

stan

Re: running pig on remote cluster

Posted by Alex Rovner <al...@gmail.com>.
Make sure your output path has the full URI, including the namenode host and port. For example, instead of /tmp/output, use
hdfs://namenode:port/tmp/output.
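
A minimal sketch of building such a fully qualified path in a shell wrapper. The host names, ports, the qualify helper, and script.pig are all hypothetical placeholders rather than values from this thread. The thread also does not confirm whether qualifying the output path changes where intermediate output lands.

```shell
#!/bin/sh
# Hypothetical values: substitute the remote cluster's real hosts and ports.
NAMENODE="namenode.example.com:8020"
JOBTRACKER="jobtracker.example.com:8021"

# Turn an absolute HDFS path into a fully qualified URI on the remote namenode.
qualify() {
    printf 'hdfs://%s%s\n' "$NAMENODE" "$1"
}

OUTPUT_URI=$(qualify /tmp/output)
echo "$OUTPUT_URI"

# The qualified URI could then be handed to a script as a parameter, e.g.
# (script.pig is a hypothetical script that does STORE ... INTO '$output'):
# pig -Dfs.default.name="hdfs://$NAMENODE" -Dmapred.job.tracker="$JOBTRACKER" \
#     -param output="$OUTPUT_URI" script.pig
```

The pig invocation is left commented out so the snippet runs without a Pig installation; it only prints the qualified URI.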


On Jun 10, 2012, at 3:23 AM, rakesh sharma <ra...@hotmail.com> wrote:

> 
> I also would like to hear from the experts as I am also facing the same problem.
> Thanks,
> Rakesh
> 

RE: running pig on remote cluster

Posted by rakesh sharma <ra...@hotmail.com>.
I also would like to hear from the experts as I am also facing the same problem.
Thanks,
Rakesh
