Posted to user@pig.apache.org by Sameer Tilak <ss...@live.com> on 2013/11/05 18:17:24 UTC

Local vs mapreduce mode

Dear Pig experts,

I have the following Pig script that works perfectly in local mode. However, in mapreduce mode the stored output of AU contains only empty bags:

$HADOOP_CONF_DIR fs -cat /scratch/AU/part-m-00000
Warning: $HADOOP_HOME is deprecated.

{}
{}
{}
{}

In both local mode and mapreduce mode, relation A is set correctly.

Can anyone please tell me the recommended ways to debug a script in mapreduce mode (logging utilities, etc.)?

REGISTER /users/p529444/software/pig-0.11.1/contrib/piggybank/java/piggybank.jar;
REGISTER /users/p529444/software/pig-0.11.1/parser.jar;

DEFINE SequenceFileLoader org.apache.pig.piggybank.storage.SequenceFileLoader();

A = LOAD '/scratch/file.seq' USING SequenceFileLoader AS (key: chararray, value: chararray);
DESCRIBE A;
STORE A into '/scratch/A';

AU = FOREACH A GENERATE parser.Parser(key) AS {(id: int, class: chararray,
     name: chararray, begin: int, end: int, probone: chararray, probtwo: chararray)};
STORE AU into '/scratch/AU';

RE: Local vs mapreduce mode

Posted by Sameer Tilak <ss...@live.com>.

Yes, the input files are on HDFS. 

> Date: Tue, 5 Nov 2013 09:37:08 -0800
> Subject: Re: Local vs mapreduce mode
> From: pradeepg26@gmail.com
> To: user@pig.apache.org
> 
> Really dumb question but... when running in MapReduce mode, is your input
> file on HDFS?

Re: Local vs mapreduce mode

Posted by Pradeep Gollakota <pr...@gmail.com>.

Really dumb question but... when running in MapReduce mode, is your input
file on HDFS?

