Posted to user@pig.apache.org by George Pang <p0...@gmail.com> on 2009/05/31 04:43:46 UTC

Error on running pig-embedded Java code

Dear users,

I compiled and ran the pig-embedded Java code from the "pig quick start"
example on Eclipse.  I got the following error:

INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
file:///
INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
sessionId=

Obviously it can't find HDFS or Hadoop. But I have set PIG_CLASSPATH to
/usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
and the other environment variables under Run Configurations / Environment.
Is there anything I forgot to do? Any ideas are much appreciated!

Pig: 0.1.1
Hadoop: 0.18.3
Eclipse: 3.4.2

George

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

Thank you Jeff. I didn't use a custom load function, but I did debug the
script line by line.  Since it runs fine in PigPen, there is no reason it
shouldn't run in my embedded program.  Do you remember what caused the
same problem for you?
George

2009/6/11 zjffdu <zj...@gmail.com>

> Hi George,
>
> Are you using a custom LoadFunc? If so, you can add some logging in the
> getNext() method. Maybe an exception is thrown there.
>
> I ran into this problem before as well.
>
> Jeff Zhang

RE: Error on running pig-embedded Java code

Posted by zjffdu <zj...@gmail.com>.

Hi George,

Are you using a custom LoadFunc? If so, you can add some logging in the
getNext() method. Maybe an exception is thrown there.

I ran into this problem before as well.

Jeff Zhang



-----Original Message-----
From: George Pang [mailto:p0941p@gmail.com] 
Sent: June 10, 2009 13:21
To: pig-user@hadoop.apache.org
Subject: Re: Error on running pig-embedded Java code

I think it's running at last.  I added the variable "HADOOPDIR" and its
value under Build Path / Configure Build Path / Add Variable.

However, something is a little strange about the result.  This is the
message from my console:
09/06/10 02:34:13 INFO executionengine.HExecutionEngine: Connecting to
hadoop file system at: hdfs://localhost:9000
09/06/10 02:34:14 INFO executionengine.HExecutionEngine: Connecting to
map-reduce job tracker at: localhost:9001
09/06/10 02:34:14 INFO
mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
POPackage->POForEach to POJoinPackage
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
before optimization: 2
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: Merged 0 out of
total 1 splittees.
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
after optimization: 2
09/06/10 02:34:16 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:16 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
09/06/10 02:34:16 INFO mapReduceLayer.MapReduceLauncher: 0% complete
09/06/10 02:34:20 INFO mapReduceLayer.MapReduceLauncher: 12% complete
09/06/10 02:34:22 INFO
 ........

mapReduceLayer.MapReduceLauncher: 50% complete
BYTES WRITTEN : 70731111
09/06/10 02:34:38 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:39 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
BYTES WRITTEN : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: 100% complete

09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "hdfs://localhost:9000/tmp/temp-1002982376/tmp-536709268"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "TEST"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Records written : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Bytes written : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Success!

For some reason it writes nothing to the output file.  It's weird, because
I can run the same script in Grunt and get a correct result.

What would you say?  Thanks.

George



Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

I think it's running at last.  I added the variable "HADOOPDIR" and its
value under Build Path / Configure Build Path / Add Variable.

However, something is a little strange about the result.  This is the
message from my console:
09/06/10 02:34:13 INFO executionengine.HExecutionEngine: Connecting to
hadoop file system at: hdfs://localhost:9000
09/06/10 02:34:14 INFO executionengine.HExecutionEngine: Connecting to
map-reduce job tracker at: localhost:9001
09/06/10 02:34:14 INFO
mapReduceLayer.MRCompiler$LastInputStreamingOptimizer: Rewrite:
POPackage->POForEach to POJoinPackage
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
before optimization: 2
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: Merged 0 out of
total 1 splittees.
09/06/10 02:34:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size
after optimization: 2
09/06/10 02:34:16 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:16 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
09/06/10 02:34:16 INFO mapReduceLayer.MapReduceLauncher: 0% complete
09/06/10 02:34:20 INFO mapReduceLayer.MapReduceLauncher: 12% complete
09/06/10 02:34:22 INFO
 ........

mapReduceLayer.MapReduceLauncher: 50% complete
BYTES WRITTEN : 70731111
09/06/10 02:34:38 INFO mapReduceLayer.JobControlCompiler: Setting up single
store job
09/06/10 02:34:39 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
BYTES WRITTEN : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: 100% complete

09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "hdfs://localhost:9000/tmp/temp-1002982376/tmp-536709268"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Successfully stored
result in: "TEST"
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Records written : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Bytes written : 0
09/06/10 02:36:40 INFO mapReduceLayer.MapReduceLauncher: Success!

For some reason it writes nothing to the output file.  It's weird, because
I can run the same script in Grunt and get a correct result.

What would you say?  Thanks.

George


2009/6/10 Alan Gates <ga...@yahoo-inc.com>

> You are running in map reduce mode, but you are not attaching to your
> hadoop cluster.  It's running locally.  That's what the "Connecting to
> hadoop file system at file:///" means.  If you were connecting to a cluster
> it would say "hdfs://yournamenode" instead of "file:///".  Is the
> directory containing your hadoop-site.xml in your classpath when executing
> the pig command?  See
> http://hadoop.apache.org/pig/docs/r0.2.0/tutorial.html, the section
> "Running the Pig Scripts in Hadoop Mode".
>
> Alan.

Re: Error on running pig-embedded Java code

Posted by Alan Gates <ga...@yahoo-inc.com>.

You are running in map reduce mode, but you are not attaching to your
hadoop cluster.  It's running locally.  That's what the "Connecting
to hadoop file system at file:///" means.  If you were connecting to a
cluster it would say "hdfs://yournamenode" instead of "file:///".
Is the directory containing your hadoop-site.xml in your classpath
when executing the pig command?  See
http://hadoop.apache.org/pig/docs/r0.2.0/tutorial.html, the section
"Running the Pig Scripts in Hadoop Mode".

Alan.
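
One way to answer the classpath question from inside Eclipse is to ask the
JVM directly whether it can see hadoop-site.xml.  The following is only a
minimal sketch, not code from the Pig or Hadoop distributions: the class name
HadoopConfCheck and the helper locate() are made up for illustration, and it
assumes that Hadoop picks up hadoop-site.xml as an ordinary classpath
resource.

```java
import java.net.URL;

public class HadoopConfCheck {
    // Hypothetical helper: returns the location of a classpath resource,
    // or null if this JVM cannot see it.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = HadoopConfCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        URL site = locate("hadoop-site.xml");
        if (site == null) {
            System.out.println("hadoop-site.xml is NOT on the classpath");
        } else {
            System.out.println("hadoop-site.xml found at " + site);
        }
    }
}
```

Run it with the same Eclipse run configuration as the embedded Pig program.
If it reports that the file is not on the classpath, the conf directory is
probably missing from that configuration.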

On Jun 9, 2009, at 11:03 PM, George Pang wrote:

> I think it's not in mapreduce mode, because I got the same message
> again:
> INFO executionengine.HExecutionEngine: Connecting to hadoop file  
> system at:
> file:///
> 09/06/09 21:18:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId=
>
> George

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

I think it's not in mapreduce mode, because I got the same message again:
INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
file:///
09/06/09 21:18:00 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=

George
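
The URI after "Connecting to hadoop file system at:" in the log shows where
the job is going: file:/// is the local disk and hdfs://... is a cluster.
As a small sketch of that check (the class FsModeCheck and the method isHdfs
are hypothetical names, not part of Pig), the scheme can be read with
java.net.URI:

```java
import java.net.URI;

public class FsModeCheck {
    // Hypothetical helper: true when the file system URI copied from the
    // log line uses the hdfs scheme rather than the local file scheme.
    static boolean isHdfs(String fileSystemUri) {
        String scheme = URI.create(fileSystemUri).getScheme();
        return "hdfs".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        System.out.println(isHdfs("file:///"));              // local file system
        System.out.println(isHdfs("hdfs://localhost:9000")); // HDFS namenode
    }
}
```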

2009/6/9 George Pang <p0...@gmail.com>

> Now I can run id.hadoop (from the official tutorial
> http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html) as an embedded
> Java program, and I can get the result from HDFS.  But one line of the
> console output before the "Success!" reads:
> WARN mapReduceLayer.MapReduceLauncher: Jobs not found in the JobClient.
> Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes
> instead of Hadoop LocalExecution
>
> What does it mean, and does it matter?  Is my program running in map-reduce
> mode at all?  Thanks for any ideas!
>
> George

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

Now I can run id.hadoop (from the official tutorial
http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html) as an embedded
Java program, and I can get the result from HDFS.  But one line of the
console output before the "Success!" reads:
WARN mapReduceLayer.MapReduceLauncher: Jobs not found in the JobClient.
Please try to use Local, Hadoop Distributed or Hadoop MiniCluster modes
instead of Hadoop LocalExecution

What does it mean, and does it matter?  Is my program running in map-reduce
mode at all?  Thanks for any ideas!

George


2009/6/3 George Pang <p0...@gmail.com>

> Hi Ankur,
> Everything runs from the command line; the error only happens when I use
> Eclipse.  My Eclipse version is 3.4.2.  When I run the embedded Java
> program, it gives me the error
>  "INFO executionengine.HExecutionEngine: Connecting to hadoop file system
> at: file:/// INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker, sessionId="
>
> The environment variables are set.  Does it have something to do with where
> the data files are put?  Thank you.
>
> George

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

Hi Ankur,
Everything runs from the command line; the error only happens when I use
Eclipse.  My Eclipse version is 3.4.2.  When I run the embedded Java
program, it gives me the error
 "INFO executionengine.HExecutionEngine: Connecting to hadoop file system
at: file:/// INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId="

The environment variables are set.  Does it have something to do with where
the data files are put?  Thank you.

George

2009/6/3 Ankur Goel <ga...@yahoo-inc.com>

> Make sure you have the following parameters set:-
>
> PIGDIR=your/pig/dir
>
> # you will need to set this, else pig assumes the version to be 17
> # and may not be able to find/connect your namenode/jobtracker
> PIG_HADOOP_VERSION=18
>
> HADOOPDIR=your/hadoop/dir/conf
>
> PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR
>
> Also make sure you have yahoo specific lines at the bottom commented out in
> pig.properties
> under PIGDIR/conf.
>
> -Ankur
>
> ----- Original Message -----
> From: "George Pang" <p0...@gmail.com>
> To: pig-user@hadoop.apache.org
> Sent: Wednesday, June 3, 2009 6:05:22 AM GMT +05:30 Chennai, Kolkata,
> Mumbai, New Delhi
> Subject: Re: Error on running pig-embedded Java code
>
> Any one trying to answer this one?
> Thanks
>
> George
>
> 2009/5/30 George Pang <p0...@gmail.com>
>
> > Dear users,
> >
> > I compiled and ran the pig-embedded Java code from the "pig quick start"
> > example on Eclipse.  I got the following error:
> >
> > INFO executionengine.HExecutionEngine: Connecting to hadoop file system
> at:
> > file:///
> > INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker,
> > sessionId=
> >
> > Obviously it can't find the HDFS or Hadoop. But I have set the
> > PIG_CLASSPATH as
> >
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
> >  and other environments under Run Configurations / Environment
> > Is there anything I forgot to do? Any idea is much appreciated!
> >
> > Pig: 0.1.1
> > Hadoop: 0.18.3
> > Eclipse: 3.4.2
> >
> > George
> >
>

Re: Error on running pig-embedded Java code

Posted by Ankur Goel <ga...@yahoo-inc.com>.

Make sure you have the following parameters set:-

PIGDIR=your/pig/dir

# you will need to set this, else pig assumes the version to be 17
# and may not be able to find/connect your namenode/jobtracker
PIG_HADOOP_VERSION=18

HADOOPDIR=your/hadoop/dir/conf

PIG_CLASSPATH=$PIGDIR/pig.jar:$HADOOPDIR

Also make sure the Yahoo-specific lines at the bottom of pig.properties
(under PIGDIR/conf) are commented out.
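
A quick way to confirm that these variables are actually visible to the JVM
that Eclipse launches is sketched below. This is only an illustration: the
class name PigEnvCheck is made up for this example, and it uses nothing but
the standard System.getenv() call, not any Pig API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Pig): reports which of the variables
// listed above are missing or empty in a given environment.
public class PigEnvCheck {

    // The four variable names mentioned in this message.
    static final String[] REQUIRED = {
        "PIGDIR", "PIG_HADOOP_VERSION", "HADOOPDIR", "PIG_CLASSPATH"
    };

    // Returns the names from REQUIRED that are unset or blank in env,
    // in the same order as REQUIRED.
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<String>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // System.getenv() returns the environment of the running JVM,
        // i.e. whatever the Eclipse run configuration passed in.
        List<String> notSet = missing(System.getenv());
        if (notSet.isEmpty()) {
            System.out.println("All expected variables are set.");
        } else {
            System.out.println("Not set in this JVM: " + notSet);
        }
    }
}
```

Running it with the same Eclipse run configuration as the embedded program
shows what that launch really sees, which may differ from the shell.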

-Ankur

----- Original Message -----
From: "George Pang" <p0...@gmail.com>
To: pig-user@hadoop.apache.org
Sent: Wednesday, June 3, 2009 6:05:22 AM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: Error on running pig-embedded Java code

Any one trying to answer this one?
Thanks

George

2009/5/30 George Pang <p0...@gmail.com>

> Dear users,
>
> I compiled and ran the pig-embedded Java code from the "pig quick start"
> example on Eclipse.  I got the following error:
>
> INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
> file:///
> INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
> sessionId=
>
> Obviously it can't find the HDFS or Hadoop. But I have set the
> PIG_CLASSPATH as
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
>  and other environments under Run Configurations / Environment
> Is there anything I forgot to do? Any idea is much appreciated!
>
> Pig: 0.1.1
> Hadoop: 0.18.3
> Eclipse: 3.4.2
>
> George
>

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

Is anyone able to answer this one?
Thanks

George

2009/5/30 George Pang <p0...@gmail.com>

> Dear users,
>
> I compiled and ran the pig-embedded Java code from the "pig quick start"
> example on Eclipse.  I got the following error:
>
> INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
> file:///
> INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
> sessionId=
>
> Obviously it can't find the HDFS or Hadoop. But I have set the
> PIG_CLASSPATH as
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
>  and other environments under Run Configurations / Environment
> Is there anything I forgot to do? Any idea is much appreciated!
>
> Pig: 0.1.1
> Hadoop: 0.18.3
> Eclipse: 3.4.2
>
> George
>

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

My Java code is the example "idhadoop.java" from
http://hadoop.apache.org/pig/docs/r0.2.0/quickstart.html
So I think it ran in Hadoop mode.

George


2009/5/30 zhang jianfeng <zj...@gmail.com>

> From your logs I can see you run the pig in local model rather than hadoop
> model
>
>
>
> On Sun, May 31, 2009 at 10:43 AM, George Pang <p0...@gmail.com> wrote:
>
> > Dear users,
> >
> > I compiled and ran the pig-embedded Java code from the "pig quick start"
> > example on Eclipse.  I got the following error:
> >
> > INFO executionengine.HExecutionEngine: Connecting to hadoop file system
> at:
> > file:///
> > INFO jvm.JvmMetrics: Initializing JVM Metrics with
> processName=JobTracker,
> > sessionId=
> >
> > Obviously it can't find the HDFS or Hadoop. But I have set the
> > PIG_CLASSPATH
> > as
> >
> >
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
> >  and other environments under Run Configurations / Environment
> > Is there anything I forgot to do? Any idea is much appreciated!
> >
> > Pig: 0.1.1
> > Hadoop: 0.18.3
> > Eclipse: 3.4.2
> >
> > George
> >
>

Re: Error on running pig-embedded Java code

Posted by zhang jianfeng <zj...@gmail.com>.

From your logs I can see that you are running Pig in local mode rather than
Hadoop mode.
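
The reasoning can be read off the URI in the "Connecting to hadoop file
system at:" line: file:/// is the local file system, whereas a cluster would
show an hdfs:// address. A minimal sketch of that check follows; the class
name FsModeCheck is hypothetical and it uses only java.net.URI, not Pig or
Hadoop classes.

```java
import java.net.URI;

// Hypothetical helper: classifies the file system URI printed in the log.
public class FsModeCheck {

    // True when the URI uses the "file" scheme, i.e. the local file system.
    static boolean isLocalFileSystem(String fsUri) {
        String scheme = URI.create(fsUri.trim()).getScheme();
        return "file".equalsIgnoreCase(scheme);
    }

    public static void main(String[] args) {
        // The value reported in this thread.
        System.out.println("file:/// is local: " + isLocalFileSystem("file:///"));
        // An example of what an HDFS address looks like.
        System.out.println("hdfs://localhost:9000 is local: "
                + isLocalFileSystem("hdfs://localhost:9000"));
    }
}
```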



On Sun, May 31, 2009 at 10:43 AM, George Pang <p0...@gmail.com> wrote:

> Dear users,
>
> I compiled and ran the pig-embedded Java code from the "pig quick start"
> example on Eclipse.  I got the following error:
>
> INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
> file:///
> INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
> sessionId=
>
> Obviously it can't find the HDFS or Hadoop. But I have set the
> PIG_CLASSPATH
> as
>
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
>  and other environments under Run Configurations / Environment
> Is there anything I forgot to do? Any idea is much appreciated!
>
> Pig: 0.1.1
> Hadoop: 0.18.3
> Eclipse: 3.4.2
>
> George
>

Re: Error on running pig-embedded Java code

Posted by George Pang <p0...@gmail.com>.

This is how I run it in Eclipse:
1) Create a project and put idhadoop.java under it
2) Use Build Path / Configure Build Path / Add External JARs ($PIGDIR)
3) In Run Configurations, under "Main", set Main Class to idhadoop

Then I get the error message as described.
Please let me know if you see anything wrong or missing in this process.
Thank you.
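
One way to check whether the Hadoop conf directory is really on the classpath
of the Eclipse launch is to look for its site file as a classpath resource.
The sketch below assumes the Hadoop 0.18 file name hadoop-site.xml; the class
name HadoopConfCheck is made up for this example and it relies only on
standard JDK calls.

```java
import java.net.URL;

// Hypothetical diagnostic: looks up a resource on the classpath of the
// running JVM and prints the effective classpath.
public class HadoopConfCheck {

    // Returns the URL of the named resource, or null if the class loader
    // cannot find it anywhere on the classpath.
    static URL locate(String resourceName) {
        return HadoopConfCheck.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        // Assumption: the cluster settings live in hadoop-site.xml
        // inside the Hadoop conf directory.
        URL site = locate("hadoop-site.xml");
        if (site == null) {
            System.out.println("hadoop-site.xml is NOT on the classpath");
        } else {
            System.out.println("hadoop-site.xml found at " + site);
        }
        // Shows what the launch configuration actually put on the classpath.
        System.out.println("java.class.path = "
                + System.getProperty("java.class.path"));
    }
}
```

If the file is not found, the conf directory listed in PIG_CLASSPATH has
probably not reached the JVM, and adding it to the project's build path or
to the run configuration's classpath would be the thing to try.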

George

2009/5/30 George Pang <p0...@gmail.com>

> Dear users,
>
> I compiled and ran the pig-embedded Java code from the "pig quick start"
> example on Eclipse.  I got the following error:
>
> INFO executionengine.HExecutionEngine: Connecting to hadoop file system at:
> file:///
> INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker,
> sessionId=
>
> Obviously it can't find the HDFS or Hadoop. But I have set the
> PIG_CLASSPATH as
> /usr/share/cloudera/apps/pig-0.1.1/pig-0.1.1-core.jar:/usr/share/cloudera/apps/hadoop-0.18.3-patched/conf
>  and other environments under Run Configurations / Environment
> Is there anything I forgot to do? Any idea is much appreciated!
>
> Pig: 0.1.1
> Hadoop: 0.18.3
> Eclipse: 3.4.2
>
> George
>