Posted to user@pig.apache.org by Lance Riedel <la...@dotspots.com> on 2009/04/29 21:20:46 UTC

Hadoop 19.1 compatibility

Hi,
I'm trying to find information on this, and sorry if I'm missing
something obvious here, but my searches are coming up with nothing
(except one mention of a patch in an earlier thread, but no location,
etc.). Also, I checked the jira for patches.

Here is the earlier thread: http://mail-archives.apache.org/mod_mbox/hadoop-pig-user/200904.mbox/%3C687A1490-E8C8-47C5-A4D4-2E50F03C1E9D@yahoo-inc.com%3E

Thanks!
Lance

Re: Hadoop 19.1 compatibility

Posted by Lance Riedel <la...@dotspots.com>.
Thanks Olga-

Have you had any issues using this patch with a static Hadoop cluster
(not using HOD)? I'm getting an infinite loop when starting pig (in
grunt or when passing it a script; it doesn't matter).

I have tried to take HOD out of the configuration per the instructions,
so I'm a little confused by the references to HOD in the output below.



 >>>>> Infinite loop ->
When I start pig, I get the following  (infinite loop):

2009-04-29 16:55:27,220 [main] WARN  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Failed to create HOD configuration directory - /tmp/PigHod.domU-12-31-38-00-C4-31.dotspots.3204342197969613Retrying...
2009-04-29 16:55:27,267 [main] WARN  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Failed to create HOD configuration directory - /tmp/PigHod.domU-12-31-38-00-C4-31.dotspots.3204342261033613Retrying...
2009-04-29 16:55:27,312 [main] WARN  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Failed to create HOD configuration directory - /tmp/PigHod.domU-12-31-38-00-C4-31.dotspots.3204342307923613Retrying...
2009-04-29 16:55:27,357 [main] WARN  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Failed to create HOD configuration directory - /tmp/PigHod.domU-12-31-38-00-C4-31.dotspots.3204342352914613Retrying...
... and on and on (infinite loop)
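
One cause of the warning above that is cheap to rule out is that the user running pig cannot create a directory under /tmp at all. The following is only a sketch: can_create_dir and the PigHodCheck prefix are made-up names, not part of Pig, and a passing check does not explain why Pig takes the HOD code path in the first place.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig): succeed if a fresh directory can be
# created under the given parent, e.g. /tmp from the warning above.
can_create_dir() {
    parent=$1
    # mktemp -d creates a uniquely named directory from the template
    if dir=$(mktemp -d "$parent/PigHodCheck.XXXXXX" 2>/dev/null); then
        rmdir "$dir"
        return 0
    fi
    return 1
}

if can_create_dir /tmp; then
    echo "/tmp: directory creation works"
else
    echo "/tmp: cannot create directories (check permissions and free space)"
fi
```

If this reports that creation works, the problem is more likely in how Pig decides to use HOD than in the filesystem.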


More info:
I am using the hadoop19.jar from the jira in my pig/lib dir.

/////////////////Environment:
export JAVA_HOME=/usr/java/devalt
export HADOOP_HOME=/mnt/dist/app/hadoop-0.19.1
export HADOOPDIR=/mnt/dist/app/hadoop-0.19.1/conf
export PIGDIR=/dist/app/pig
export PIG_CLASSPATH=/dist/app/pig/pig.jar
export PIG_HADOOP_VERSION=19
export PATH=/dist/app/apache-ant-1.7.1/bin:$PATH:/mnt/dist/app/hadoop-0.19.1/bin:$PIGDIR/bin
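
The exports above mix /mnt/dist/app and /dist/app prefixes, so it may be worth confirming that each directory exists on the machine where pig is started. A minimal sketch: check_dir is a hypothetical helper, and the list of variables is simply taken from the exports above.

```shell
#!/bin/sh
# Hypothetical sanity check, not part of Pig or Hadoop.
# Prints one status line for a named directory; returns 1 if it is missing.
check_dir() {
    name=$1
    path=$2
    if [ -d "$path" ]; then
        echo "ok       $name=$path"
    else
        echo "MISSING  $name=$path"
        return 1
    fi
}

status=0
# ${VAR:-} expands to an empty string when the variable is unset
check_dir JAVA_HOME   "${JAVA_HOME:-}"   || status=1
check_dir HADOOP_HOME "${HADOOP_HOME:-}" || status=1
check_dir HADOOPDIR   "${HADOOPDIR:-}"   || status=1
check_dir PIGDIR      "${PIGDIR:-}"      || status=1

[ "$status" -eq 0 ] || echo "One or more directories are missing or unset."
```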


////////////////// pig.properties

# Pig configuration file. All values can be overwritten by command line arguments.
# see bin/pig -help

# log4jconf log4j configuration file
# log4jconf=./conf/log4j.properties

# brief logging (no timestamps)
brief=false

# clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used.
cluster=ec2-xx-xx-xx-xx.compute-1.amazonaws.com:54311    # added this later, nothing changed
fs.default.name=hdfs://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:54310   # added this later, nothing changed
mapred.job.tracker=ec2-xx-xx-xx-xx.compute-1.amazonaws.com:54311   # added this later, nothing changed
#debug level, INFO is default
debug=DEBUG

# a file that contains pig script
#file=

# load jarfile, colon separated
#jar=

#verbose print all log messages to screen (default to print only INFO and above to screen)
verbose=false

#exectype local|mapreduce, mapreduce is default
exectype=mapreduce
# hod related properties
#ssh.gateway
#hod.expect.root
#hod.expect.uselatest
#hod.command
#hod.config.dir
#hod.param


#Do not spill temp files smaller than this size (bytes)
pig.spill.size.threshold=5000000
#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
#This should help reduce the number of files being spilled.
pig.spill.gc.activation.size=40000000
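
A note on the "# added this later" remarks on the cluster, fs.default.name and mapred.job.tracker lines: if they are literally in the file (they may have been added only for this email), and if the file is read with java.util.Properties (an assumption about how Pig loads it), then the remark becomes part of the value, because that format only treats '#' as a comment marker at the start of a line. The sketch below lists such lines. find_inline_comments is a hypothetical helper, and the conf/pig.properties location under PIGDIR is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print "line-number:line" for every key=value line that
# is followed by whitespace and a '#' remark. Lines whose first non-blank
# character is '#' or '!' are real comments and are skipped.
find_inline_comments() {
    grep -nE '^[[:space:]]*[^#![:space:]][^=]*=.*[[:space:]]#' "$1"
}

# Assumed location of the properties file; adjust as needed.
props="${PIGDIR:-.}/conf/pig.properties"
if [ -f "$props" ]; then
    find_inline_comments "$props" || echo "no inline remarks found in $props"
fi
```

Any line it prints is a candidate for moving the remark onto its own line above the property.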

On Apr 29, 2009, at 12:57 PM, Olga Natkovich wrote:

> Hi,
>
> We have patches available for both Hadoop 19 and Hadoop 20 that you  
> can
> apply to the code in SVN and build your own pig.jar
>
> https://issues.apache.org/jira/browse/PIG-573
> https://issues.apache.org/jira/browse/PIG-660
>
> Olga


RE: Hadoop 19.1 compatibility

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Hi,

We have patches available for both Hadoop 19 and Hadoop 20 that you can
apply to the code in SVN and build your own pig.jar

https://issues.apache.org/jira/browse/PIG-573
https://issues.apache.org/jira/browse/PIG-660

Olga


Re: Hadoop 19.1 compatibility

Posted by Lance Riedel <la...@dotspots.com>.
Found more info-
Jira PIG-573

The last comment was:
Kevin Weil added a comment - 14/Apr/09 01:20 PM
What is the current status of this patch with pig 0.2? Since PIG-563  
went in to 0.20, all that should be necessary is applying this single  
patch to the 0.20 release source, right?

with no response, so if anyone has more info, thanks!
Lance
