Posted to dev@pig.apache.org by Craig Macdonald <cr...@dcs.gla.ac.uk> on 2007/11/28 20:38:09 UTC
How to HOD?
Hi all,
I've been trying to set up Pig using Hadoop on Demand (HOD). Using some
hackery, my incantation now looks like this:
PATH=/users/tr.craigm/OF_tools/python/bin/:$PATH ROOT=$PWD
scripts/pig.pl -Dlog4j.level=debug -Dhod.server=local
-Dhod.expect.root=$PWD -Dhod.command=hod/bin/hod
-Dhod.expect.uselatest=hodrc/released -Dyinst.cluster=
-Dhadoop.root.logger=DEBUG,console --cluster hodrc
(the name of my hodrc file is hodrc).
However, the HOD connection code in PigContext mystifies me. Does it
correspond to any released version of HOD?
It seems to connect to HOD, and parse the response.
PIG-18 (https://issues.apache.org/jira/browse/PIG-18) states that Pig
needs to be fixed to work with hod 4.
So I presume that Pig does not work with the HOD version
hod-open-4.tar.gz attached to
https://issues.apache.org/jira/browse/HADOOP-1301
However, it doesn't look like Pig works with the other version of HOD
attached to the same JIRA issue either: hod.0.2.2.tar.gz
PigContext.java looks for output from HOD in the form of lines starting:
hdfsUI:
hdfs:
mapredUI:
mapred:
hadoopConf:
I can't find any source in either version of HOD that resembles this.
Does anyone know whether Pig currently works with any openly
available version of HOD?
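For illustration, the prefix matching described above could be sketched roughly as follows. This is a Python approximation, not the Java in PigContext: the function name parse_hod_output and the sample values in the usage lines are made up, and only the five prefixes come from the list above.

```python
# Prefixes that Pig is described as looking for in HOD's output.
EXPECTED_KEYS = ("hdfsUI", "hdfs", "mapredUI", "mapred", "hadoopConf")


def parse_hod_output(lines):
    """Collect 'key: value' lines for the expected keys.

    Returns (found, missing): a dict of the keys that were seen and a
    list of the expected keys that never appeared.
    """
    found = {}
    for line in lines:
        # Split on the first colon only, so host:port values stay intact.
        key, sep, value = line.strip().partition(":")
        if sep and key in EXPECTED_KEYS:
            found[key] = value.strip()
    missing = [k for k in EXPECTED_KEYS if k not in found]
    return found, missing


# Hypothetical output, invented purely to exercise the function.
sample = ["hdfs: host1:50000", "mapred: host2:50020", "some other line"]
found, missing = parse_hod_output(sample)
```

A non-empty missing list is what one would expect to see if the installed HOD never prints some of these lines.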
Thanks in advance
Craig
Re: How to HOD?
Posted by Craig Macdonald <cr...@dcs.gla.ac.uk>.
Comments on the expect script:
Where the script spawns hod:
spawn -ignore {SIGHUP}
/users/grad/craigm/src/pig/FROMApache/hod0.2/hod.0.2.2/bin/hod -n
[lindex $args 0 ] [lindex $args 1] [lindex $args 2] [lindex $args 3]
[lindex $args 4] [lindex $args 5] [lindex $args 6 ] [lindex $args 7]
[lindex $args 8] [lindex $args 9] [lindex $args 10]
If there are fewer than 11 arguments to the expect script, expect will
still create the extra argv entries when calling exec(3). This confuses
the Python command-line parser, which expects 0 leftover
command-line arguments: empty entries still count towards
"len(args) > 0". I fixed this by patching around line 71 of
hod.0.2.2/hodlib/Common/cfg.py:
   options, args = op.parse_args(argv[1:])
+  argsNoblanks = []
+  for a in args:
+    if len(a) > 0:
+      argsNoblanks.append(a)
-  if len(args) > 0:
+  if len(argsNoblanks) > 0:
+    print "\nunrecognised argument(s): "
+    print argsNoblanks
     op.print_help()
     sys.exit(1)
A better solution could probably be made by fixing the expect script,
but I tried and failed.
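The behaviour can be reproduced in isolation with a small sketch. It uses optparse under Python 3 rather than the Python 2 of hod 0.2.2, and the --example option and the function name are hypothetical, chosen only so the parser has something to consume. It is not the hod parser itself.

```python
from optparse import OptionParser


def parse_ignoring_blanks(argv):
    """Parse argv and drop empty leftover arguments.

    Returns (options, raw_leftover, non_blank_leftover) so the
    difference between the two leftover lists is visible.
    """
    parser = OptionParser()
    # Hypothetical option, not a real hod flag.
    parser.add_option("--example", dest="example")
    options, args = parser.parse_args(argv)
    # Empty strings are still positional arguments as far as optparse
    # is concerned, so filter them out before checking for leftovers.
    non_blank = [a for a in args if len(a) > 0]
    return options, args, non_blank


# Two empty entries, as produced by the padded spawn line.
options, raw, non_blank = parse_ignoring_blanks(["--example", "x", "", ""])
```

Here raw still holds the two empty strings, which is what trips the "len(args) > 0" check, while non_blank is empty.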
There are some rather odd Yahoo-specific bits in places:
@@ -349,9 +350,9 @@
     }
     private String fixUpDomain(String hostPort) throws UnknownHostException {
         String parts[] = hostPort.split(":");
-        if (parts[0].indexOf('.') == -1) {
-            parts[0] = parts[0] + ".inktomisearch.com";
-        }
+        //if (parts[0].indexOf('.') == -1) {
+        //    parts[0] = parts[0] + ".inktomisearch.com";
+        //}
         InetAddress.getByName(parts[0]);
         return parts[0] + ":" + parts[1];
     }
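Rather than commenting the block out, the suffix could in principle be made optional. The following is only a Python sketch of that idea: default_domain is a hypothetical parameter, not an existing Pig setting, and the DNS lookup done by InetAddress.getByName in the Java is deliberately left out.

```python
def fix_up_domain(host_port, default_domain=None):
    """Append default_domain to an unqualified host in 'host:port'.

    With default_domain unset the input is returned unchanged, which
    matches the effect of commenting out the hard-coded suffix.
    """
    host, sep, port = host_port.partition(":")
    if not sep:
        raise ValueError("expected host:port, got %r" % host_port)
    if "." not in host and default_domain:
        host = host + "." + default_domain.lstrip(".")
    return host + ":" + port
```

Called as fix_up_domain("node1:50020") it leaves the name alone; a site that needs a suffix would pass its own domain explicitly.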
Also, a NullPointerException occurs at:
@@ -250,7 +250,7 @@
         cmd.append('/');
         cmd.append(System.getProperty("hod.command"));
         //String cmd = System.getProperty("hod.command", "/home/breed/startHOD.expect");
-        String cluster = System.getProperty("yinst.cluster");
+        String cluster = System.getProperty("yinst.cluster"); //NPE here if property not set
         if (cluster.length() > 0 && !cluster.startsWith("kryptonite")) {
             cmd.append(" --config=");
             cmd.append(System.getProperty("hod.config.dir"));
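The failure mode is a missing property being dereferenced. A guard of the following shape would avoid it. This is a Python sketch of the visible lines only: props stands in for the Java system properties, the function name is hypothetical, and whatever the real code appends after hod.config.dir is not reproduced.

```python
def cluster_config_args(props):
    """Return the ' --config=<dir>' fragment, or '' if no cluster is set.

    props is a plain dict standing in for System.getProperty lookups.
    """
    # Treat a missing property the same as an empty one instead of
    # failing when the value is dereferenced.
    cluster = props.get("yinst.cluster") or ""
    if len(cluster) > 0 and not cluster.startswith("kryptonite"):
        return " --config=" + props.get("hod.config.dir", "")
    return ""
```

With this shape, leaving -Dyinst.cluster off the command line behaves the same as passing it empty, as in the incantation in the original post.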
Thanks
Craig
Benjamin Reed wrote:
> Ah yes, sorry about that. We had a problem with HOD not working well with
> piped inputs and outputs, so we actually use an expect script to interface to
> hod. (We should open an issue on this.)
>
> I'm attaching the script that we use.
>
> ben
RE: How to HOD?
Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Yes, that's the plan. HOD 0.4 has a completely different interface that
would not require us to use expect. Once HOD 0.4 is released, the plan
is to upgrade Pig to work with it, as the bug indicates. This should
happen in January.
Olga
-----Original Message-----
From: Craig Macdonald [mailto:craigm@dcs.gla.ac.uk]
Sent: Wednesday, November 28, 2007 12:04 PM
To: Benjamin Reed
Cc: pig-dev@incubator.apache.org
Subject: Re: How to HOD?
Hi Ben,
Ok, Doh moment from me. Thanks for the script, but the hint was enough
to remind me that there's a similar version already in SVN trunk.
I don't think there's any need for a separate issue, as upgrading to work
with HOD 4 is already an unresolved issue. Presumably an upgraded Pig would
no longer require the expect script (though notably, I don't think hod 4
produces all the required output ;-)
Ta muchly
Craig
Benjamin Reed wrote:
> Ah yes, sorry about that. We had a problem with HOD not working well
> with piped inputs and outputs, so we actually use an expect script to
> interface to hod. (We should open an issue on this.)
>
> I'm attaching the script that we use.
>
> ben
Re: How to HOD?
Posted by Craig Macdonald <cr...@dcs.gla.ac.uk>.
Hi Ben,
Ok, Doh moment from me. Thanks for the script, but the hint was enough
to remind me that there's a similar version already in SVN trunk.
I don't think there's any need for a separate issue, as upgrading to work
with HOD 4 is already an unresolved issue. Presumably an upgraded Pig
would no longer require the expect script (though notably, I don't think
hod 4 produces all the required output ;-)
Ta muchly
Craig
Benjamin Reed wrote:
> Ah yes, sorry about that. We had a problem with HOD not working well with
> piped inputs and outputs, so we actually use an expect script to interface to
> hod. (We should open an issue on this.)
>
> I'm attaching the script that we use.
>
> ben
Re: How to HOD?
Posted by Benjamin Reed <br...@yahoo-inc.com>.
Ah yes, sorry about that. We had a problem with HOD not working well with
piped inputs and outputs, so we actually use an expect script to interface to
hod. (We should open an issue on this.)
I'm attaching the script that we use.
ben