Posted to user@hive.apache.org by "kulkarni.swarnim@gmail.com" <ku...@gmail.com> on 2012/07/18 18:31:02 UTC

HADOOP_HOME requirement

Hello,

The Hive documentation states that either HADOOP_HOME should be set or
hadoop should be on the path. However, in some cases where HADOOP_HOME was
not set but hadoop was on the path, I have seen this error pop up:

java.io.IOException: Cannot run program "null/bin/hadoop" (in directory
"/root/swarnim/hive-0.9.0-cern1-SNAPSHOT"): java.io.IOException: error=2,
No such file or directory
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
 at java.lang.Runtime.exec(Runtime.java:593)
 at java.lang.Runtime.exec(Runtime.java:431)
 at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:268)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:208)

Digging into the code in MapRedTask.java, I found the following
(simplified):

String hadoopExec = conf.getVar(HiveConf.ConfVars.HADOOPBIN);
// where HADOOPBIN defaults to System.getenv("HADOOP_HOME") + "/bin/hadoop"
...

Runtime.getRuntime().exec(hadoopExec, env, new File(workDir));

Clearly, if HADOOP_HOME is not set, the command it would try to
execute is "null/bin/hadoop", which is exactly what the exception shows.
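The "null/bin/hadoop" string follows directly from Java's concatenation rules: System.getenv returns null for an unset variable, and concatenating a null reference onto a string produces the literal text "null". A minimal standalone demonstration (the variable name is deliberately one that should not exist in any environment):

```java
public class NullConcatDemo {
    public static void main(String[] args) {
        // getenv returns null when the variable is not set.
        String home = System.getenv("HADOOP_HOME_SURELY_UNSET_XYZ");
        // String concatenation converts a null reference to the text "null",
        // producing the bogus command seen in the stack trace.
        String hadoopExec = home + "/bin/hadoop";
        System.out.println(hadoopExec);
    }
}
```

Run with the variable unset, this prints null/bin/hadoop.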

Has anyone else run into this before? Is this a bug?

Thanks,
-- 
Swarnim

Re: HADOOP_HOME requirement

Posted by "kulkarni.swarnim@gmail.com" <ku...@gmail.com>.
My main concern here was that HADOOP_HOME has been deprecated since Hadoop 0.23,
so I was hoping it could actually function as documented.

FWIW, I found this bug[1] that addresses exactly this issue. The attached
patch makes HADOOP_HOME not required and auto-detects hadoop from the path.
This seems to have been patched in 0.10.0.

[1] https://issues.apache.org/jira/browse/HIVE-2757
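The PATH-based auto-detection that the patch describes can be sketched roughly as follows. This is only an illustration of the idea, not the actual HIVE-2757 code, and findOnPath is a hypothetical helper name:

```java
import java.io.File;

public class PathLookupSketch {
    // Walk the PATH entries and return the first executable with the
    // given name, or null if none is found.
    static String findOnPath(String program) {
        String path = System.getenv("PATH");
        if (path == null) {
            return null;
        }
        for (String dir : path.split(File.pathSeparator)) {
            File candidate = new File(dir, program);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate.getAbsolutePath();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Prefer HADOOP_HOME when set; otherwise fall back to a PATH lookup.
        String home = System.getenv("HADOOP_HOME");
        String hadoopExec = (home != null)
                ? home + "/bin/hadoop"
                : findOnPath("hadoop");
        System.out.println(hadoopExec);
    }
}
```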




-- 
Swarnim

Re: HADOOP_HOME requirement

Posted by "kulkarni.swarnim@gmail.com" <ku...@gmail.com>.
Hm. Yeah, I tried it out with a few versions (0.7 -> 0.9) and it seems like
they all do. Maybe we should just update the documentation then?



-- 
Swarnim

Re: HADOOP_HOME requirement

Posted by Vinod Singh <vi...@vinodsingh.com>.
We are using Hive 0.7.1, and there HADOOP_HOME must be exported so that it
is available as an environment variable.

Thanks,
Vinod


Re: HADOOP_HOME requirement

Posted by Nitin Pawar <ni...@gmail.com>.
From the Hive trunk, I can only see this. I am not 100% sure, but I
remember always setting up HADOOP_HOME:

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java


      String hadoopExec = conf.getVar(HiveConf.ConfVars.HADOOPBIN);

this change was introduced in 0.8

from http://svn.apache.org/repos/asf/hive/branches/branch-0.9/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java

HADOOPBIN("hadoop.bin.path", System.getenv("HADOOP_HOME") + "/bin/hadoop"),
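One thing worth noting about that default: the expression is evaluated eagerly when the enum is initialized, so an unset HADOOP_HOME is baked into the default value as the literal text "null". A simplified sketch of the mechanism (this enum is an illustration, not the real HiveConf, and the variable name is chosen to be unset):

```java
public class ConfDefaultSketch {
    enum ConfVars {
        // The default value is computed once, at enum initialization time;
        // if the environment variable is unset, getenv returns null and the
        // concatenation yields the literal string "null/bin/hadoop".
        HADOOPBIN("hadoop.bin.path",
                  System.getenv("HADOOP_HOME_SURELY_UNSET_XYZ") + "/bin/hadoop");

        final String varname;
        final String defaultVal;

        ConfVars(String varname, String defaultVal) {
            this.varname = varname;
            this.defaultVal = defaultVal;
        }
    }

    public static void main(String[] args) {
        System.out.println(ConfVars.HADOOPBIN.defaultVal);
    }
}
```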




-- 
Nitin Pawar

Re: HADOOP_HOME requirement

Posted by "kulkarni.swarnim@gmail.com" <ku...@gmail.com>.
0.9



-- 
Swarnim

Re: HADOOP_HOME requirement

Posted by Nitin Pawar <ni...@gmail.com>.
This also depends on which version of Hive you are using.




-- 
Nitin Pawar

Re: HADOOP_HOME requirement

Posted by "kulkarni.swarnim@gmail.com" <ku...@gmail.com>.
Thanks for your reply, Nitin.

OK, so you mean we always need to set HADOOP_HOME regardless of whether
"hadoop" is on the path or not. Correct?

I am a little confused, because that contradicts what's mentioned here[1].

[1]
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive


Thanks,

On Wed, Jul 18, 2012 at 11:59 AM, Nitin Pawar <ni...@gmail.com>wrote:

> This is not a bug.
>
> even if hadoop was path, hive does not use it.
> Hive internally uses HADOOP_HOME in the code base. So you will always need
> to set that for hive.
> Where as for HADOOP clusters, HADOOP_HOME is deprecated but hive still
> needs it.
>
> Don't know if that answers your question
>
> Thanks,
> Nitin
>
>
> On Wed, Jul 18, 2012 at 10:01 PM, kulkarni.swarnim@gmail.com <
> kulkarni.swarnim@gmail.com> wrote:
>
>> Hello,
>>
>> The hive documentation states that either HADOOP_HOME should be set or
>> hadoop should be on the path. However for some cases, where HADOOP_HOME was
>> not set but hadoop was on path, I have seen this error pop up:
>>
>> java.io.IOException: *Cannot run program "null/bin/hadoop" *(in
>> directory "/root/swarnim/hive-0.9.0-cern1-SNAPSHOT"): java.io.IOException:
>> error=2, No such file or directory
>> [stack trace snipped; identical to the one in the original message]
>>
>> Digging into the code in MapRedTask.java, I found the following
>> (simplified):
>>
>> String *hadoopExec* = System.getenv("HADOOP_HOME") + "/bin/hadoop";
>> ...
>>
>> Runtime.getRuntime().exec(*hadoopExec*, env, new File(workDir));
>>
>> Clearly, if HADOOP_HOME is not set, the command that it would try to
>> execute is "null/bin/hadoop" which is exactly the exception I am getting.
>>
>> Has anyone else run into this before? Is this a bug?
>>
>> Thanks,
>> --
>> Swarnim
>>
>
>
>
> --
> Nitin Pawar
>
>


-- 
Swarnim

Re: HADOOP_HOME requirement

Posted by Nitin Pawar <ni...@gmail.com>.
This is not a bug.

Even if hadoop is on the path, hive does not use it.
Hive internally uses HADOOP_HOME in the code base, so you will always need
to set that for hive.
For Hadoop clusters HADOOP_HOME is deprecated, but hive still
needs it.

Don't know if that answers your question.

Thanks,
Nitin
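
[Editor's note: a minimal sketch of the workaround described above. The
install path /usr/lib/hadoop is an assumption, not from the thread; adjust
it to your environment.]

```shell
# Point Hive at the Hadoop installation explicitly; having the hadoop
# binary on PATH alone is not enough for this code path in Hive.
export HADOOP_HOME=/usr/lib/hadoop   # assumed install location
export PATH="$HADOOP_HOME/bin:$PATH"
hive
```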

On Wed, Jul 18, 2012 at 10:01 PM, kulkarni.swarnim@gmail.com <
kulkarni.swarnim@gmail.com> wrote:

> Hello,
>
> The hive documentation states that either HADOOP_HOME should be set or
> hadoop should be on the path. However for some cases, where HADOOP_HOME was
> not set but hadoop was on path, I have seen this error pop up:
>
> java.io.IOException: *Cannot run program "null/bin/hadoop" *(in directory
> "/root/swarnim/hive-0.9.0-cern1-SNAPSHOT"): java.io.IOException: error=2,
> No such file or directory
> [stack trace snipped; identical to the one in the original message]
>
> Digging into the code in MapRedTask.java, I found the following
> (simplified):
>
> String *hadoopExec* = System.getenv("HADOOP_HOME") + "/bin/hadoop";
> ...
>
> Runtime.getRuntime().exec(*hadoopExec*, env, new File(workDir));
>
> Clearly, if HADOOP_HOME is not set, the command that it would try to
> execute is "null/bin/hadoop" which is exactly the exception I am getting.
>
> Has anyone else run into this before? Is this a bug?
>
> Thanks,
> --
> Swarnim
>
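
[Editor's note: the failure mode quoted above can be reproduced in
isolation. This is a standalone sketch, not Hive code: Java string
concatenation renders a null reference as the literal text "null", which
is how the bogus "null/bin/hadoop" command is formed when HADOOP_HOME is
unset.]

```java
// Minimal demonstration of how an unset HADOOP_HOME becomes the
// command "null/bin/hadoop". Not Hive code.
public class NullHadoopHome {
    public static void main(String[] args) {
        // System.getenv("HADOOP_HOME") returns null when the variable
        // is unset; a null literal stands in for that case here.
        String hadoopHome = null;
        // Java string concatenation converts a null reference to the
        // four characters "null" rather than throwing.
        String hadoopExec = hadoopHome + "/bin/hadoop";
        System.out.println(hadoopExec); // prints "null/bin/hadoop"
    }
}
```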



-- 
Nitin Pawar