Posted to user@oozie.apache.org by Nirmal Kumar <ni...@impetus.co.in> on 2015/09/29 11:18:36 UTC
Unable to run JOIN query from Hive Action in Oozie
Hi All,
I am unable to run a JOIN query via the Hive action; however, the same JOIN query runs fine from the Hive CLI.
I also tried a simple HQL script via the Hive action, and it ran successfully from the Oozie workflow.
I am using:
* oozie-4.1.0
* Hive 1.0.0
* Hadoop 2.6.0
Here is a snippet of my Hive action:
<action name="WF_4">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>xxx.xxx.xxx.xxx:8032</job-tracker>
        <name-node>hdfs://xxx.xxx.xxx.xxx:8020</name-node>
        <job-xml>hdfs://xxx.xxx.xxx.xxx:8020/user/root/db/WorkFlow12333/hive-site.xml</job-xml>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>default</value>
            </property>
        </configuration>
        <script>testhql.q</script>
    </hive>
    <ok to="WF_5"/>
    <error to="Failure"/>
</action>
I am getting the following logs on Hadoop:
1899 [main] INFO org.apache.hadoop.hive.ql.Driver - Total jobs = 1
1900 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=TimeToSubmit start=1443451614701 end=1443451615873 duration=1172 from=org.apache.hadoop.hive.ql.Driver>
1900 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
1900 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=task.MAPREDLOCAL.Stage-5 from=org.apache.hadoop.hive.ql.Driver>
1909 [main] INFO org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask - Generating plan file file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10005/plan.xml
1909 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
1909 [main] INFO org.apache.hadoop.hive.ql.exec.Utilities - Serializing MapredLocalWork via kryo
1975 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=serializePlan start=1443451615882 end=1443451615948 duration=66 from=org.apache.hadoop.hive.ql.exec.Utilities>
2079 [main] INFO org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask - Executing: /opt/IdwPlatform/hadoop/hadoop-2.6.0//bin/hadoop jar /opt/IdwPlatform/hadoop/hadoopDirs/hadooptmp/nm-local-dir/filecache/761/Clickstream-0.0.1-SNAPSHOT-driver.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -localtask -plan file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10005/plan.xml -jobconffile file:/opt/IdwPlatform/hive/hiveDirs/scratchDirLocal/hive_2015-09-28_20-16-54_706_5027639748153132622-1/-local-10006/jobconf.xml
12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task - Execution failed with exit status: 1
12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task - Obtaining error information
12290 [main] ERROR org.apache.hadoop.hive.ql.exec.Task -
Task failed!
Task ID:
Stage-5
Logs:
12290 [main] ERROR org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask - Execution failed with exit status: 1
12291 [main] ERROR org.apache.hadoop.hive.ql.Driver - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
12291 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=Driver.execute start=1443451615872 end=1443451626264 duration=10392 from=org.apache.hadoop.hive.ql.Driver>
12291 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
12417 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=releaseLocks start=1443451626264 end=1443451626390 duration=126 from=org.apache.hadoop.hive.ql.Driver>
12449 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
12449 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=releaseLocks start=1443451626422 end=1443451626422 duration=0 from=org.apache.hadoop.hive.ql.Driver>
<<< Invocation of Hive command completed <<<
Hadoop Job IDs executed by Hive:
Intercepting System.exit(1)
<<< Invocation of Main class completed <<<
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]
Oozie Launcher failed, finishing Hadoop job gracefully
Oozie Launcher, uploading action data to HDFS sequence file: hdfs://192.168.145.191:8020/user/root/oozie-root/0000037-150925141744978-oozie-root-W/WF9--hive/action-data.seq
Oozie Launcher ends
I am not able to get any useful information from the logs.
I am not sure whether this is related to Hive configuration settings, since I have not set any.
Any pointers would be great.
Thanks,
-Nirmal
________________________________
NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
RE: Unable to run JOIN query from Hive Action in Oozie
Posted by Nirmal Kumar <ni...@impetus.co.in>.
I resolved this by adding the following properties to hive-site.xml:
<property>
<name>hive.optimize.correlation</name>
<value>true</value>
</property>
<property>
<name>hive.auto.convert.join.use.nonstaged</name>
<value>true</value>
</property>
<property>
<name>hive.optimize.bucketmapjoin</name>
<value>true</value>
</property>
<property>
<name>hive.optimize.bucketmapjoin.sortedmerge</name>
<value>true</value>
</property>
<property>
<name>hive.exec.max.created.files</name>
<value>1000000</value>
</property>
<property>
<name>hive.exec.max.dynamic.partitions</name>
<value>100000</value>
</property>
I believe these were needed for some optimization on the Hive side for the JOIN case.
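As a possible alternative to editing hive-site.xml, Hive settings can usually also be given inline in the Hive action's <configuration> element. The following is only a sketch under that assumption: it reuses the action from the original post and repeats two of the properties listed above. Whether the inline values behave the same as those in the job-xml file has not been confirmed in this thread and should be checked on your own Oozie and Hive versions.

```xml
<action name="WF_4">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>xxx.xxx.xxx.xxx:8032</job-tracker>
        <name-node>hdfs://xxx.xxx.xxx.xxx:8020</name-node>
        <configuration>
            <property>
                <name>mapred.job.queue.name</name>
                <value>default</value>
            </property>
            <!-- Assumption: Hive properties set here reach the Hive session
                 in the same way as the hive-site.xml entries above. -->
            <property>
                <name>hive.auto.convert.join.use.nonstaged</name>
                <value>true</value>
            </property>
            <property>
                <name>hive.optimize.bucketmapjoin</name>
                <value>true</value>
            </property>
        </configuration>
        <script>testhql.q</script>
    </hive>
    <ok to="WF_5"/>
    <error to="Failure"/>
</action>
```

Keeping the settings in the workflow definition makes them visible next to the script they affect, at the cost of repeating them in every Hive action that needs them.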
Thanks,
-Nirmal
-----Original Message-----
From: Oussama Chougna [mailto:oussama1983@hotmail.com]
Sent: Tuesday, September 29, 2015 4:13 PM
To: user@oozie.apache.org
Subject: RE: Unable to run JOIN query from Hive Action in Oozie
Ok, try this:
1. Look up the job ID that was assigned to your Oozie action. It should be something like job_XXXXXXX_XXX. The best tool for this is the Hadoop job history server; all jobs are listed there.
2. Next, on the command line on one of your nodes, type "yarn logs -applicationId <job-id from step 1>"
This will print out all logs.
RE: Unable to run JOIN query from Hive Action in Oozie
Posted by Oussama Chougna <ou...@hotmail.com>.
Ok, try this:
1. Look up the job ID that was assigned to your Oozie action. It should be something like job_XXXXXXX_XXX. The best tool for this is the Hadoop job history server; all jobs are listed there.
2. Next, on the command line on one of your nodes, type "yarn logs -applicationId <job-id from step 1>"
This will print out all logs.
> From: nirmal.kumar@impetus.co.in
> To: user@oozie.apache.org
> Subject: RE: Unable to run JOIN query from Hive Action in Oozie
> Date: Tue, 29 Sep 2015 10:27:56 +0000
>
> I am using YARN.
>
> -Nirmal
RE: Unable to run JOIN query from Hive Action in Oozie
Posted by Nirmal Kumar <ni...@impetus.co.in>.
I am using YARN.
-Nirmal
-----Original Message-----
From: Oussama Chougna [mailto:oussama1983@hotmail.com]
Sent: Tuesday, September 29, 2015 3:32 PM
To: user@oozie.apache.org
Subject: RE: Unable to run JOIN query from Hive Action in Oozie
Hi Nirmal,
It is often hard to find useful log info when running Oozie. Which processing framework are you using, MRv1 or YARN?
Best,
Oussama
RE: Unable to run JOIN query from Hive Action in Oozie
Posted by Oussama Chougna <ou...@hotmail.com>.
Hi Nirmal,
It is often hard to find useful log info when running Oozie. Which processing framework are you using, MRv1 or YARN?
Best,
Oussama