Posted to user@hive.apache.org by Rajbir singh <ra...@gmail.com> on 2020/03/18 05:43:19 UTC

MapReduce Job name of a Hive Query

Hello,

I have a question about how to set the MapReduce job name for the child
jobs of a Hive query.

Our Hive query creates 9 MapReduce jobs and 17 stages (when I ran the
EXPLAIN command, the output showed 17 stages and the STAGE DEPENDENCIES).
Every child job has the same mapreduce.job.name value.
To distinguish these child jobs, is there any way to set
mapreduce.job.name inside the Hive query so that, for each job, I can see
the stage of the job in the job name?

For example, my Hive query does a group by, an aggregation, and a union,
loads data into a view, drops a view, does more aggregation, and finally
writes the data into a table.

If the job names told us exactly which step each job is performing, it
would be very helpful: we could troubleshoot just by looking at the job name.


For example, in our MapReduce jobs the job name usually states which PTable
is being joined and which key is being emitted, as below:

wfName=testJobName-wf wfId=1961854-200216231040174-oozie-oozi-W
p=TestPipeline mode=Default : [[Avro(/tmp/crunch-1895619502/p9)+Transform
to emit key as
TransactionUid+joinTagLeft]/[Avro(/tmp/crunch-1895619502/p6)+joinTagRight]]+GBK+leftOuterJoinGBK+Get
Parent Transaction Values+Avro(/tmp/crunch-1895619502/p3) ID=9 (9/12)

I would like the same kind of information to be displayed in the names of
the MapReduce jobs launched by the corresponding Hive query.

Thank you
-- 
Regards,
Rajbir