Posted to dev@hive.apache.org by "Markovitz, Dudu" <dm...@paypal.com.INVALID> on 2016/08/11 10:40:42 UTC

change job name but keep the stage info

Hi guys

I'm looking for a way to generate a common id for all jobs generated from the same query.
I'm aware of 2 possible options (described below), both of which are somewhat problematic.

Are you aware of a way to achieve this in current/future versions?

Thanks

Dudu




1. Setting the job name:

set mapred.job.name=demo 1;
select count(*) from (select 1) t;

ID:   application_1469828525963_122782<http://lvshdc2en0007.lvs.paypal.com:8088/cluster/app/application_1469828525963_122782>
User: dmarkovitz
Name: demo 1


The downside:

* I'm losing the stage information

2. Adding a comment before the query:

-- demo 2
select count(*) from (select 1) t

ID:   application_1469828525963_122812<http://lvshdc2en0007.lvs.paypal.com:8088/cluster/app/application_1469828525963_122812>
User: dmarkovitz
Name: -- demo 2 select count(*) from (select 1) t(Stage-1)


The downsides:

* The current way the job name is derived from the query text is not guaranteed to stay the same
* It requires adding extra text to every query
* The job name contains undesired text (the prefix of the query)
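
To make the second option concrete, here is a minimal Python sketch of how job names of that form could be grouped back into one entry per query. It assumes the names still end with a "(Stage-N)" suffix as in the listing above, which, as noted, is not guaranteed. The helper names split_job_name and group_by_query are made up for illustration and are not part of Hive or YARN.

```python
import re
from collections import defaultdict

# Assumption: job names end with "(Stage-N)", as in the option 2 listing.
STAGE_SUFFIX = re.compile(r"\((Stage-\d+)\)\s*$")


def split_job_name(name):
    """Return (query_prefix, stage); stage is None when no suffix is present."""
    match = STAGE_SUFFIX.search(name)
    if match is None:
        return name.strip(), None
    return name[:match.start()].strip(), match.group(1)


def group_by_query(job_names):
    """Group stage labels under the shared query prefix of their job names."""
    groups = defaultdict(list)
    for name in job_names:
        prefix, stage = split_job_name(name)
        groups[prefix].append(stage)
    return dict(groups)
```

With the names from the two listings above, "demo 1" ends up in its own group with no stage, while every "-- demo 2 ..." job shares one key. That key is still the comment plus the truncated query text, so this only works around the third downside and does not remove it.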