Posted to issues@hive.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2015/04/10 18:36:13 UTC

[jira] [Comment Edited] (HIVE-10291) Hive on Spark job configuration needs to be logged [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-10291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14489887#comment-14489887 ] 

Xuefu Zhang edited comment on HIVE-10291 at 4/10/15 4:36 PM:
-------------------------------------------------------------

Thanks for the explanation. I guess the perf impact is limited to the moment a job is submitted, which I don't think is a big deal. Also, since we are logging it at INFO (DEBUG?) level, we should be okay.

One minor consideration: we should check the log level before dumping the configuration into a string, so that there is no perf impact when the logging level is set higher than INFO.
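
A minimal sketch of that guard, assuming the commons-logging Log used elsewhere in Hive (dumpJobConf is a hypothetical helper, not part of the attached patch):

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;

    private static final Log LOG = LogFactory.getLog(RemoteDriver.class);

    // Build the (potentially large) configuration string only when it will
    // actually be written, so deployments logging at WARN or above pay nothing.
    if (LOG.isInfoEnabled()) {
      LOG.info("Job configuration:\n" + dumpJobConf(jobConf));
    }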


was (Author: xuefuz):
Thanks for the explanation. I guess the perf impact is limited to the moment a job is submitted, which I don't think is a big deal. Also, since we are logging it at INFO level, we should be okay.

One minor consideration: we should check the log level before dumping the configuration into a string, so that there is no perf impact when the logging level is set higher than INFO.

> Hive on Spark job configuration needs to be logged [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-10291
>                 URL: https://issues.apache.org/jira/browse/HIVE-10291
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-10291-spark.patch, HIVE-10291.2-spark.patch
>
>
> In a Hive on MR job, all the job properties are put into the JobConf, which can then be viewed via the MR2 HistoryServer's Job UI.
> However, in Hive on Spark we submit a single long-lived application. Hence, we put into the SparkConf only the properties relevant to application submission (Spark and YARN properties), and only these are viewable through the Spark HistoryServer Application UI.
> It is the Hive application code (RemoteDriver, aka RemoteSparkContext) that is responsible for serializing and deserializing the job.xml per job (i.e., per query) within the application. Thus, for supportability, we also need an equivalent mechanism to print the job.xml per job.
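
A hedged sketch of what such a per-job dump could look like, assuming only the standard Hadoop Configuration API (the helper name dumpJobConf is hypothetical and not necessarily what the attached patches use):

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;

    // Render every property of the per-job configuration into one string,
    // analogous to the job.xml that the MR2 HistoryServer exposes.
    static String dumpJobConf(Configuration conf) {
      StringBuilder sb = new StringBuilder();
      for (Map.Entry<String, String> e : conf) {  // Configuration is Iterable
        sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
      }
      return sb.toString();
    }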


