You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Ashutosh Chauhan <ha...@apache.org> on 2013/06/03 23:03:15 UTC

Re: Review Request: HIVE-4513 - disable hivehistory logs by default

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review21352
-----------------------------------------------------------



data/conf/hive-site.xml
<https://reviews.apache.org/r/11029/#comment44263>

    Is there a reason for this to be set to true for tests? Unless there is, we should set config in tests to the default values, since we should test default configs.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44264>

    doesn't read right. I guess you wanted 
    ... statistics into a file.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44266>

    This is existing comment which doesnt read right. But since we are doing major surgery on HiveHistory, it will be good to update to make it more sensible.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44268>

    I think word job is not required in this comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44269>

    I think query is a better word than job here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44270>

    Better worded as
    Called at the end of query.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44271>

    Again use of word job is confusing, we shall use query here as well.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44272>

    Incorrect comment.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java
<https://reviews.apache.org/r/11029/#comment44274>

    Function name is IdtoTable, but comment says table to id. One of this needs to be corrected.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44275>

    Similar comment as in HiveHistory.java



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44277>

    Should this be hive.ql.exec.HiveHistoryImpl to avoid confusion?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44278>

    and instead of an ?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44280>

    In case of incorrect config, should this throw an exception instead of silent return, otherwise there will be errors later when something is tried to be written in history file.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44281>

    Same comment as above.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44283>

    This should be static class variable, otherwise nextInt() will return same value for each invocation.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44284>

    Instead of / we shall use File.Seprator



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44287>

    Consider using File.createNewFile here.



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44288>

    Use  System.getProperty("line.separator") instead of \n



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java
<https://reviews.apache.org/r/11029/#comment44289>

    start of query ?



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java
<https://reviews.apache.org/r/11029/#comment44291>

    Missing apache header



ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java
<https://reviews.apache.org/r/11029/#comment44292>

    HiveHistoryViewer.class


- Ashutosh Chauhan


On May 13, 2013, 10:12 p.m., Thejas Nair wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11029/
> -----------------------------------------------------------
> 
> (Updated May 13, 2013, 10:12 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Description
> -------
> 
> HiveHistory log files (hive_job_log_hive_*.txt files) store information about hive query such as query string, plan , counters and MR job progress information.
> 
> There is no mechanism to delete these files and as a result they get accumulated over time, using up lot of disk space. 
> I don't think this is used by most people, so I think it would better to turn this off by default. Jobtracker logs already capture most of this information, though it is not as structured as history logs.
> 
> The change :
> A new config parameter hive.session.history.enabled controls if the history-log is enabled. By default it is set to false.
> SessionState initializes the HiveHIstory object. When this config is set to false, it creates a Proxy object that does not do anything. I did this instead of having SessionState return null, because that would add null checks in too many places. This keeps the code cleaner and avoids possibility of NPE.
> As the proxy only works against interfaces, i created a HiveHistory interface, moved the implementation to HiveHistoryImpl. static functions were moved to HiveHistoryUtil .
> 
> 
> This addresses bug HIVE-4513.
>     https://issues.apache.org/jira/browse/HIVE-4513
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1672453 
>   conf/hive-default.xml.template 3a7d1dc 
>   data/conf/hive-site.xml 544ba35 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java e1c1ae3 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java fdd56db 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 3d43451 
>   ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 
> 
> Diff: https://reviews.apache.org/r/11029/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Thejas Nair
> 
>


Re: Review Request 11029: HIVE-4513 - disable hivehistory logs by default

Posted by Thejas Nair <th...@hortonworks.com>.

> On June 3, 2013, 9:03 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java, line 86
> > <https://reviews.apache.org/r/11029/diff/2/?file=290954#file290954line86>
> >
> >     In case of incorrect config, should this throw an exception instead of silent return, otherwise there will be errors later when something is tried to be written in history file.

Errors will not be there later, as it does not write if the histStream has not been initialized. 
I don't think we should fail the query just because hive history logging failed. This is also current behavior in hive.


- Thejas


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11029/#review21352
-----------------------------------------------------------


On Aug. 10, 2013, 4:24 p.m., Thejas Nair wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11029/
> -----------------------------------------------------------
> 
> (Updated Aug. 10, 2013, 4:24 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-4513
>     https://issues.apache.org/jira/browse/HIVE-4513
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HiveHistory log files (hive_job_log_hive_*.txt files) store information about hive query such as query string, plan , counters and MR job progress information.
> 
> There is no mechanism to delete these files and as a result they get accumulated over time, using up lot of disk space. 
> I don't think this is used by most people, so I think it would better to turn this off by default. Jobtracker logs already capture most of this information, though it is not as structured as history logs.
> 
> The change :
> A new config parameter hive.session.history.enabled controls if the history-log is enabled. By default it is set to false.
> SessionState initializes the HiveHIstory object. When this config is set to false, it creates a Proxy object that does not do anything. I did this instead of having SessionState return null, because that would add null checks in too many places. This keeps the code cleaner and avoids possibility of NPE.
> As the proxy only works against interfaces, i created a HiveHistory interface, moved the implementation to HiveHistoryImpl. static functions were moved to HiveHistoryUtil .
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 83f337b 
>   conf/hive-default.xml.template 0a6e433 
>   hbase-handler/src/test/templates/TestHBaseCliDriver.vm c59e882 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java 97436c5 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryImpl.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryProxyHandler.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryUtil.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java fdd56db 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java ab369f0 
>   ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java a783303 
>   ql/src/test/templates/TestCliDriver.vm a6ae6c3 
> 
> Diff: https://reviews.apache.org/r/11029/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Thejas Nair
> 
>