You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Sergey Shelukhin <se...@hortonworks.com> on 2018/05/15 04:55:42 UTC

Review Request 67125: HIVE-19418 add background stats updater similar to compactor

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/
-----------------------------------------------------------

Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.


Repository: hive-git


Description
-------

see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.


Diffs
-----

  itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 3d6fda6bd4 
  ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 89129f99fe 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b698c84080 
  ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 0be0aaa10c 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f368 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 48f77b9878 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 264fdb9db9 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ce7d2861dd 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java b223920e82 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java f6899be750 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 98a85cc758 


Diff: https://reviews.apache.org/r/67125/diff/1/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 67125: HIVE-19418 add background stats updater similar to compactor

Posted by Sergey Shelukhin <se...@hortonworks.com>.

> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
> > Lines 557 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022628#file2022628line557>
> >
> >     This will compute basic and column stats. I assume this is what you want.

Yes.


> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2424 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022633#file2022633line2424>
> >
> >     Returning concatenated table Name + Dbname is error prone. Lets make this return List <catName, dbName, tablename>

This seems to be a standard approach in Hive to pass around the full name(s). 
In fact some existing APIs for metastore return catalog+db in one string, and Hive uses full table name all over the place.
I don't think either is error-prone and requires String[]...


> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2425 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022633#file2022633line2425>
> >
> >     Currently it will fetch all tables (acid or not). Is that intentional?

Yes


> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
> > Lines 1642 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022635#file2022635line1642>
> >
> >     Return value should be List<List<CatName, DBname, TblName>>

Same as above... it's ok for Hive to use fq table name everywhere as a string, so it should be ok here


> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
> > Lines 738 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022638#file2022638line738>
> >
> >     I think this timeunit should be in minutes since this task need to run more frequently than that. That will make this config less error-prone.
> >     Default value : 1 hour.

Changed the setting. Note that this wait is the wait until we check the next time if the check was a noop... I don't think it's a good idea to wait for an hour.
It will produce counterintuitive behavior where if tables are updated steadily, after each non-noop run it will find yet another table to update and do the full check, but one small gap will result at nothing being updated for an hour.
This is a throttle for checks, not analyze queries. We may add separate one for queries if desired.


- Sergey


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/#review203694
-----------------------------------------------------------


On May 15, 2018, 4:55 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67125/
> -----------------------------------------------------------
> 
> (Updated May 15, 2018, 4:55 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 3d6fda6bd4 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 89129f99fe 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b698c84080 
>   ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 0be0aaa10c 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f368 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 48f77b9878 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 264fdb9db9 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ce7d2861dd 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java b223920e82 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java f6899be750 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 98a85cc758 
> 
> 
> Diff: https://reviews.apache.org/r/67125/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 67125: HIVE-19418 add background stats updater similar to compactor

Posted by Ashutosh Chauhan <ha...@apache.org>.

> On May 23, 2018, 9:43 p.m., Ashutosh Chauhan wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
> > Lines 2424 (patched)
> > <https://reviews.apache.org/r/67125/diff/1/?file=2022633#file2022633line2424>
> >
> >     Returning concatenated table Name + Dbname is error prone. Lets make this return List <catName, dbName, tablename>
> 
> Sergey Shelukhin wrote:
>     This seems to be a standard approach in Hive to pass around the full name(s). 
>     In fact some existing APIs for metastore return catalog+db in one string, and Hive uses full table name all over the place.
>     I don't think either is error-prone and requires String[]...

I dont agree. Yes, we use single string for dbName + tblName, but we want to change that in future. Motivation is to allow all chars in names (including + and unicode). There in fact is already a jira to do this change. Lets not introduce at one more place in a new code.


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/#review203694
-----------------------------------------------------------


On May 26, 2018, 1:30 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67125/
> -----------------------------------------------------------
> 
> (Updated May 26, 2018, 1:30 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 3d6fda6bd4 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6c56212c9e 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 982b180761 
>   ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 9ab9e85742 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f368 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 5bb1985025 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 13ccdb145e 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ce7d2861dd 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java b223920e82 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java f6899be750 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 98a85cc758 
> 
> 
> Diff: https://reviews.apache.org/r/67125/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 67125: HIVE-19418 add background stats updater similar to compactor

Posted by Ashutosh Chauhan <ha...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/#review203694
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
Lines 66 (patched)
<https://reviews.apache.org/r/67125/#comment285971>

    Is this correct?



ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
Lines 138 (patched)
<https://reviews.apache.org/r/67125/#comment285965>

    Should add a comment along the line of:
    "Security is turned off. So, we can execute with annonymous user."



ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java
Lines 557 (patched)
<https://reviews.apache.org/r/67125/#comment285968>

    This will compute basic and column stats. I assume this is what you want.



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2424 (patched)
<https://reviews.apache.org/r/67125/#comment285972>

    Returning concatenated table Name + Dbname is error prone. Lets make this return List <catName, dbName, tablename>



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
Lines 2425 (patched)
<https://reviews.apache.org/r/67125/#comment285969>

    Currently it will fetch all tables (acid or not). Is that intentional?



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
Lines 1642 (patched)
<https://reviews.apache.org/r/67125/#comment285973>

    Return value should be List<List<CatName, DBname, TblName>>



standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 738 (patched)
<https://reviews.apache.org/r/67125/#comment285970>

    I think this timeunit should be in minutes since this task need to run more frequently than that. That will make this config less error-prone.
    Default value : 1 hour.


- Ashutosh Chauhan


On May 15, 2018, 4:55 a.m., Sergey Shelukhin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67125/
> -----------------------------------------------------------
> 
> (Updated May 15, 2018, 4:55 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.
> 
> 
> Diffs
> -----
> 
>   itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 3d6fda6bd4 
>   ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 89129f99fe 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java b698c84080 
>   ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 0be0aaa10c 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f368 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 48f77b9878 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 264fdb9db9 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ce7d2861dd 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java b223920e82 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
>   standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java f6899be750 
>   standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 98a85cc758 
> 
> 
> Diff: https://reviews.apache.org/r/67125/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>


Re: Review Request 67125: HIVE-19418 add background stats updater similar to compactor

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/
-----------------------------------------------------------

(Updated May 31, 2018, 2:36 a.m.)


Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.


Repository: hive-git


Description
-------

see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.


Diffs (updated)
-----

  itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 0cc0ae5771 
  ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6c56212c9e 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 982b180761 
  ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java be05838ed7 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java d8b8414999 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 5bb1985025 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java b15d89d5c6 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java 283798cd77 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java 9da8d728f9 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java 742b6bf76b 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java 0461c4ee9a 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java b71eda4212 


Diff: https://reviews.apache.org/r/67125/diff/3/

Changes: https://reviews.apache.org/r/67125/diff/2-3/


Testing
-------


Thanks,

Sergey Shelukhin


Re: Review Request 67125: HIVE-19418 add background stats updater similar to compactor

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67125/
-----------------------------------------------------------

(Updated May 26, 2018, 1:30 a.m.)


Review request for hive, Ashutosh Chauhan and Seong (Steve) Yeom.


Repository: hive-git


Description
-------

see jira. This should eventually integrate with ACID stats to determine what stats are out of date, when that is done. Probably in separate jira if this goes in first.


Diffs (updated)
-----

  itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java 3d6fda6bd4 
  ql/src/java/org/apache/hadoop/hive/ql/DriverUtils.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 6c56212c9e 
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 982b180761 
  ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 9ab9e85742 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 92d2e3f368 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java 5bb1985025 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java 13ccdb145e 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java ce7d2861dd 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java b223920e82 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/EnumValidator.java PRE-CREATION 
  standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java 114d5da205 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java f6899be750 
  standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java 98a85cc758 


Diff: https://reviews.apache.org/r/67125/diff/2/

Changes: https://reviews.apache.org/r/67125/diff/1-2/


Testing
-------


Thanks,

Sergey Shelukhin