You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Vaibhav Gumashta <vg...@hortonworks.com> on 2015/04/20 20:44:37 UTC
Review Request 33367: Aggregate stats cache for RDBMS based metastore
codepath
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/
-----------------------------------------------------------
Review request for hive.
Bugs: HIVE-10382
https://issues.apache.org/jira/browse/HIVE-10382
Repository: hive-git
Description
-------
Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
Diffs
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
Diff: https://reviews.apache.org/r/33367/diff/
Testing
-------
Thanks,
Vaibhav Gumashta
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81119
-----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131370>
this is not being set to false, which means the cleaner would run only once.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131371>
For tracking how the cache is performing, would be useful to have an INFO level message about how many entries were there and how many were removed due to expiry and if the eviction based on LRU is going to be triggered.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131372>
evicting one LRU node at a time is expensive.
I think we should just reduce the TTL to 0.9*TTL , 0.8*TTL etc and call this function again. Can be done in a follow up jira.
Ideally, in the long term, we should think of using both the frequency of use and cost of re-computing the stats while deciding which ones to evict.
- Thejas Nair
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81714
-----------------------------------------------------------
Ship it!
Ship It!
- Thejas Nair
On April 27, 2015, 6:34 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 27, 2015, 6:34 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
> ql/src/test/queries/clientpositive/annotate_stats_part.q fcfe566
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/
-----------------------------------------------------------
(Updated April 27, 2015, 6:34 p.m.)
Review request for hive.
Bugs: HIVE-10382
https://issues.apache.org/jira/browse/HIVE-10382
Repository: hive-git
Description
-------
Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
Diffs (updated)
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
ql/src/test/queries/clientpositive/annotate_stats_part.q fcfe566
Diff: https://reviews.apache.org/r/33367/diff/
Testing
-------
Thanks,
Vaibhav Gumashta
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Vaibhav Gumashta <vg...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/
-----------------------------------------------------------
(Updated April 26, 2015, 8:44 p.m.)
Review request for hive.
Bugs: HIVE-10382
https://issues.apache.org/jira/browse/HIVE-10382
Repository: hive-git
Description
-------
Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
Diffs (updated)
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
Diff: https://reviews.apache.org/r/33367/diff/
Testing
-------
Thanks,
Vaibhav Gumashta
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81170
-----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131427>
Can you keeep track of hit ratio?
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131425>
Can you add logging on how many nodes where cleaned in how long.
Also print number of nodes before/after
- Mostafa Mokhtar
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Mostafa Mokhtar <mm...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81127
-----------------------------------------------------------
common/src/java/org/apache/hive/common/util/BloomFilter.java
<https://reviews.apache.org/r/33367/#comment131382>
There already is a BloomFilter implementaion in org.apache.hadoop.hive.ql.io.filters.
Can this be used instead of the new one?
- Mostafa Mokhtar
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81122
-----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131375>
i would expect this map entry to be non null for vast majority of this call done from here.
Creating a new (arraylist) object is usually considered expensive, so i feel it would better to error on the side of having to make an additional put call.
metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
<https://reviews.apache.org/r/33367/#comment131379>
looking up in the db separately for each column will be expensive. We should make a single call for columns not in cache and then create the aggregated results.
Again a perf improvement that we can track in a followup jira.
- Thejas Nair
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Thejas Nair <th...@hortonworks.com>.
> On April 22, 2015, 3:47 a.m., Thejas Nair wrote:
> > metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java, line 489
> > <https://reviews.apache.org/r/33367/diff/1/?file=935572#file935572line489>
> >
> > this is not being used anywhere, can be removed.
never mind, i see its used in evictOne
- Thejas
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81115
-----------------------------------------------------------
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>
Re: Review Request 33367: Aggregate stats cache for RDBMS based
metastore codepath
Posted by Thejas Nair <th...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33367/#review81115
-----------------------------------------------------------
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131359>
I don't think this comment is applicable.
From http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html -
The memory effects for accesses and updates of atomics generally follow the rules for volatiles, as stated in section 17.4 of The Java™ Language Specification.
get has the memory effects of reading a volatile variable.
set has the memory effects of writing (assigning) a volatile variable.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131364>
I don't think we really need the locks at the level of candidate list, it can be made a more finer lock by using ConcurrentLinkedQueue or something similar.
The only mutable part of AggrColStatsCached is already stored in a volatile member.
Can you please open a follow up jira to explore that ?
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131360>
match has to be null if this exception is thrown. (unnecessary also is unintuitive.)
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131361>
can we just skip these instead of adding them as potential candidates ?
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131362>
as discussed offline, there is potential for improving the performance here by avoiding two loops. That can be done in a follow up jira.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131358>
spawnCleaner() or startCleaner() might be a better name.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131363>
we should give this thread a name (for ease of debugging).
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131367>
this is not being used anywhere, can be removed.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131368>
the name of this class is too similar to the outer class. I feel it would better to name it just AggrColStats or so
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131366>
the time is already being updated from findBestMatch, so this isn't necessary.
metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java
<https://reviews.apache.org/r/33367/#comment131365>
lets removed these unused classes.
- Thejas Nair
On April 20, 2015, 6:44 p.m., Vaibhav Gumashta wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33367/
> -----------------------------------------------------------
>
> (Updated April 20, 2015, 6:44 p.m.)
>
>
> Review request for hive.
>
>
> Bugs: HIVE-10382
> https://issues.apache.org/jira/browse/HIVE-10382
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Similar to the work done on the HBase branch (HIVE-9693), the stats cache can potentially have performance gains.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65ec1b9
> common/src/java/org/apache/hive/common/util/BloomFilter.java PRE-CREATION
> common/src/java/org/apache/hive/common/util/Murmur3.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestBloomFilter.java PRE-CREATION
> common/src/test/org/apache/hive/common/util/TestMurmur3.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/AggregateStatsCache.java PRE-CREATION
> metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java bf169c9
> metastore/src/test/org/apache/hadoop/hive/metastore/TestAggregateStatsCache.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilter.java 6ab0270
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/BloomFilterIO.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/filters/Murmur3.java e733892
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/FileDump.java 7bfd781
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java 49a8e80
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bde9fc2
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java a319204
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestBloomFilter.java 32b95ab
> ql/src/test/org/apache/hadoop/hive/ql/io/filters/TestMurmur3.java d92a3ce
> ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestRecordReaderImpl.java d0f3a5e
>
> Diff: https://reviews.apache.org/r/33367/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Vaibhav Gumashta
>
>