You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Marta Kuczora <ku...@cloudera.com> on 2017/03/31 12:58:54 UTC

Re: Review Request 55498: HIVE-15538: Test HIVE-13884 with more complex query predicates


> On Jan. 27, 2017, 4:22 p.m., Chaoyu Tang wrote:
> > Thanks for the patch and it looks good. However, I have a couple of questions which need your clarifications. Thanks

Thanks a lot for the review.


> On Jan. 27, 2017, 4:22 p.m., Chaoyu Tang wrote:
> > data/files/max_partition_test_input.txt
> > Lines 1 (patched)
> > <https://reviews.apache.org/r/55498/diff/2/?file=1604753#file1604753line1>
> >
> >     As I remember there was an upstream discussion, we should try to avoid to add a new test data file?

Oh, ok, I didn't know about that. I will change the test to use an already existing data file.


> On Jan. 27, 2017, 4:22 p.m., Chaoyu Tang wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMetaStoreLimitPartitionRequest.java
> > Lines 193 (patched)
> > <https://reviews.apache.org/r/55498/diff/2/?file=1604754#file1604754line193>
> >
> >     How could you tell from the test directly that the query has been falled back to ORM?

I didn't find a way to explicitly check in the test whether or not the query falls back to ORM. I went through the code and found some rules when the direct SQL cannot be used (for example in the MetaStoreDirectSql.visit method). If the operator LIKE is used or the type of the column/value is invalid or they don't match, direct SQL cannot be used.
Also found that if the query falls back to ORM, the number of the partitions can be fetched by the getNumPartitionsViaOrmFilter method or by the getPartitionNamesPrunedByExprNoTxn method. This logic is in the ObjectStore.getNumPartitionsByExpr method:

        if (exprTree != null) {
          try {
            numPartitions = getNumPartitionsViaOrmFilter(ctx.getTable(), exprTree, true);
          } catch (MetaException e) {
            numPartitions = null;
          }
        }

        // if numPartitions could not be obtained from ORM filters, then get number partitions names, and count them
        if (numPartitions == null) {
          List<String> filteredPartNames = new ArrayList<String>();
          getPartitionNamesPrunedByExprNoTxn(ctx.getTable(), tempExpr, "", (short) -1, filteredPartNames);
          numPartitions = filteredPartNames.size();
        }

The number of partitions will be fetched by the getNumPartitionsViaOrmFilter method for example for the "select value from %s where num=30 or num=25" query and by the getPartitionNamesPrunedByExprNoTxn method for the "select value from %s where num between 26 and 31" query.
I checked these rules and then debugged the queries to see if they really go to the expected way.

Do you have any suggestion how I could do this better or how I could check in the test if the query falls back to ORM?


> On Jan. 27, 2017, 4:22 p.m., Chaoyu Tang wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMetaStoreLimitPartitionRequest.java
> > Lines 230 (patched)
> > <https://reviews.apache.org/r/55498/diff/2/?file=1604754#file1604754line230>
> >
> >     HMS filter might support the IN clause, but not the LIKE in query predicate.

Yes, I noticed that the LIKE expression is not pushed to the filter, but I added a test case for it to see if the partition max limit is applied. If you think this test case is not needed, I will remove it.


- Marta


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/55498/#review163279
-----------------------------------------------------------


On Jan. 13, 2017, 3:26 p.m., Marta Kuczora wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/55498/
> -----------------------------------------------------------
> 
> (Updated Jan. 13, 2017, 3:26 p.m.)
> 
> 
> Review request for hive and Chaoyu Tang.
> 
> 
> Bugs: HIVE-15538
>     https://issues.apache.org/jira/browse/HIVE-15538
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Added unit test for testing HIVE-13884 with more complex queries and hive.metastore.limit.partition.request enabled.
> It covers cases when the query predicates can be pushed down and the number of partitions can be retrieved via directSQL.
> It also covers cases when the number of partitions cannot be retrieved via directSQL, so it falls back to ORM.
> 
> 
> Diffs
> -----
> 
>   data/files/max_partition_test_input.txt PRE-CREATION 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestMetaStoreLimitPartitionRequest.java PRE-CREATION 
>   metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 121b825 
> 
> 
> Diff: https://reviews.apache.org/r/55498/diff/2/
> 
> 
> Testing
> -------
> 
> The patch contains only a new unit test. Ran the test multiple times successfully.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>