You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by cheng xu <ch...@intel.com> on 2014/10/21 10:13:13 UTC

Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

HIVE-8122: convert ExprNode to Parquet supported FilterPredict


Diffs
-----

  pom.xml 970da97ba2cae8a88e8641e76c88f1bd05c06f36 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterBuilder.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestFilterPredicateFactory.java PRE-CREATION 

Diff: https://reviews.apache.org/r/26968/diff/


Testing
-------

local UT passed


Thanks,

cheng xu


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.

> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > Had a few high level questions. Will take another pass after that. I'd appreciate adding more comments to the code.

Thanks for your review. I added some inline comments below and update the patch.


> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > pom.xml, line 149
> > <https://reviews.apache.org/r/26968/diff/4/?file=751463#file751463line149>
> >
> >     I think we need an actual release version (not a release candidate)

At this point, I only found this RC version which has the filter feature supported. And since the parquet is close to its latest releas, we can change it later.


> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java, lines 132-135
> > <https://reviews.apache.org/r/26968/diff/4/?file=751468#file751468line132>
> >
> >     Could you explain why we need this?
> >     
> >     Wasn't this step already happening in boxLiteral() ?

I look at the code again and find it has been done in getType(ExprNodeDesc expr) method of ExpressionBuilder class. Thanks for pointing out this!


> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java, lines 153-155
> > <https://reviews.apache.org/r/26968/diff/4/?file=751468#file751468line153>
> >
> >     Could you explain why this special handling is needed  ?

Only Integer type needs to be cast into Long type when trying to get the literal list.


> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java, line 1099
> > <https://reviews.apache.org/r/26968/diff/4/?file=751468#file751468line1099>
> >
> >     Don't we still need to return a Long (and not an Integer) for ORC ?
> >     
> >     Also, why not handle Long case here as well ?

This method is used by orc file which needs Long type.


> On Nov. 14, 2014, 12:33 a.m., Mohit Sabharwal wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java, line 765
> > <https://reviews.apache.org/r/26968/diff/4/?file=751469#file751469line765>
> >
> >     Please add assert for leaf.getParquetType() everywhere we are checking leaf.getOrcType()

No need anymore since reserve the previous implement for getType().


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review61267
-----------------------------------------------------------


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml a5f851f31df15660cebef0e4691ea34699c6d1ef 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f8fa316b8da29f8970ed6839c7c2f883fa8d9829 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 
>   serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review61267
-----------------------------------------------------------


Had a few high level questions. Will take another pass after that. I'd appreciate adding more comments to the code.


pom.xml
<https://reviews.apache.org/r/26968/#comment102912>

    I think we need an actual release version (not a release candidate)



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment102920>

    conditional not needed.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment102921>

    log error is filterpredicate is null ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102799>

    please move these to the top. (imports need to be in alphabetized order.)



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102898>

    Could you explain why we need this?
    
    Wasn't this step already happening in boxLiteral() ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102899>

    we should be able to get rid of this duplication by passing FileFormat to this method.



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102856>

    getOrcLiteralList() ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102900>

    Could you explain why this special handling is needed  ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102905>

    should be or() instead of and()
    
    Please add unit test for to catch this case :)



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102906>

    pad else with spaces



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102909>

    These two lines are unrelated to the conditional. Could you move these out ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102922>

    Don't we still need to return a Long (and not an Integer) for ORC ?
    
    Also, why not handle Long case here as well ?



ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102794>

    Please add assert for leaf.getParquetType() everywhere we are checking leaf.getOrcType()



ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment102795>

    Please add assert for leaf.getParquetLiteral() everywhere we are checking leaf.getOrcLiteral()



serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java
<https://reviews.apache.org/r/26968/#comment102791>

    parquet -> ORC



serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java
<https://reviews.apache.org/r/26968/#comment102815>

    Did you consider adding another enum to this interface like
    enum FileFormat {
      ORC,
      PARQUET
    }
    instead of adding getOrc and getParquet methods ?
    
    We can pass FileFormat as an arg to the get methods, i.e:
      getLiteral(FileFormat)
      getType(FileFormat)
      getLiteralList(FileFormat)
    
    IMO, that's extensible and avoids code duplication, for example between getOrcLiteral() and getParquetLiteral().



serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java
<https://reviews.apache.org/r/26968/#comment102792>

    ORC -> parquet


- Mohit Sabharwal


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml a5f851f31df15660cebef0e4691ea34699c6d1ef 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java f8fa316b8da29f8970ed6839c7c2f883fa8d9829 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 
>   serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Brock Noland <br...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review58914
-----------------------------------------------------------


Hi,

I looked into this a little more, and I think that you should actually be able to use SearchArgument. The way you would do this is:

1) Create the SearchArgument object as here:

https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L293


2) Translate SearchArgument to FilterPredict.

Essentially instead of using the UDF classes and column list in FilterPredicateFactory you could use the SearchArgument object.

How does that sound?

- Brock Noland


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml 970da97ba2cae8a88e8641e76c88f1bd05c06f36 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestFilterPredicateFactory.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review61847
-----------------------------------------------------------

Ship it!


Thanks for making the changes!

- Mohit Sabharwal


On Nov. 16, 2014, 3:52 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Nov. 16, 2014, 3:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml 793ea86 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
>   serde/pom.xml 98e5506 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/
-----------------------------------------------------------

(Updated Nov. 16, 2014, 3:52 a.m.)


Review request for hive.


Changes
-------

1. enable more tests about complex type supports
2. clean some code style issues like import
3. enhance the error handlings


Repository: hive-git


Description
-------

HIVE-8122: convert ExprNode to Parquet supported FilterPredict


Diffs (updated)
-----

  pom.xml 793ea86 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
  serde/pom.xml 98e5506 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 

Diff: https://reviews.apache.org/r/26968/diff/


Testing
-------

local UT passed


Thanks,

cheng xu


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.

> On Nov. 16, 2014, 12:49 a.m., Mohit Sabharwal wrote:
> > Thank you for making the changes! I had a few more comments on error handling and readability. But otherwise, looks good to me. Thanks!
> > 
> > Could you please add import statements for all the inner classes? This makes things more readable. For example:
> >  import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf.FileFormat
> >  import static parquet.filter2.predicate.FilterApi.eq
> >  import static parquet.filter2.predicate.FilterApi.intColumn
> > and so on...
> > If you're using Intellij, there is an option to "Insert imports for inner classes". I'm sure there is one in Eclipse as well.

Thank you for your conscientious review. Update the patch according to your comments.


> On Nov. 16, 2014, 12:49 a.m., Mohit Sabharwal wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java, lines 1710-1715
> > <https://reviews.apache.org/r/26968/diff/6/?file=764696#file764696line1710>
> >
> >     Should we add one for date, decimal and timestamp ? -- ones we don't support currently -- just to make sure we're not blowing up.

Thanks for pointing this out. The test cases named in testBuilderComplexTypes* covered it.


> On Nov. 16, 2014, 12:49 a.m., Mohit Sabharwal wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java, lines 125-129
> > <https://reviews.apache.org/r/26968/diff/6/?file=764694#file764694line125>
> >
> >     I assume we are checking READ_COLUMN_NAMES_CONF_STR because this is set only when we do predicate pushdown. Correct ?

Yes, that's right!


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review61668
-----------------------------------------------------------


On Nov. 15, 2014, 2:51 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Nov. 15, 2014, 2:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml 793ea86 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
>   serde/pom.xml 98e5506 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Mohit Sabharwal <mo...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review61668
-----------------------------------------------------------


Thank you for making the changes! I had a few more comments on error handling and readability. But otherwise, looks good to me. Thanks!

Could you please add import statements for all the inner classes? This makes things more readable. For example:
 import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf.FileFormat
 import static parquet.filter2.predicate.FilterApi.eq
 import static parquet.filter2.predicate.FilterApi.intColumn
and so on...
If you're using Intellij, there is an option to "Insert imports for inner classes". I'm sure there is one in Eclipse as well.


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103427>

    please move import up (alphabetical order)



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103429>

    For consistency, could we call it "literals" instead of "constants" ? (since this argument come from PredicateLeaf, where it's called a literal). Calling it by a different name here is confusing.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103430>

    for (Object literal : literals)



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103432>

    This is pretty serious, right ? Should be throwing a RuntimeException here.



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103433>

    instead throw new RuntimeException("Unknown PredicateLeaf Operator type: " + ... )



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment103437>

    build -> Build



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103438>

    please alphabetize



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103439>

    move this below as part of param description



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103440>

    "literal"



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103441>

    same as above; throw expection



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103442>

    same as above; throw expection



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103443>

    same as above; throw expection



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103444>

    same as above; throw expection



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103445>

    same as above; throw expection



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
<https://reviews.apache.org/r/26968/#comment103446>

    Good to log debug statement here: LOG.debug("Conversion to Parquet FilterPredicate not supported for " + type);



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment103450>

    I assume we are checking READ_COLUMN_NAMES_CONF_STR because this is set only when we do predicate pushdown. Correct ?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment103448>

    move to one line



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment103449>

    combine into one line



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103452>

    @Override



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103453>

    @Override



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103454>

    nit: "File format " + format + " is not supported ...



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103455>

    To avoid duplicating it for getOrcLiteral(), let's move this code to a separate method ?



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103458>

    Could you add name of exception in the log line ?
    
    LOG.error("Failed to build predicate filter leaf with errors " + e, e);



ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment103460>

    Should we add one for date, decimal and timestamp ? -- ones we don't support currently -- just to make sure we're not blowing up.


- Mohit Sabharwal


On Nov. 15, 2014, 2:51 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Nov. 15, 2014, 2:51 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml 793ea86 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
>   serde/pom.xml 98e5506 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/
-----------------------------------------------------------

(Updated Nov. 15, 2014, 2:51 a.m.)


Review request for hive.


Changes
-------

Summary:
1. quick fix for the NPE issue


Repository: hive-git


Description
-------

HIVE-8122: convert ExprNode to Parquet supported FilterPredict


Diffs (updated)
-----

  pom.xml 793ea86 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
  serde/pom.xml 98e5506 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 

Diff: https://reviews.apache.org/r/26968/diff/


Testing
-------

local UT passed


Thanks,

cheng xu


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/
-----------------------------------------------------------

(Updated Nov. 14, 2014, 3:55 p.m.)


Review request for hive.


Changes
-------

Summary for the update:
1. code clean especially in TestSearchArgumentImpl
2. do some refactor to aviod duplication codes


Repository: hive-git


Description
-------

HIVE-8122: convert ExprNode to Parquet supported FilterPredict


Diffs (updated)
-----

  pom.xml 793ea86 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java a6a0ec1 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d 
  ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java c91644c 
  serde/pom.xml 98e5506 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6db 
  serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f014 

Diff: https://reviews.apache.org/r/26968/diff/


Testing
-------

local UT passed


Thanks,

cheng xu


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by cheng xu <ch...@intel.com>.

> On Nov. 1, 2014, 8:21 p.m., Brock Noland wrote:
> > serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java, line 58
> > <https://reviews.apache.org/r/26968/diff/2/?file=745523#file745523line58>
> >
> >     Perhaps we can change the Type enum to seperate out the types we need and then alter Orc to perform an type == String || type == CHAR || type VARCHAR?

Thanks for your suggestions, invoking the attribution "FilterPredicateType" will pollute the codes indeed. Update the patch by adding serveral adaptive methods for Parquet and Orc getting literal and types. Besides that, Decimal and Timestamp are still not able to create a FilterPredicate. Currently, it is allowed for Ints, Longs, Strings and Booleans.


- cheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review59500
-----------------------------------------------------------


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml c69498004cdf93d3955c863031858a2dde2d8ccc 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 
>   serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Brock Noland <br...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review59500
-----------------------------------------------------------


Hi,

This approach looks great! I think we should try and avoid creating FilterPredicateType which duplicates Type. We can update Type and make the associated changes in ORC as needed.

Additionally the latest parquet supports Timestamp and Decimal.

Thanks!!


serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java
<https://reviews.apache.org/r/26968/#comment100778>

    Perhaps we can change the Type enum to seperate out the types we need and then alter Orc to perform an type == String || type == CHAR || type VARCHAR?


- Brock Noland


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml c69498004cdf93d3955c863031858a2dde2d8ccc 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 
>   serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>


Re: Review Request 26968: HIVE-8122: convert ExprNode to Parquet supported FilterPredict

Posted by Szehon Ho <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26968/#review59711
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment101014>

    Can we capitalize the first letter of all javadocs in this class (including the top one)?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
<https://reviews.apache.org/r/26968/#comment101013>

    Can we fix the spelling (buildPredicate?)



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/26968/#comment101018>

    Please format the if else (space before and after the parentheses).



ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java
<https://reviews.apache.org/r/26968/#comment101016>

    Please properly format this if-else case.



serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java
<https://reviews.apache.org/r/26968/#comment101015>

    Can we capitalize this?


Some minor formatting comments only.

- Szehon Ho


On Oct. 21, 2014, 8:13 a.m., cheng xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26968/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2014, 8:13 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-8122: convert ExprNode to Parquet supported FilterPredict
> 
> 
> Diffs
> -----
> 
>   pom.xml c69498004cdf93d3955c863031858a2dde2d8ccc 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/RecordReaderImpl.java bdc2806895a4f4f550fc4d7fe64e8f6d56604f86 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f5da46d392d8ac5f5589f66c37d567b1d3bd8843 
>   ql/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgumentImpl.java eeb9641545ed0ad69f3bbc9a8383697fc7efe37d 
>   ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestSearchArgumentImpl.java 831ef8c8ec64c270ef62d5336b4cc78d9e34b398 
>   serde/pom.xml 98e55061b6b3abe18030b0b8d3f511bd98bee5f7 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.java 616c6dbd1ec71ad178f41e8666bad2500e68e151 
>   serde/src/java/org/apache/hadoop/hive/ql/io/sarg/SearchArgument.java db0f0148e2a995534a4c1369fc4c542cd0b4e6ab 
> 
> Diff: https://reviews.apache.org/r/26968/diff/
> 
> 
> Testing
> -------
> 
> local UT passed
> 
> 
> Thanks,
> 
> cheng xu
> 
>