Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2021/07/09 21:08:34 UTC

[GitHub] [parquet-mr] shangxinli opened a new pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

shangxinli opened a new pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918


   Make sure you have checked _all_ steps below.
   
   ### Jira
   
   - [ ] My PR addresses the following [Parquet Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references them in the PR title. For example, "PARQUET-1234: My Parquet PR"
     - https://issues.apache.org/jira/browse/PARQUET-XXX
     - In case you are adding a dependency, check if the license complies with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes how to use it.
     - All the public functions and the classes in the PR contain Javadoc that explain what it does
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@parquet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [parquet-mr] shangxinli commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
shangxinli commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-877865250


   Thanks, Chao, for your comments. Can you make a PR for [PARQUET-2061](https://issues.apache.org/jira/browse/PARQUET-2061)? I am thinking we can add a new API that returns the RowRanges directly, and use this PR to make Range publicly accessible so that your API can use this class.
   
   @gszadovszky Do you think it makes sense? If yes, I am going to fix this PR.





[GitHub] [parquet-mr] shangxinli commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
shangxinli commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-881969803


   @gszadovszky I just made the change, but japicmp reported the build error `METHOD_RETURN_TYPE_CHANGED`. I guess we need to change something to let japicmp know that the `METHOD_RETURN_TYPE_CHANGED` is intentional.





[GitHub] [parquet-mr] shangxinli commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
shangxinli commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-892720333


   Thank you so much @gszadovszky! Sorry, I didn't get time to check your request yesterday. I see INFRA-22171 is moving forward. Let's see.





[GitHub] [parquet-mr] shangxinli commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
shangxinli commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-891051718


   Agree. Let's keep it internal for now. 





[GitHub] [parquet-mr] gszadovszky commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-890851073


   @shangxinli, when I said we should move the class `RowRanges` out of the internal package, I did not realize it would be a breaking change from a Java API point of view. japicmp is right, and it is not easy to suppress such errors. We could exclude the whole rule or the related class from the check, but that might let future changes go uncaught.
   The easiest and most future-proof solution is to leave the class `RowRanges` in the internal package. (Sorry for the confusion.) It is not the best solution because it might be misleading to the API clients, but I don't have a better one.
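
A minimal sketch of why moving a returned class to another package counts as a changed return type: the JVM identifies a method by a descriptor that embeds the fully qualified name of its return type, so code compiled against the old location no longer links. The nested holder classes `InternalPkg` and `PublicPkg` below are hypothetical stand-ins for two packages, and neither `RowRanges` here is the real parquet-mr class.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo: nested holders play the role of two different packages.
class DescriptorDemo {
    static class InternalPkg { static class RowRanges {} }
    static class PublicPkg { static class RowRanges {} }

    // Builds the JVM descriptor of a no-argument method returning the given type.
    static String descriptorOf(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // The two descriptors differ even though the simple class name is the same,
        // which is the kind of difference a binary-compatibility check reports.
        System.out.println(descriptorOf(InternalPkg.RowRanges.class));
        System.out.println(descriptorOf(PublicPkg.RowRanges.class));
    }
}
```

Because the descriptor is part of the compiled call site, callers built against the old return type would need recompiling, which is consistent with japicmp flagging the move.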





[GitHub] [parquet-mr] gszadovszky commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-892418311


   @shangxinli, I've created the jira INFRA-22171 about the Travis failures. Since the Travis test is for arm64 validation and most of the changes should be platform independent, we can ignore these failures and step forward with the PRs.








[GitHub] [parquet-mr] sunchao commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-877658152


   I also opened https://issues.apache.org/jira/browse/PARQUET-2061 for similar purpose.





[GitHub] [parquet-mr] gszadovszky commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-892435443


   Re-triggering CI...





[GitHub] [parquet-mr] shangxinli merged pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
shangxinli merged pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918


   





[GitHub] [parquet-mr] gszadovszky closed pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky closed pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918


   





[GitHub] [parquet-mr] gszadovszky commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-878037608


   @shangxinli, unfortunately we still struggle with not having a proper low-level API for our clients. We did not add the ranges and such to the public API because they should not be required there. Since both Spark and Hive use lower-level APIs (that were not originally designed to be public), I don't think we have any other choice for now but to make all the necessary classes/methods public.
   
   Meanwhile, I've added `RowRanges` to the package `org.apache.parquet.internal.filter2.columnindex` (note `internal`) to make it clear that even though the class is public it is not for our clients. So, if we really want to make this public we also need to move it to another package.





[GitHub] [parquet-mr] sunchao commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
sunchao commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-878420122


   @shangxinli sure I can make a PR after this one is done - it depends on making the `Range` class public.
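
For illustration only, a simplified sketch of what "making the `Range` class public" could look like: a public nested class with read-only bounds, exposed through an accessor on the enclosing class. `RowRangesSketch`, `add`, `getRanges`, and `rowCount` are hypothetical names chosen for this example and are not taken from the parquet-mr source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for a row-range container.
class RowRangesSketch {
    // Public nested type so callers outside the package can read the bounds.
    public static final class Range {
        public final long from; // first row index, inclusive
        public final long to;   // last row index, inclusive

        Range(long from, long to) {
            if (from > to) {
                throw new IllegalArgumentException("from must not exceed to");
            }
            this.from = from;
            this.to = to;
        }

        // Number of rows covered by this range.
        public long count() {
            return to - from + 1;
        }
    }

    private final List<Range> ranges = new ArrayList<>();

    public void add(long from, long to) {
        ranges.add(new Range(from, to));
    }

    // Read-only view so clients cannot mutate the internal list.
    public List<Range> getRanges() {
        return Collections.unmodifiableList(ranges);
    }

    // Total number of rows across all ranges.
    public long rowCount() {
        long total = 0;
        for (Range r : ranges) {
            total += r.count();
        }
        return total;
    }
}
```

Keeping the constructor package-private while the class and its fields are public lets clients inspect ranges without being able to create them, which is one possible way to limit the exposed surface.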





[GitHub] [parquet-mr] gszadovszky commented on pull request #918: PARQUET-2064: Make Range public accessible in RowRanges

Posted by GitBox <gi...@apache.org>.
gszadovszky commented on pull request #918:
URL: https://github.com/apache/parquet-mr/pull/918#issuecomment-895317831


   @shangxinli, the infra ticket is solved now and everything seems to be working fine. The only failing check here is due to the fact that this PR existed before the fix. We have another entry for Travis CI that is passing, so we can ignore the failing one.
   
   Feel free to push this.
   (This fix seems to be required for a patch release, so it should also be backported to the parquet-1.12.x branch.)

