Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2020/10/01 06:07:04 UTC

[GitHub] [orc] dongjoon-hyun opened a new pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

dongjoon-hyun opened a new pull request #547:
URL: https://github.com/apache/orc/pull/547


   ### What changes were proposed in this pull request?
   
   This PR aims to reduce two breaking changes in ReaderImpl.java at Apache ORC 1.6.x.
   
   ### Why are the changes needed?
   
   This helps Apache Hive and Spark work with Apache ORC 1.6.x.
   
   ### How was this patch tested?
   
   Manually reviewed, because this only adds back a field and changes the visibility of a field to match ORC 1.5.x.
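
A minimal sketch of why a field's presence and visibility matter to downstream code, and how such a change could be checked mechanically instead of by manual review. Everything here is hypothetical: `LibraryReader`, `DownstreamReader`, and the `path` field are illustrative stand-ins, not the actual fields or classes touched by this PR.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for a library class; NOT the real ReaderImpl.
class LibraryReader {
    // Downstream subclasses read this field directly, so removing it or
    // making it private would break them at compile or link time.
    protected final String path;

    LibraryReader(String path) {
        this.path = path;
    }
}

// Hypothetical downstream subclass that depends on the inherited field.
class DownstreamReader extends LibraryReader {
    DownstreamReader(String path) {
        super(path);
    }

    String describe() {
        return "reading " + path;
    }
}

class FieldCompatCheck {
    /** Returns true if cls declares fieldName as protected or public. */
    static boolean isVisibleToSubclasses(Class<?> cls, String fieldName) {
        try {
            Field f = cls.getDeclaredField(fieldName);
            int m = f.getModifiers();
            return Modifier.isProtected(m) || Modifier.isPublic(m);
        } catch (NoSuchFieldException e) {
            // The field was removed entirely.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isVisibleToSubclasses(LibraryReader.class, "path"));
        System.out.println(isVisibleToSubclasses(LibraryReader.class, "missing"));
        System.out.println(new DownstreamReader("/tmp/a.orc").describe());
    }
}
```

A check like this could run as a unit test in the library so that a later refactoring does not silently drop the field again.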


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] omalley commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
omalley commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702866052


   done





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x shows a different result compared with 1.5.x)
   - `OrcTail` API change.





[GitHub] [orc] dongjoon-hyun commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-701910054


   Could you review this, @omalley ?





[GitHub] [orc] dongjoon-hyun commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702878435


   Thank you so much!





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x returns more rows compared with 1.5.x)
   - `OrcTail` API change.





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x returns more rows compared with 1.5.x). Although Spark can filter out the extra rows, this may cause a performance regression.
   - `OrcTail` API change.





[GitHub] [orc] dongjoon-hyun commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702809829


   Thank you for merging, @omalley . The above two are not used in Spark's `sql` module, but they are used by the Hive library in Spark's `hive` module.





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x returns more rows compared with 1.5.x). Although the extra rows can be filtered out on the Spark side, this may cause a performance regression.
   - `OrcTail` API change.
   
   This will help Apache Hive eventually too.
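
The performance concern about predicate handling can be illustrated with a rough sketch. It assumes a simplified model in which the file reader applies a pushed-down predicate on a best-effort basis and the query engine re-applies the same predicate afterwards. `PushdownSketch`, `read`, and `engineFilter` are hypothetical names for illustration only; they are not ORC or Spark APIs, and the sketch does not model the actual case-insensitive handling in ORC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class PushdownSketch {
    /**
     * Hypothetical reader. When honorsPredicate is false, the pushed-down
     * predicate is ignored and every row is returned to the caller.
     */
    static List<String> read(List<String> rows, Predicate<String> pushed,
                             boolean honorsPredicate) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (!honorsPredicate || pushed.test(row)) {
                out.add(row);
            }
        }
        return out;
    }

    /** Engine-side filter that is always applied after the read. */
    static List<String> engineFilter(List<String> rows, Predicate<String> p) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            if (p.test(row)) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "a", "c");
        Predicate<String> isA = "a"::equals;

        List<String> tight = read(data, isA, true);   // reader drops non-matching rows
        List<String> loose = read(data, isA, false);  // reader returns everything

        // The final answers agree, but the loose read hands more rows to the engine.
        System.out.println(tight.size() + " vs " + loose.size() + " rows read");
        System.out.println(engineFilter(tight, isA).equals(engineFilter(loose, isA)));
    }
}
```

In this model the final result is the same either way; the cost shows up only as extra rows crossing the reader boundary, which is the kind of regression the comment describes.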





[GitHub] [orc] dongjoon-hyun commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702810181


   If you don't mind, please land this to `branch-1.6`, too.





[GitHub] [orc] dongjoon-hyun commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x shows a different result compared with 1.5.x)
   - `OrcTail` API change.





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x returns more results compared with 1.5.x)
   - `OrcTail` API change.





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702327136


   @omalley and @alanfgates .
   
   I'm trying to make Apache Spark 3.1 (scheduled for December 2020) use Apache ORC 1.6.6. We need at least:
   - ORC-669. Reduce breaking changes in ReaderImpl.java (this PR)
   - [[SPARK-33047][BUILD] Upgrade hive-storage-api to 2.7.2](https://github.com/apache/spark/pull/29923)
   
   The other things I'm looking at are the following:
   - ORC's case-insensitive predicate handling change (1.6.x returns more rows compared with 1.5.x). Although Spark can filter out the extra rows, this may cause a performance regression.
   - `OrcTail` API change.
   
   This will help Apache Hive eventually too.





[GitHub] [orc] omalley commented on pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
omalley commented on pull request #547:
URL: https://github.com/apache/orc/pull/547#issuecomment-702806127


   Is Spark using the details of ReaderImpl?





[GitHub] [orc] omalley merged pull request #547: ORC-669. Reduce breaking changes in ReaderImpl.java

Posted by GitBox <gi...@apache.org>.
omalley merged pull request #547:
URL: https://github.com/apache/orc/pull/547


   

