
Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/01/26 07:24:58 UTC

[GitHub] [orc] pavibhai commented on a change in pull request #634: ORC-741: Schema Evolution missing column not handled in the presence of filters

pavibhai commented on a change in pull request #634:
URL: https://github.com/apache/orc/pull/634#discussion_r564295072



##########
File path: java/core/src/java/org/apache/orc/impl/RecordReaderImpl.java
##########
@@ -104,17 +104,10 @@
    */
   static int findColumns(SchemaEvolution evolution,

Review comment:
       The behavior change we made to this API is as follows:
   * If the column is missing in readSchema, then throw a missing-column exception
   * If the column is missing in the file but present in readSchema, then we treat it as a column missing in the file, handled via NullReader
   
   I wanted to confirm that the ask is to not throw an exception when Hive/Spark use a column name that is missing in the readSchema. I might be missing the scenario where this is the case; I was under the impression that column references should always be valid within the readSchema.
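
   To make the two cases concrete, here is a minimal sketch of the lookup rule described above. It is not the actual findColumns implementation: the class ColumnLookupSketch, the method findColumn, the MISSING_IN_FILE constant, and the flat name lists standing in for SchemaEvolution are all hypothetical and used only for illustration.

```java
import java.util.List;

// Hypothetical stand-in for the lookup rule; not the ORC API.
public class ColumnLookupSketch {
    // Assumed marker for "present in readSchema, absent from the file".
    // In the real reader such a column would be served by a null-producing reader.
    static final int MISSING_IN_FILE = -1;

    static int findColumn(List<String> readSchema, List<String> fileSchema, String name) {
        // Case 1: the reference is not valid within the read schema, so fail loudly.
        if (!readSchema.contains(name)) {
            throw new IllegalArgumentException(
                "Column " + name + " not found in read schema");
        }
        // Case 2: valid in the read schema; report the file position,
        // or the marker when the file does not contain the column.
        int idx = fileSchema.indexOf(name);
        return idx >= 0 ? idx : MISSING_IN_FILE;
    }

    public static void main(String[] args) {
        List<String> readSchema = List.of("id", "name", "added_later");
        List<String> fileSchema = List.of("id", "name");

        System.out.println(findColumn(readSchema, fileSchema, "name"));
        System.out.println(findColumn(readSchema, fileSchema, "added_later"));
        try {
            findColumn(readSchema, fileSchema, "unknown");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

   The open question in this comment is whether the first branch should stay an exception or be relaxed for callers such as Hive/Spark.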



