Posted to dev@orc.apache.org by "Lei Sun (Jira)" <ji...@apache.org> on 2020/03/17 23:34:00 UTC

[jira] [Created] (ORC-613) OrcMapredRecordReader mis-reuse struct object when actual children schema differs

Lei Sun created ORC-613:
---------------------------

             Summary: OrcMapredRecordReader mis-reuse struct object when actual children schema differs
                 Key: ORC-613
                 URL: https://issues.apache.org/jira/browse/ORC-613
             Project: ORC
          Issue Type: Bug
          Components: Java
            Reporter: Lei Sun


When reading from a schema like the following:

{{uniontype <struct<field0, field1, ..., fieldN>, struct<>> }}

`org.apache.orc.mapreduce.OrcMapreduceRecordReader#nextStruct` decides whether the previous object can be reused. This check is flawed: it only tests the top-level type (OrcStruct) and does not compare the schema of the OrcStruct. Therefore, with a schema like the one above, when a struct at tag_0 is processed and is followed by a struct at tag_1, the reader reuses the tag_0 struct together with its schema, which produces an incorrect result.
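
The reuse problem described above can be illustrated with a minimal, self-contained sketch. It does not use the ORC classes: `Schema`, `Struct`, `reuseByTypeOnly` and `reuseIfSchemaMatches` are hypothetical stand-ins invented for illustration, and the schema-aware check is only one possible way to address the issue, not necessarily the fix adopted in ORC.

```java
import java.util.Arrays;
import java.util.List;

public class StructReuseSketch {

    // Hypothetical stand-in for a struct schema: just the ordered field names.
    public static final class Schema {
        final List<String> fieldNames;

        public Schema(String... names) {
            this.fieldNames = Arrays.asList(names);
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Schema && ((Schema) o).fieldNames.equals(fieldNames);
        }

        @Override
        public int hashCode() {
            return fieldNames.hashCode();
        }
    }

    // Hypothetical stand-in for OrcStruct: a schema plus one value slot per field.
    public static final class Struct {
        final Schema schema;
        final Object[] values;

        public Struct(Schema schema) {
            this.schema = schema;
            this.values = new Object[schema.fieldNames.size()];
        }
    }

    // Reuse decision as described in the report: only the top-level type is checked.
    public static Struct reuseByTypeOnly(Object previous, Schema expected) {
        if (previous instanceof Struct) {
            return (Struct) previous; // may carry a different schema
        }
        return new Struct(expected);
    }

    // Assumed alternative: reuse only when the previous struct's schema also matches.
    public static Struct reuseIfSchemaMatches(Object previous, Schema expected) {
        if (previous instanceof Struct && ((Struct) previous).schema.equals(expected)) {
            return (Struct) previous;
        }
        return new Struct(expected);
    }

    public static void main(String[] args) {
        Schema tag0 = new Schema("field0", "field1"); // struct<field0, field1>
        Schema tag1 = new Schema();                   // struct<>

        Struct previous = new Struct(tag0);           // a tag_0 row was read first

        // The next row is a tag_1 struct, so zero fields are expected.
        Struct typeOnly = reuseByTypeOnly(previous, tag1);
        Struct schemaAware = reuseIfSchemaMatches(previous, tag1);

        System.out.println(typeOnly.values.length);    // 2: the tag_0 layout leaked through
        System.out.println(schemaAware.values.length); // 0: a fresh struct for tag_1
    }
}
```

With the type-only check, the object returned for the tag_1 row still has the tag_0 field layout, which matches the incorrect result reported here.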



--
This message was sent by Atlassian Jira
(v8.3.4#803005)