Posted to dev@orc.apache.org by "Lei Sun (Jira)" <ji...@apache.org> on 2020/03/17 23:34:00 UTC
[jira] [Created] (ORC-613) OrcMapredRecordReader mis-reuse struct
object when actual children schema differs
Lei Sun created ORC-613:
---------------------------
Summary: OrcMapredRecordReader mis-reuse struct object when actual children schema differs
Key: ORC-613
URL: https://issues.apache.org/jira/browse/ORC-613
Project: ORC
Issue Type: Bug
Components: Java
Reporter: Lei Sun
When reading from a schema like the following:
{{uniontype <struct<field0, field1, ..., fieldN>, struct<>> }}
`org.apache.orc.mapreduce.OrcMapreduceRecordReader#nextStruct` decides whether the previously returned object can be reused. This check is flawed: it looks only at the top-level type of the previous object (OrcStruct), not at the schema of that OrcStruct. With a schema like the one above, when a struct at tag_0 is followed by a struct at tag_1, the reader reuses the tag_0 object, which still carries the tag_0 struct schema, and this produces an incorrect result.
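The difference between the two kinds of check can be sketched in isolation. The code below is not the ORC implementation and does not use the ORC API; `StructReuseSketch`, `Struct`, `reuseByClassOnly` and `reuseIfSchemaMatches` are hypothetical names, and a schema is modelled simply as a list of field names. It only illustrates, under those assumptions, why comparing the runtime class alone is not enough when two union branches are both structs.

```java
import java.util.List;

public class StructReuseSketch {

    /** Hypothetical stand-in for a struct value that remembers its own schema. */
    public static final class Struct {
        public final List<String> fieldNames;
        public final Object[] values;

        public Struct(List<String> fieldNames) {
            this.fieldNames = fieldNames;
            this.values = new Object[fieldNames.size()];
        }
    }

    /**
     * The kind of check described in the report: the previous object is
     * reused whenever it is a Struct, regardless of which schema it was
     * built for.
     */
    public static Struct reuseByClassOnly(List<String> schema, Object previous) {
        if (previous == null || previous.getClass() != Struct.class) {
            return new Struct(schema);
        }
        return (Struct) previous;
    }

    /**
     * A stricter check (one possible approach, not necessarily the fix
     * adopted in ORC): reuse only if the previous struct was built for
     * an equal schema, otherwise allocate a fresh one.
     */
    public static Struct reuseIfSchemaMatches(List<String> schema, Object previous) {
        if (previous instanceof Struct
                && ((Struct) previous).fieldNames.equals(schema)) {
            return (Struct) previous;
        }
        return new Struct(schema);
    }
}
```

In this sketch, passing a struct built for the tag_0 schema as `previous` while asking for the empty tag_1 schema makes `reuseByClassOnly` hand back the tag_0 object unchanged, whereas `reuseIfSchemaMatches` allocates a new struct with zero fields.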