Posted to issues@hive.apache.org by "Alessandro Solimando (Jira)" <ji...@apache.org> on 2022/05/10 09:35:00 UTC

[jira] [Commented] (HIVE-26150) OrcRawRecordMerger reads each row twice

    [ https://issues.apache.org/jira/browse/HIVE-26150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534265#comment-17534265 ] 

Alessandro Solimando commented on HIVE-26150:
---------------------------------------------

I tried a few times to make the issue surface with _SortMergedDeleteEventRegistry_, but I haven't managed to, sorry!

I tried adding some updates/inserts in between deletes (to make the conditions similar to those in the UTs where the issue appears), but it did not reproduce. That is probably not the only condition required.

> OrcRawRecordMerger reads each row twice
> ---------------------------------------
>
>                 Key: HIVE-26150
>                 URL: https://issues.apache.org/jira/browse/HIVE-26150
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC, Transactions
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Alessandro Solimando
>            Priority: Major
>
> OrcRawRecordMerger reads each row twice. The issue does not surface because the merger is only used with the "collapseEvents" parameter set to true, which filters out one of the two rows.
> collapseEvents true and false should produce the same result: in the current ACID implementation each event has a distinct rowid, so two identical rows cannot exist. Duplicates appear only because of this bug.
> To reproduce the issue, it is sufficient to set the second parameter to false [here|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L2103-L2106] and run the tests in TestOrcRawRecordMerger; two tests fail:
> {code:bash}
> mvn test -Dtest=TestOrcRawRecordMerger -pl ql
> {code}
> {noformat}
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta:1332 Found unexpected row: (0,ignore.1)
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta:1208 Found unexpected row: (0,ignore.1)
> {noformat}
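
To illustrate why collapseEvents=true can hide a double read, here is a toy model of the collapsing step. It is only a sketch and not Hive's actual code: the Key class, the merge method and the sample data are hypothetical names made up for this example, and it assumes events arrive sorted by row key so that duplicates are adjacent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: NOT OrcRawRecordMerger. All names here are hypothetical.
class CollapseSketch {

    // Simplified stand-in for an ACID row key (write id, bucket, row id).
    static final class Key {
        final long writeId;
        final int bucket;
        final long rowId;

        Key(long writeId, int bucket, long rowId) {
            this.writeId = writeId;
            this.bucket = bucket;
            this.rowId = rowId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return writeId == other.writeId && bucket == other.bucket && rowId == other.rowId;
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(new long[] {writeId, bucket, rowId});
        }
    }

    // Assumes the input is sorted by key, so events with equal keys are adjacent.
    // With collapseEvents=true only the first event per key is emitted.
    static List<Key> merge(List<Key> sortedEvents, boolean collapseEvents) {
        List<Key> out = new ArrayList<>();
        Key previous = null;
        for (Key current : sortedEvents) {
            if (collapseEvents && current.equals(previous)) {
                continue; // same key as the previous event: dropped when collapsing
            }
            out.add(current);
            previous = current;
        }
        return out;
    }

    public static void main(String[] args) {
        // Simulates a reader that (wrongly) yields the first row twice.
        List<Key> events = Arrays.asList(
            new Key(1, 0, 0), new Key(1, 0, 0), new Key(1, 0, 1));
        System.out.println("collapseEvents=true:  " + merge(events, true).size() + " rows");
        System.out.println("collapseEvents=false: " + merge(events, false).size() + " rows");
    }
}
```

In this model the duplicated key is silently dropped when collapsing (2 rows), while the non-collapsing path exposes it (3 rows), which matches the "Found unexpected row" symptom described above.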



--
This message was sent by Atlassian Jira
(v8.20.7#820007)