Posted to issues@hive.apache.org by "Peter Vary (Jira)" <ji...@apache.org> on 2020/07/20 13:33:00 UTC

[jira] [Commented] (HIVE-23883) Streaming does not flush the side file

    [ https://issues.apache.org/jira/browse/HIVE-23883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17161246#comment-17161246 ] 

Peter Vary commented on HIVE-23883:
-----------------------------------

CC: [~kuczoram], [~klcopp]

> Streaming does not flush the side file
> --------------------------------------
>
>                 Key: HIVE-23883
>                 URL: https://issues.apache.org/jira/browse/HIVE-23883
>             Project: Hive
>          Issue Type: Bug
>          Components: Streaming, Transactions
>            Reporter: Peter Vary
>            Priority: Major
>
> When a streaming write commits mid-batch with {{connection.commitTransaction()}}, it tries to flush the side file with {{OrcInputFormat.SHIMS.hflush(flushLengths)}}. This ends up calling {{FSOutputSummer.flush}}, which does not flush the buffered data to disk, so the actual data is not written.
> I had to remove the following check from the end of the streaming tests in {{TestCrudCompactorOnTez.java}}:
> {code:java}
>       CompactorTestUtilities.checkAcidVersion(fs.listFiles(new Path(table.getSd().getLocation()), true), fs,
>           conf.getBoolVar(HiveConf.ConfVars.HIVE_WRITE_ACID_VERSION_FILE),
>           new String[] { AcidUtils.DELTA_PREFIX });
> {code}
> These checks verify the {{_flush_length}} files, and they would fail otherwise.
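
The failure mode described above can be illustrated with a minimal, JDK-only sketch. It does not use Hadoop or Hive code: {{BufferingStream}} and its {{drain()}} method are hypothetical stand-ins for a buffering stream that inherits the no-op {{java.io.OutputStream.flush()}}, and the 8-byte write merely stands in for one side-file entry. The only point it makes is that calling {{flush()}} on such a stream leaves the bytes in the in-memory buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class FlushSketch {

    /** Hypothetical buffering stream; it does NOT override flush(). */
    static class BufferingStream extends OutputStream {
        private final OutputStream sink;
        private final byte[] buf = new byte[512];
        private int count = 0;

        BufferingStream(OutputStream sink) {
            this.sink = sink;
        }

        @Override
        public void write(int b) throws IOException {
            if (count == buf.length) {
                drain(); // buffer full: push it to the sink first
            }
            buf[count++] = (byte) b;
        }

        /** Hypothetical explicit drain: pushes buffered bytes to the sink. */
        void drain() throws IOException {
            sink.write(buf, 0, count);
            count = 0;
        }
        // flush() is inherited from OutputStream, where it does nothing.
    }

    /** Returns the sink size after flush() and after an explicit drain(). */
    static int[] demo() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferingStream out = new BufferingStream(sink);
        out.write(new byte[8]);       // stand-in for one side-file entry
        out.flush();                  // inherited no-op: nothing reaches the sink
        int afterFlush = sink.size();
        out.drain();                  // explicit drain: bytes reach the sink
        int afterDrain = sink.size();
        return new int[] { afterFlush, afterDrain };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = demo();
        System.out.println("after flush: " + sizes[0] + ", after drain: " + sizes[1]);
    }
}
```

In this sketch the sink holds nothing after {{flush()}} and receives the bytes only after the explicit drain. Whether the real fix should drain the buffer, use a different stream, or call another API is not something this sketch decides.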



--
This message was sent by Atlassian Jira
(v8.3.4#803005)