Posted to issues@storm.apache.org by "Jungtaek Lim (JIRA)" <ji...@apache.org> on 2018/12/02 07:19:00 UTC

[jira] [Resolved] (STORM-3292) Trident HiveState must flush writers when the batch commits

     [ https://issues.apache.org/jira/browse/STORM-3292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved STORM-3292.
---------------------------------
       Resolution: Fixed
         Assignee: Arun Mahadevan
    Fix Version/s: 1.2.3
                   2.0.0

Thanks [~arunmahadevan], I merged into master and 1.x-branch.

> Trident HiveState must flush writers when the batch commits
> -----------------------------------------------------------
>
>                 Key: STORM-3292
>                 URL: https://issues.apache.org/jira/browse/STORM-3292
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Arun Mahadevan
>            Assignee: Arun Mahadevan
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0, 1.2.3
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For Trident, the Hive writer is flushed only after it reaches the batch size.
> See https://github.com/apache/storm/blob/master/external/storm-hive/src/main/java/org/apache/storm/hive/trident/HiveState.java#L108
> Trident HiveState does not flush during the batch commit, which appears to be an oversight. Without this flush, the Trident state cannot guarantee at-least-once processing. (For example, if a Hive transaction is still open when Trident moves to the next txid and a failure occurs later, the data in the open transaction is lost.)
> So I think that, for at-least-once, HiveState must flush all the writers irrespective of the batch size when Trident invokes "commit(txid)".
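
The behaviour described in the quoted report can be illustrated with a minimal, self-contained sketch. This is not the actual HiveState code or the merged patch: BufferingWriter, FlushOnCommitState, and all of their methods are hypothetical stand-ins that only model "flush on batch size" versus "flush on commit".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Hive writer: records stay in a pending
// buffer and only become durable when flush() is called.
class BufferingWriter {
    private final List<String> pending = new ArrayList<>();
    private final List<String> durable = new ArrayList<>();

    void write(String record) {
        pending.add(record);
    }

    int pendingCount() {
        return pending.size();
    }

    void flush() {
        durable.addAll(pending);
        pending.clear();
    }

    int durableCount() {
        return durable.size();
    }
}

// Hypothetical state object (NOT the real HiveState). It keeps one
// writer per partition key and flushes in two places: when a writer
// reaches the batch size, and unconditionally on commit.
class FlushOnCommitState {
    private final int batchSize;
    private final Map<String, BufferingWriter> writers = new LinkedHashMap<>();

    FlushOnCommitState(int batchSize) {
        this.batchSize = batchSize;
    }

    // Size-based flush only: records below the batch size stay pending.
    void update(String partition, List<String> records) {
        BufferingWriter writer =
            writers.computeIfAbsent(partition, key -> new BufferingWriter());
        for (String record : records) {
            writer.write(record);
            if (writer.pendingCount() >= batchSize) {
                writer.flush();
            }
        }
    }

    // The behaviour the issue asks for: flush every writer on commit,
    // irrespective of how many records each one is holding.
    void commit(long txId) {
        for (BufferingWriter writer : writers.values()) {
            writer.flush();
        }
    }

    int durableCount(String partition) {
        BufferingWriter writer = writers.get(partition);
        return writer == null ? 0 : writer.durableCount();
    }
}

class FlushOnCommitDemo {
    public static void main(String[] args) {
        FlushOnCommitState state = new FlushOnCommitState(5);
        state.update("p1", Arrays.asList("a", "b", "c"));
        System.out.println("durable before commit: " + state.durableCount("p1"));
        state.commit(1L);
        System.out.println("durable after commit: " + state.durableCount("p1"));
    }
}
```

In this sketch, three records written with a batch size of five remain pending until commit(1L) is called. Without the loop in commit, they would stay pending while later batches proceed, which is the data-loss window the report describes.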



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)