Posted to commits@samza.apache.org by "Ajo Thomas (Jira)" <ji...@apache.org> on 2022/12/16 17:40:00 UTC

[jira] [Updated] (SAMZA-2749) Startpoint bug fix

     [ https://issues.apache.org/jira/browse/SAMZA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajo Thomas updated SAMZA-2749:
------------------------------
    Fix Version/s: 1.8

> Startpoint bug fix
> ------------------
>
>                 Key: SAMZA-2749
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2749
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Alan Zhang
>            Assignee: Alan Zhang
>            Priority: Major
>             Fix For: 1.8
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Within Samza (the main class to look at is OffsetManager.scala), there is a bug in which a startpoint can be deleted before it is actually used for message consumption. When a container hits this situation, the startpoint is ignored and consumption resumes from the last message processed before the startpoint was applied. The sequence of events is:
>  # Load last processed offsets and startpoints
>  # Use startpoints to register starting offsets for consumers
>  # Message processing starts, but messages for only some of the partitions are received
>  # Write checkpoint using last processed offsets
>  ## If a partition did not get messages, then the last processed offset is still the offset from before the startpoint.
>  # Delete startpoints
>  # Container dies (e.g. due to running out of memory)
>  # On restart, load last processed offsets (startpoints have been deleted)
>  ## The partitions that did have messages in the previous deployment will have the correct checkpoint.
>  ## The partitions that did not have messages will have the checkpoint set to the offset from before the startpoint was applied. This is unexpected, and it means that bootstrapping does not happen for those partitions.
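
The sequence above can be reproduced with a small toy model. This is only an illustrative sketch, not Samza code: ToyOffsetStore, run_container, and the delete_eagerly flag are hypothetical names, offsets are plain integers, and a container simply resumes at the stored value. The non-eager branch shows one conceivable mitigation (keep a startpoint until its partition has processed a message); it is an assumption and not necessarily the fix shipped for SAMZA-2749.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyOffsetStore:
    """Hypothetical stand-in for durable checkpoint and startpoint storage."""
    checkpoints: Dict[str, int]   # partition -> last processed offset
    startpoints: Dict[str, int]   # partition -> requested starting offset


def run_container(store: ToyOffsetStore,
                  received: Dict[str, int],
                  delete_eagerly: bool) -> Dict[str, int]:
    """Simulate one container lifetime and return the offsets it started from.

    `received` maps each partition that got messages in this run to the last
    offset it processed; partitions missing from it received nothing.
    """
    # Steps 1-2: load last processed offsets and startpoints; a startpoint,
    # when present, overrides the checkpointed offset as the starting position.
    last_processed = dict(store.checkpoints)
    starting = {p: store.startpoints.get(p, off)
                for p, off in last_processed.items()}

    # Step 3: only some partitions receive messages.
    for partition, offset in received.items():
        last_processed[partition] = offset

    # Step 4: write the checkpoint from the last processed offsets. Idle
    # partitions still carry their pre-startpoint offset here.
    store.checkpoints = dict(last_processed)

    # Step 5: delete startpoints.
    if delete_eagerly:
        # Behaviour described in the report: everything is deleted at once.
        store.startpoints.clear()
    else:
        # Assumed mitigation: drop a startpoint only after its partition has
        # actually processed a message.
        for partition in received:
            store.startpoints.pop(partition, None)

    return starting


def simulate(delete_eagerly: bool) -> Dict[str, int]:
    """Run, 'crash', restart, and return the offsets used after the restart."""
    store = ToyOffsetStore(checkpoints={"p0": 10, "p1": 10},
                           startpoints={"p0": 100, "p1": 100})
    run_container(store, received={"p0": 105}, delete_eagerly=delete_eagerly)
    # Steps 6-7: the container dies and restarts with whatever was persisted.
    return run_container(store, received={}, delete_eagerly=delete_eagerly)


if __name__ == "__main__":
    print("eager delete:", simulate(delete_eagerly=True))
    print("lazy delete: ", simulate(delete_eagerly=False))
```

In this model, partition p1 receives no messages in the first run. With eager deletion it restarts from offset 10, the value from before the startpoint, while p0 correctly resumes from 105. With the assumed lazy deletion, p1 still starts from its startpoint of 100 after the restart.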



--
This message was sent by Atlassian Jira
(v8.20.10#820010)