Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/07/27 20:52:00 UTC

[jira] [Commented] (SAMZA-1768) Handle corrupted OFFSET file elegantly

    [ https://issues.apache.org/jira/browse/SAMZA-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560316#comment-16560316 ] 

ASF GitHub Bot commented on SAMZA-1768:
---------------------------------------

GitHub user xinyuiscool opened a pull request:

    https://github.com/apache/samza/pull/588

    SAMZA-1768: Handle corrupted OFFSET file

    This patch addresses the following tickets:
    
    SAMZA-1778: SIGSEGV when reading properties (metrics) on a closed RocksDB store
    SAMZA-1777: Logged store OFFSET file write during flush should be atomic
    SAMZA-1768: Handle corrupted OFFSET file elegantly

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xinyuiscool/samza SAMZA-1768

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/588.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #588
    
----
commit 27e24e8238b39f0cf927d34fbada8f6e8d22e2c0
Author: xinyuiscool <xi...@...>
Date:   2018-07-26T18:14:45Z

    SAMZA-1768: Handle corrupted OFFSET file elegantly

commit 853f48ab9909fcad410f651d03f2f98507e6222d
Author: xinyuiscool <xi...@...>
Date:   2018-07-26T18:31:50Z

    Minor changes

----


> Handle corrupted OFFSET file elegantly
> --------------------------------------
>
>                 Key: SAMZA-1768
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1768
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Xinyu Liu
>            Assignee: Xinyu Liu
>            Priority: Major
>
> TaskStorageManager.readOffsetFile() has a bug: if the OFFSET file is corrupted, it throws an exception and shuts down the container. If host affinity is turned on, the container won't be able to start up again, since it reads the same corrupted OFFSET file every time until the file is manually removed. Since we cannot recover in this case, we should catch the exception, return null, and let the store bootstrap from the changelog.
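
The recovery behavior described above could look roughly like the following. This is only an illustrative sketch, not the code from the pull request: the class OffsetFileReader, the method signature, and the assumption that a valid OFFSET file holds a single decimal number are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Samza: shows the "catch and return null" idea.
public class OffsetFileReader {

    /**
     * Returns the offset stored in the given file, or null if the file is
     * missing, unreadable, or does not hold a valid offset. A null result
     * signals the caller to bootstrap the store from the changelog.
     */
    static String readOffsetFile(Path offsetFile) {
        if (!Files.exists(offsetFile)) {
            return null;
        }
        try {
            String content =
                new String(Files.readAllBytes(offsetFile), StandardCharsets.UTF_8).trim();
            // Assumption for this sketch: a valid offset is a decimal long.
            // parseLong throws NumberFormatException on anything else.
            Long.parseLong(content);
            return content;
        } catch (IOException | RuntimeException e) {
            // Do not propagate: a corrupted file must not kill the container.
            System.err.println("Ignoring unreadable OFFSET file " + offsetFile + ": " + e);
            return null;
        }
    }
}
```

Returning null rather than rethrowing means a container restarted on the same host does not fail repeatedly on the same bad file; the cost is a full restore from the changelog for that store.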



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)