You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Francesco Mari (JIRA)" <ji...@apache.org> on 2016/06/02 12:58:59 UTC

[jira] [Comment Edited] (OAK-4291) FileStore.flush prone to races leading to corruption

    [ https://issues.apache.org/jira/browse/OAK-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15312255#comment-15312255 ] 

Francesco Mari edited comment on OAK-4291 at 6/2/16 12:58 PM:
--------------------------------------------------------------

[~mduerig], I made some modifications in [^OAK-4291-02.patch], In particular:

# {{allReturned()}} doesn't need to be guarded by the {{poolMonitor}}. 
# according to Guava's documentation, the {{Guard}} returned by {{allReturned()}} is called while holding {{poolMonitor}}, so it's safe to access {{disposed}} without further synchronization in {{isSatisfied()}}.
# when flushing, only the writers that were borrowed before the call to {{flush()}} are considered. The implementations in the previous patch flushed every disposed writer, even if some of those writers could have been borrowed, returned and disposed in the middle of the {{flush()}} method. This might be overkill, but I find this implementation more consistent.
# since {{toFlush}} and {{toReturn}} are local to the thread calling {{flush()}}, there is no need for a new {{flushMonitor}}. Two threads calling {{flush{}}} concurrently will operate on a disjoint set of writers. This simplification relies on the implementation of {{SegmentStore.writeSegment()}} to be thread-safe.

Can you take a look at this?


was (Author: frm):
[~mduerig], I made some modifications in [^OAK-4291-02.patch], In particular:
# {{allReturned()}} doesn't need to be guarded by the {{poolMonitor}}. 
# according to Guava's documentation, the {{Guard}} returned by {{allReturned()}} is called while holding {{poolMonitor}}, so it's safe to access {{disposed}} without further synchronization in {{isSatisfied()}}.
# when flushing, only the writers that were borrowed before the call to {{flush()}} are considered. The implementations in the previous patch flushed every disposed writer, even if some of those writers could have been borrowed, returned and disposed in the middle of the {{flush()}} method. This might be overkill, but I find this implementation more consistent.
# since {{toFlush}} and {{toReturn}} are local to the thread calling {{flush()}}, there is no need for a new {{flushMonitor}}. Two threads calling {{flush{}}} concurrently will operate on a disjoint set of writers. This simplification relies on the implementation of {{SegmentStore.writeSegment()}} to be thread-safe.
Can you take a look at this?

> FileStore.flush prone to races leading to corruption
> ----------------------------------------------------
>
>                 Key: OAK-4291
>                 URL: https://issues.apache.org/jira/browse/OAK-4291
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>            Priority: Critical
>              Labels: resilience
>             Fix For: 1.6
>
>         Attachments: OAK-4291-02.patch, OAK_4291-UTs.patch, OAK_4291.patch
>
>
> There is a small window in {{FileStore.flush}} that could lead to data corruption: if we crash right after setting the persisted head but before any delay-flushed {{SegmentBufferWriter}} instance flushes (see {{SegmentBufferWriterPool.returnWriter()}}) then that data is lost although it might already be referenced from the persisted head.
> We need to come up with a test case for this. 
> A possible fix would be to return a future from {{SegmentWriter.flush}} and rely on a completion callback. Such a change would most likely also be useful for OAK-3690. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)