Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2019/12/30 21:51:00 UTC

[jira] [Resolved] (BEAM-9039) Fix Datachannel stuckness on errors

     [ https://issues.apache.org/jira/browse/BEAM-9039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Burke resolved BEAM-9039.
--------------------------------
    Resolution: Fixed

> Fix Datachannel stuckness on errors
> -----------------------------------
>
>                 Key: BEAM-9039
>                 URL: https://issues.apache.org/jira/browse/BEAM-9039
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-go
>    Affects Versions: Not applicable
>            Reporter: Robert Burke
>            Assignee: Robert Burke
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Catch-all task for any data channel stuckness issues, in particular those that happen on errors. Re-open this issue if a new one is found.
> The last known one is a race condition on DataChannel.readErr.
> Close off a race condition where a closing DataChannel might have new readers created for it while it is failing, causing bundles to get stuck.
> In particular, c.readErr must only be read or written while c.mu is held.
> Otherwise, something like the following happens.
> Given a channel C and goroutines G1 and G2:
>  1. G1: A request for a new reader on C arrives, checks C.readErr, and finds it nil.
>  2. G2: An error occurs on reading. The lock is acquired and C.readErr is set. Readers are closed. The channel is officially closed with C.forceRecreate, removing it from the DataManager cache.
>  3. G1: Calls C.makeReader and creates a new reader on the now-closed channel.
> There could be an arbitrary number of G1s.
>  
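The locking rule described above can be illustrated with a minimal Go sketch. This is not the SDK's actual code: the dataChannel and reader types, the openRead and fail methods, and errChannelFailed are all hypothetical names chosen for illustration. The only point it shows is that the readErr check and the reader creation happen in one critical section, so a concurrent failure cannot slip in between them.

```go
package main

import (
	"errors"
	"sync"
)

// errChannelFailed is a hypothetical error used only in this sketch.
var errChannelFailed = errors.New("data channel read failed")

// reader is a hypothetical per-stream reader.
type reader struct {
	closed bool // written only while the owning channel's mu is held
}

// dataChannel is a hypothetical stand-in for the SDK's DataChannel.
type dataChannel struct {
	mu      sync.Mutex
	readErr error              // guarded by mu
	readers map[string]*reader // guarded by mu
}

func newDataChannel() *dataChannel {
	return &dataChannel{readers: make(map[string]*reader)}
}

// openRead checks readErr and registers the reader while holding mu,
// so no reader can be created after fail has marked the channel as failed.
func (c *dataChannel) openRead(id string) (*reader, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.readErr != nil {
		return nil, c.readErr
	}
	r, ok := c.readers[id]
	if !ok {
		r = &reader{}
		c.readers[id] = r
	}
	return r, nil
}

// fail records the first read error and closes all existing readers,
// all while holding mu.
func (c *dataChannel) fail(err error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.readErr == nil {
		c.readErr = err
	}
	for _, r := range c.readers {
		r.closed = true
	}
}
```

With this shape, any number of goroutines may call openRead concurrently with fail: each call either completes before the failure (and its reader is closed by fail) or observes readErr and returns the error without creating a reader.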



--
This message was sent by Atlassian Jira
(v8.3.4#803005)