You are viewing a plain text version of this content. The canonical link for it is here.
Posted to oak-issues@jackrabbit.apache.org by "Francesco Mari (JIRA)" <ji...@apache.org> on 2016/03/02 16:28:18 UTC

[jira] [Comment Edited] (OAK-4083) Simplify concurrency when loading data from the primary

    [ https://issues.apache.org/jira/browse/OAK-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175596#comment-15175596 ] 

Francesco Mari edited comment on OAK-4083 at 3/2/16 3:27 PM:
-------------------------------------------------------------

I consider the implementation potentially buggy because it uses Netty in potentially buggy ways. There are very suspicious interactions between {{StandbyClientHandler}} and {{SegmentLoaderHandler}} due to the fact that they share a {{ChannelHandlerContext}}.

Moreover, the implementation presents at least an observable buggy behaviour. OAK-4058 was not completely fixed by my latest effort. Even if the frequency of thrown {{RejectedExecutionException}} is lower with my fix, they still occur. Confirming my intuition is very difficult because a lot depends on the internals of Netty.

I fixed the issue that I reported above in a local checkout, and the problems described in OAK-4058 seem to be completely gone. Moreover, the overall implementation is greatly simplified and uses less system resources than before. I will attach a patch as soon as I'm convinced by my changes.


was (Author: frm):
The implementation actually is potentially buggy because it's using Netty in potentially buggy ways. There are very suspicious interactions between {{StandbyClientHandler}} and {{SegmentLoaderHandler}} due to the fact that they share a {{ChannelHandlerContext}}.

Moreover, the implementation actually presents at least an observable buggy behaviour. OAK-4058 was not completely fixed by my latest effort. Even if the frequency of thrown {{RejectedExecutionException}} is lower with my fix, they still occur. Confirming my intuition is very difficult, because a lot depends on the internals on Netty.

I fixed on my local checkouts the issue that I reported above, and the problems described in OAK-4058 seem to be completely gone. Moreover, the overall implementation is greatly simplified and uses less system resources than before. I will attach a patch as soon as my tests are successful.

> Simplify concurrency when loading data from the primary
> -------------------------------------------------------
>
>                 Key: OAK-4083
>                 URL: https://issues.apache.org/jira/browse/OAK-4083
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: tarmk-standby
>            Reporter: Francesco Mari
>            Assignee: Francesco Mari
>             Fix For: 1.4
>
>
> The cold standby client is overly complicated and potentially buggy when segments are loaded from the primary.
> In particular:
> - when the {{StandbyClientHandler}} is removed from the pipeline, its {{EventExecutorGroup}} is passed to the {{SegmentLoaderHandler}} to signal a surrogate of the "channel close" event. It would be more sensible for {{SegmentLoaderHandler}} to listen to the proper "channel close" event instead.
> - after the {{SegmentLoaderHandler}} is added to the pipeline, the "channel active" callback method is called directly using the {{ChannelHandlerContext}} of the {{StandbyClientHandler}}. It would be better for {{SegmentLoaderHandler}} to use a proper initialisation point like, in example, the "handler added" callback method. The {{SegmentLoaderHandler}} should also use its own {{ChannelHandlerContext}}, instead of owning one from another handler.
> - {{SegmentLoaderHandler}} saves and uses the {{ChannelHandlerContext}} of the {{StandbyClientHandler}}. The only purpose for it is to use the {{EventExecutorGroup}} of the {{StandbyClientHandler}} to issue "write messages" events. The {{SegmentLoaderHandler}} should use Netty's event loop to issue I/O operations, and instead use a different thread for business logic - in this case, the synchronization process between primary and standby instances.
> - The {{StandbyClientHandler}} is registered with its own {{EventExecutorGroup}} with four threads. This is overkill, since the synchronization process should run anyway in a separate thread (see below). The {{StandbyClientHandler}} should use the default event loop.
> - The {{SegmentLoaderHandler}} is registered with its own {{EventExecutorGroup}} with four threads. This is overkill, since request/response cycle between the standby and the primary instances is synchronous. The {{SegmentLoaderHandler}} should instead use the default event loop for I/O events and run another (single) thread to execute the synchronization process.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)