Posted to dev@sling.apache.org by "Julian Sedding (JIRA)" <ji...@apache.org> on 2016/01/28 16:53:40 UTC

[jira] [Commented] (SLING-5421) Handle timeout cause when JCR installer is paused

    [ https://issues.apache.org/jira/browse/SLING-5421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121764#comment-15121764 ] 

Julian Sedding commented on SLING-5421:
---------------------------------------

FYI: In the following, I will call nodes created under {{/system/sling/installer/jcr/pauseInstallation}} "pause-markers".

Another possible reason for an orphaned pause-marker could be that an {{InterruptedException}} is thrown, which causes the NIO resources of an Oak repository to be closed prematurely. This would prevent any further writes to the repository and thus not allow a client to clean up its pause-marker.

Two points to note:
* We need to record the pausing of the JCR installer in the repository, so other instances in a cluster can see it and also pause their installers.
* We need to be able to recognize an orphaned pause-marker and automatically remove it at some point (e.g. during startup).
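
As a rough illustration of the second point, one way to recognize an orphaned pause-marker is by age. The sketch below is hypothetical: the class {{PauseMarkerCleaner}}, its methods, and the in-memory map standing in for the repository nodes are all assumptions made for this example, not part of the JCR installer.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: pause-markers are kept in a map instead of JCR nodes. */
public class PauseMarkerCleaner {

    // Marker name -> creation time; stands in for the child nodes of the pause path.
    private final Map<String, Instant> markers = new ConcurrentHashMap<>();

    // Assumption: a marker older than this is treated as orphaned.
    private final Duration maxAge;

    public PauseMarkerCleaner(Duration maxAge) {
        this.maxAge = maxAge;
    }

    public void addMarker(String name, Instant created) {
        markers.put(name, created);
    }

    /** The installer counts as paused while at least one marker exists. */
    public boolean isPaused() {
        return !markers.isEmpty();
    }

    /** Removes all markers older than maxAge and returns their names. */
    public List<String> removeOrphans(Instant now) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Instant>> it = markers.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Instant> entry = it.next();
            if (Duration.between(entry.getValue(), now).compareTo(maxAge) > 0) {
                removed.add(entry.getKey());
                it.remove();
            }
        }
        return removed;
    }
}
```

Calling {{removeOrphans(Instant.now())}} during startup would correspond to the cleanup suggested above. An age threshold is only a heuristic, since a legitimately long pause would look the same as an orphaned one.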

I believe that this issue shows the limitations of the current implementation, which relies on an unenforced convention between different components in the system (i.e. another service can block the JCR installer indefinitely).

To move forward, there seems to be a consensus in offline discussions that the installer should provide an API to allow services to pause it.

Having an API would then allow making the implementation more robust in a single place. Also, the responsibility for recovery (i.e. removing orphaned pause-markers) would reside with the component that is blocked, and thus allow it to self-heal.
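
To make the idea more concrete, here is a minimal sketch of what such an API could look like. Everything in it is an assumption for illustration: {{InstallerPauseApi}}, {{PauseHandle}}, and the lease-based expiry are invented names and design choices, not an existing or agreed Sling API. The state is held in memory only; a real implementation would still have to record it in the repository so that other cluster instances can see it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not the actual Sling installer API. */
public class InstallerPauseApi {

    /** Handle returned to the caller; closing it ends that caller's pause. */
    public interface PauseHandle extends AutoCloseable {
        @Override
        void close(); // no checked exception, so try-with-resources stays simple
    }

    // Lease id -> expiry time in milliseconds; stands in for persisted pause-markers.
    private final Map<Long, Long> expiryByLease = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong();

    /** Pauses the installer until the handle is closed or the lease expires. */
    public PauseHandle pause(long nowMillis, long leaseMillis) {
        final long id = nextId.incrementAndGet();
        expiryByLease.put(id, nowMillis + leaseMillis);
        return () -> expiryByLease.remove(id);
    }

    /**
     * The installer itself drops expired leases before answering,
     * so a client that crashed cannot block it indefinitely.
     */
    public boolean isPaused(long nowMillis) {
        expiryByLease.values().removeIf(expiry -> expiry <= nowMillis);
        return !expiryByLease.isEmpty();
    }
}
```

A client would wrap its work in {{try (PauseHandle h = api.pause(now, lease)) { ... }}}, so the pause ends on normal completion and on exceptions alike, while the lease covers the case where the process is killed before {{close()}} runs.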

> Handle timeout cause when JCR installer is paused
> -------------------------------------------------
>
>                 Key: SLING-5421
>                 URL: https://issues.apache.org/jira/browse/SLING-5421
>             Project: Sling
>          Issue Type: Improvement
>          Components: Installer
>    Affects Versions: JCR Installer 3.1.8
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>            Priority: Minor
>             Fix For: JCR Installer 3.1.18
>
>
> With SLING-3747 the JCR installer provided a mechanism for pausing the installer to support cases where installation can result in a restart of the installer bundle itself.
> However, it may happen that once this flag is set the process gets abruptly killed and the flag remains set. In such a case the installer would remain paused and a user would have to remove the flag for it to work again. To support such cases there should be some kind of timeout so that the installer does not remain in the paused state forever.


