Posted to dev@felix.apache.org by "Thomas Watson (JIRA)" <ji...@apache.org> on 2017/06/20 19:18:01 UTC

[jira] [Comment Edited] (FELIX-5416) Endless loop throwing InterruptedException when shutting down framework

    [ https://issues.apache.org/jira/browse/FELIX-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056276#comment-16056276 ] 

Thomas Watson edited comment on FELIX-5416 at 6/20/17 7:17 PM:
---------------------------------------------------------------

This is a real issue and the SCR code is wrong.  Here is the sequence of events:

1) The thread stopping the framework is interrupted (in this case it was gogo, but anything could do that).
2) The framework proceeds to stop the bundles
3) The SCR bundle is stopped (current thread is still interrupted)
4) The SCR activator enters org.apache.felix.scr.impl.ComponentActorThread.terminate()
5) A TERMINATION_TASK is scheduled and the lock on 'tasks' is immediately obtained

Here is where the failure happens.  If step 5 obtains the 'tasks' lock (in order to wait for tasks to be empty) before the consuming code in ComponentActorThread.run() obtains it to consume the TERMINATION_TASK, we enter an endless loop of interruptions.  All of this happens while holding the 'tasks' lock, which also causes a deadlock and prevents the TERMINATION_TASK from ever being consumed.
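
To make the loop concrete, here is a minimal, self-contained sketch. It is not the SCR source: the class InterruptedWaitDemo, the method countImmediateInterrupts and the maxAttempts bound are invented for the example, and it assumes the catch block re-asserts the interrupt flag, which is one way such a loop can arise. Because the flag is already set when wait() is called, each call throws InterruptedException straight away instead of blocking.

```java
import java.util.LinkedList;

final class InterruptedWaitDemo {

    /** Returns how many times wait() threw, stopping after maxAttempts. */
    static int countImmediateInterrupts(int maxAttempts) {
        final LinkedList<Runnable> tasks = new LinkedList<>();
        tasks.add(() -> { }); // stands in for a task that is never consumed

        // Step 1 of the sequence: the calling thread is already interrupted.
        Thread.currentThread().interrupt();

        int thrown = 0;
        synchronized (tasks) {
            // The bound exists only so the demo ends; without it this spins forever.
            while (!tasks.isEmpty() && thrown < maxAttempts) {
                try {
                    tasks.wait();
                } catch (InterruptedException e) {
                    thrown++;
                    // ASSUMPTION: the flag is re-asserted here, so the next
                    // wait() throws again immediately.
                    Thread.currentThread().interrupt();
                }
            }
        }
        Thread.interrupted(); // clear the flag so the caller is unaffected
        return thrown;
    }

    public static void main(String[] args) {
        System.out.println("wait() threw " + countImmediateInterrupts(5) + " times");
    }
}
```

Every attempt allowed by the bound ends in an exception, which matches the repeated "Interrupted exception waiting for queue to empty" entries in the reported log.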

The terminate() method should be updated to something like the following in order to correctly ignore interrupted exceptions.

{code}
    void terminate()
    {
        schedule( TERMINATION_TASK );
        synchronized (tasks)
        {
            while (!tasks.isEmpty())
            {
                final boolean interrupted = Thread.interrupted();
                try
                {
                    tasks.wait();
                }
                catch (InterruptedException e)
                {
                    logger.log(LogService.LOG_ERROR,
                        "Interrupted exception waiting for queue to empty", e);
                }
                finally
                {
                    if (interrupted)
                    { // restore interrupt status
                        Thread.currentThread().interrupt();
                    }
                }
            }
        }
    }
{code}
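
For anyone who wants to try the idea outside of SCR, the following is a runnable sketch of a common variation: remember that an interrupt occurred, keep waiting, and restore the flag only after the queue has drained. It is an illustration under stated assumptions, not the actual patch. The class TerminateSketch, the consumeOne() method and demo() are hypothetical names, and the consumer is reduced to taking a single task.

```java
import java.util.LinkedList;

final class TerminateSketch {
    private final LinkedList<Runnable> tasks = new LinkedList<>();

    void schedule(Runnable task) {
        synchronized (tasks) {
            tasks.add(task);
            tasks.notifyAll();
        }
    }

    /** Hypothetical consumer step: take one task and wake any waiters. */
    void consumeOne() throws InterruptedException {
        Runnable task;
        synchronized (tasks) {
            while (tasks.isEmpty()) {
                tasks.wait();
            }
            task = tasks.removeFirst();
            tasks.notifyAll();
        }
        task.run();
    }

    void terminate(Runnable terminationTask) {
        schedule(terminationTask);
        boolean interrupted = false;
        synchronized (tasks) {
            while (!tasks.isEmpty()) {
                try {
                    tasks.wait();
                } catch (InterruptedException e) {
                    // The exception cleared the flag, so the next wait() can
                    // block and release the lock for the consumer.
                    interrupted = true;
                }
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore interrupt status
        }
    }

    /** Runs terminate() on an interrupted thread; returns whether the flag survived. */
    static boolean demo() {
        TerminateSketch queue = new TerminateSketch();
        Thread consumer = new Thread(() -> {
            try {
                queue.consumeOne();
            } catch (InterruptedException ignored) {
                // not expected in this demo
            }
        });
        consumer.start();

        Thread.currentThread().interrupt();
        queue.terminate(() -> { });
        boolean restored = Thread.interrupted(); // reads and clears the flag
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return restored;
    }

    public static void main(String[] args) {
        System.out.println("interrupt status restored: " + demo());
    }
}
```

Restoring the flag after the loop rather than inside it means the waiting thread never sees its own re-asserted interrupt on the next iteration, while callers further up the stack can still observe that an interrupt happened.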


was (Author: tjwatson):

{code}
    void terminate()
    {
        schedule( TERMINATION_TASK );
        while (!tasks.isEmpty()) {
            final boolean interrupted = Thread.interrupted();
            try {
                tasks.wait();
            } catch (InterruptedException e) {
                logger.log(LogService.LOG_ERROR,
                        "Interrupted exception waiting for queue to empty", e);
            } finally {
                if (interrupted) { // restore interrupt status
                    Thread.currentThread().interrupt();
                }
            }
        }
    }
{code}

> Endless loop throwing InterruptedException when shutting down framework
> -----------------------------------------------------------------------
>
>                 Key: FELIX-5416
>                 URL: https://issues.apache.org/jira/browse/FELIX-5416
>             Project: Felix
>          Issue Type: Bug
>          Components: Declarative Services (SCR)
>    Affects Versions: scr-2.0.6
>         Environment: OS: linux and windows
> Java Version: 1.8.0_111
> OSGi Impl: Apache Felix (5.6.1)
>            Reporter: Jorge Cercas
>
> When shutting down the framework via the framework's stop method or via the stop 0 command in a gogo shell, Apache Felix Declarative Services goes into a never-ending loop with the following log entries:
> 2016-11-16 17:44:22,030 | ERROR | FelixStartLevel  | scr                              | 7 - org.apache.felix.scr - 2.0.6 | Interrupted exception waiting for queue to empty
> java.lang.InterruptedException
> 	at java.lang.Object.wait(Native Method)[:1.8.0_111]
> 	at java.lang.Object.wait(Object.java:502)[:1.8.0_111]
> 	at org.apache.felix.scr.impl.ComponentActorThread.terminate(ComponentActorThread.java:131)[7:org.apache.felix.scr:2.0.6]
> 	at org.apache.felix.scr.impl.Activator.doStop(Activator.java:216)[7:org.apache.felix.scr:2.0.6]
> 	at org.apache.felix.utils.extender.AbstractExtender.stop(AbstractExtender.java:128)[7:org.apache.felix.scr:2.0.6]
> 	at org.apache.felix.scr.impl.Activator.stop(Activator.java:181)[7:org.apache.felix.scr:2.0.6]
> 	at org.apache.felix.framework.util.SecureAction.stopActivator(SecureAction.java:719)[org.apache.felix.framework-5.6.1.jar:]
> 	at org.apache.felix.framework.Felix.stopBundle(Felix.java:2610)[org.apache.felix.framework-5.6.1.jar:]
> 	at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1389)[org.apache.felix.framework-5.6.1.jar:]
> 	at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:308)[org.apache.felix.framework-5.6.1.jar:]
> 	at java.lang.Thread.run(Thread.java:745)[:1.8.0_111]


