Posted to dev@sling.apache.org by "Ian Boston (JIRA)" <ji...@apache.org> on 2012/10/31 08:17:12 UTC
[jira] [Commented] (SLING-2535) QuartzScheduler:ApacheSling thread group remaining after stopping the scheduler bundle

    [ https://issues.apache.org/jira/browse/SLING-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13487598#comment-13487598 ] 

Ian Boston commented on SLING-2535:
-----------------------------------

Testing in the trunk at 1403475 shows that the thread pool does shut down if Sling has just been started and no jobs have been run.

31.10.2012 17:48:56.013 *INFO* [401258966@qtp-1399560387-11] org.apache.sling.commons.threads.impl.DefaultThreadPool Shutting down thread pool [ThreadPool-0a751a8b-e907-4993-806f-64923e3d85cd (Apache Sling Eventing Thread Pool)] ...
31.10.2012 17:48:56.013 *INFO* [401258966@qtp-1399560387-11] org.apache.sling.commons.threads.impl.DefaultThreadPool Thread pool [ThreadPool-0a751a8b-e907-4993-806f-64923e3d85cd (Apache Sling Eventing Thread Pool)] is shut down.

JMX shows that the thread group disappears when the bundle unloads.
(My JDK is 1.6 Java HotSpot(TM) 64-Bit Server VM version 20.12-b01-434)

The default configuration for thread pools is a non-graceful shutdown, so it's not an issue with threads in the thread group being slow to terminate.

Looking at the code:

org.apache.sling.commons.scheduler.impl.QuartzScheduler.QuartzThreadPool.shutdown(boolean) does nothing.

QuartzThreadPool.shutdown is called by org.quartz.core.QuartzScheduler.shutdown(boolean) at line 677:

        resources.getThreadPool().shutdown(waitForJobsToComplete);


The comment on org.apache.sling.commons.scheduler.impl.QuartzScheduler.QuartzThreadPool.shutdown(boolean) indicates that the pool is managed by the thread pool manager.

In org.apache.sling.commons.scheduler.impl.QuartzScheduler.dispose(Scheduler), line 222, tpm.release(this.threadPool) is called.
This calls org.apache.sling.commons.threads.impl.DefaultThreadPoolManager.Entry.decUsage(). When the reference counter reaches zero, decUsage() shuts the thread pool down by calling org.apache.sling.commons.threads.impl.DefaultThreadPool.shutdown(), which calls down to the JDK.

DefaultThreadPool.shutdown() should emit two messages:
        this.logger.info("Shutting down thread pool [{}] ...", name);
followed by
        this.logger.info("Thread pool [{}] is shut down.", this.name);
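The release path described above can be sketched roughly as follows. This is only an illustration, not the Sling code: the class name RefCountedPool, the private lock object and the use of shutdownNow() for the non-graceful case are all assumptions, and System.out stands in for the logger.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical stand-in for a reference-counted pool entry; not the Sling class. */
final class RefCountedPool {
    private final String name;
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    private final Object lock = new Object();
    private int usage; // guarded by lock

    RefCountedPool(String name) {
        this.name = name;
    }

    /** Registers one more user of the pool. */
    RefCountedPool incUsage() {
        synchronized (lock) {
            usage++;
        }
        return this;
    }

    /** Drops one user; the last release shuts the executor down. */
    void decUsage() {
        synchronized (lock) {
            if (usage == 0) {
                throw new IllegalStateException("released more often than acquired");
            }
            usage--;
            if (usage == 0) {
                System.out.println("Shutting down thread pool [" + name + "] ...");
                executor.shutdownNow(); // assumption: models the non-graceful default
                System.out.println("Thread pool [" + name + "] is shut down.");
            }
        }
    }

    boolean isShutdown() {
        return executor.isShutdown();
    }
}
```

If an increment is ever lost, the counter is off by one and either the final decUsage() never sees zero or an earlier one sees it too soon. The first case would match a thread pool that stays alive after the bundle stops.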
 
The incUsage and decUsage methods keep a reference count in an int, protected by synchronized(this.pool) where this.pool is the pool they were added to ... except for one location.

org.apache.sling.commons.threads.impl.DefaultThreadPoolManager.create(ThreadPoolConfig) does this:

    final Entry entry = new Entry(null, config, name);
    synchronized ( this.pools ) {
        this.pools.put(name, entry);
    }
    return entry.incUsage();

Here entry.incUsage() runs outside the synchronized block, which could result in an invalid reference count, so that decUsage never calls the ThreadExecutor shutdown.
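One way to close that window, sketched under assumptions, is to do the first increment inside the same synchronized block that publishes the entry. PoolRegistry, its nested Entry and the usageOf helper are hypothetical names for this illustration. Whether this is the right change for DefaultThreadPoolManager is not confirmed here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry; it mirrors the shape of the snippet above but is not the Sling code. */
final class PoolRegistry {

    /** Minimal entry that only tracks its usage count. */
    static final class Entry {
        private int usage; // guarded by the registry's 'pools' lock
    }

    private final Map<String, Entry> pools = new HashMap<>();

    /** Registers the entry and counts the first user under the same lock. */
    Entry create(String name) {
        synchronized (pools) {
            Entry entry = new Entry();
            pools.put(name, entry);
            entry.usage++; // no other thread can see the entry before this increment
            return entry;
        }
    }

    /** Looks up an existing entry and counts another user, or returns null. */
    Entry get(String name) {
        synchronized (pools) {
            Entry entry = pools.get(name);
            if (entry != null) {
                entry.usage++;
            }
            return entry;
        }
    }

    /** Drops one user; returns true when that was the last reference and the entry was removed. */
    boolean release(String name) {
        synchronized (pools) {
            Entry entry = pools.get(name);
            if (entry == null) {
                return false;
            }
            if (--entry.usage == 0) {
                pools.remove(name);
                return true;
            }
            return false;
        }
    }

    /** Current usage count, or 0 when the name is not registered. */
    int usageOf(String name) {
        synchronized (pools) {
            Entry entry = pools.get(name);
            return entry == null ? 0 : entry.usage;
        }
    }
}
```

Because every read and write of the count happens while holding the same lock, a concurrent get or release cannot interleave with the first increment.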


To recap:
If there is no 
Shutting down thread pool [ThreadPool-0a751a8b-e907-4993-806f-64923e3d85cd (Apache Sling Eventing Thread Pool)] 

in the logs, there is a race condition in decUsage/incUsage.

If it is there, but there is no
Thread pool [ThreadPool-0a751a8b-e907-4993-806f-64923e3d85cd (Apache Sling Eventing Thread Pool)] is shut down.

then it's an issue with running jobs in the Quartz scheduler.
Since the default configuration of the thread pools (unless there is an explicit configuration) seems to be a non-graceful shutdown, a race condition looks like the more likely cause.
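The decision rule in the recap can be applied to a captured log with a small helper like the one below. It is a sketch: the class ShutdownLogCheck, the Verdict enum and the plain substring matching are assumptions made for illustration. Only the two message texts come from the log output quoted earlier.

```java
import java.util.List;

/** Hypothetical helper that applies the recap's decision rule to log lines. */
final class ShutdownLogCheck {

    enum Verdict { NEVER_STARTED, STARTED_NOT_FINISHED, CLEAN }

    /** Classifies the shutdown of the named pool from the messages present in the log. */
    static Verdict diagnose(List<String> logLines, String poolName) {
        String started = "Shutting down thread pool [" + poolName + "]";
        String finished = "Thread pool [" + poolName + "] is shut down.";
        boolean sawStart = false;
        boolean sawFinish = false;
        for (String line : logLines) {
            if (line.contains(started)) {
                sawStart = true;
            }
            if (line.contains(finished)) {
                sawFinish = true;
            }
        }
        if (!sawStart) {
            return Verdict.NEVER_STARTED;        // points at the reference count
        }
        if (!sawFinish) {
            return Verdict.STARTED_NOT_FINISHED; // points at running Quartz jobs
        }
        return Verdict.CLEAN;
    }
}
```

The poolName argument is the full bracketed name as logged, for example the "ThreadPool-... (Apache Sling Eventing Thread Pool)" string.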

In addition, FindBugs reports:
VO_VOLATILE_INCREMENT
This code increments a volatile field. Increments of volatile fields aren't atomic. If more than one thread is incrementing the field at the same time, increments could be lost.

(FindBugs doesn't detect the potential synchronization issue, nor that the volatile field is protected by synchronization.)
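To make the FindBugs warning concrete, the sketch below contrasts an unguarded volatile increment with two guarded variants. The class UsageCounters and its hammer method are hypothetical and exist only for this illustration. They are not taken from the Sling sources.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical counters contrasting the flagged pattern with two safe variants. */
final class UsageCounters {
    private volatile int racy;   // the pattern VO_VOLATILE_INCREMENT flags
    private int locked;          // guarded by 'this'
    private final AtomicInteger atomic = new AtomicInteger();

    void incRacy() { racy++; }   // read, add, write: updates can be lost under contention
    synchronized void incLocked() { locked++; }
    void incAtomic() { atomic.incrementAndGet(); }

    int racy() { return racy; }
    synchronized int locked() { return locked; }
    int atomic() { return atomic.get(); }

    /** Runs 'threads' workers, each calling all three increments 'perThread' times. */
    static UsageCounters hammer(int threads, int perThread) {
        UsageCounters c = new UsageCounters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    c.incRacy();
                    c.incLocked();
                    c.incAtomic();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return c;
    }
}
```

After hammer(4, 10000) the locked and atomic counters always read 40000, while the racy one may read less. A volatile field that is only ever incremented inside a synchronized block behaves like the locked variant, which is why the warning can be a false positive.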


To fix this for certain, I would need to be able to reproduce it. Is there a reliable way?

(Sorry for the long comment)







                
> QuartzScheduler:ApacheSling thread group remaining after stopping the scheduler bundle
> --------------------------------------------------------------------------------------
>
>                 Key: SLING-2535
>                 URL: https://issues.apache.org/jira/browse/SLING-2535
>             Project: Sling
>          Issue Type: Bug
>          Components: Commons
>    Affects Versions: Commons Scheduler 2.3.4
>            Reporter: Felix Meschberger
>
> When the Scheduler bundle is stopped, the threads (probably the thread pool) are cleaned up, but the thread group "QuartzScheduler:ApacheSling" remains. For ultimate cleanup, I would think the thread group should also be destroyed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira