You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@felix.apache.org by Karl Pauls <ka...@gmail.com> on 2008/02/02 01:20:32 UTC

Re: Deadlock in Felix

On Jan 30, 2008 11:31 AM, Stuart McCulloch <st...@jayway.net> wrote:
> On 30/01/2008, Felix Meschberger <fm...@gmail.com> wrote:
> >
> > Hi Niclas,
> >
> > The problem is (a) the generous synchronisation of Log4J and (b) locking
> > used by stuff used for class loading. In our projects we regularly face
> > issues between Log4J and our ClassLoader implementations synchronizing
> > on ClassLoader.loadClass().
> >
> > The deadlock occurrs because both parties - framework and Log4J - lock
> > "big" parts of their code and call to code outside of their scope while
> > being locked: the framework calls the LogService outside of the
> > framework and Log4J calls into class loading outside of Log4J.
> >
> > On solution, I could imagine, is not using Log4J, which may or may not
> > be an option. Maybe SLF4J or Logback could be an option here ? [ In
> > Sling we actually use Logback as a logging backend for our LogService
> > implementation ]
> >
> > Another solution would be to enhance the framework Logger to diable the
> > use of a LogService. E.g. by defining a framework property, which when
> > set, causes the Logger to never use the LogService.
> >
> > Both solutions don't sound right ...
>
>
> other possible solutions:
>
> a)  have a separate thread make the LogService call (fed from a queue)
>      although you'd have to be careful not to introduce other deadlocks

So would it be acceptable to deliver log calls asynchronously? If so I
can probably make that change quickly ...

regards,

Karl

> b)  delay sending log messages from critical sections of the framework
>      ie. log to a buffer, then send the messages when it's safe to do so
>
>
> Regards
> > Felix
> >
> > Am Mittwoch, den 30.01.2008, 13:51 +0800 schrieb Niclas Hedhman:
> > > On Tuesday 29 January 2008 16:55, Karl Pauls wrote:
> > > > Could you have them retry using Felix 1.0.3? This might be related to
> > > > some of the concurrency things we fixed.
> > > >
> > > > In case they can not be bothered retrying on Felix 1.0.3 then maybe
> > > > they can provide a minimal config file that only uses publicly
> > > > available bundles and has this issue (then I can look into it).
> > >
> > > I am looking at the code in the trunk, and it appears that the locking
> > that
> > > triggers this is in place.
> > >
> > > As always, it is a bit tricky to setup threading problems. SO, I would
> > like to
> > > run a "head exercise" first.
> > >
> > > 1. The Starter thread "FelixStartLevel" locks the ModuleFactoryImpl
> > instance
> > > in R4SearchPolicyCore.resolve().
> > >
> > > 2. The Configuration Admin thread "Configuration Updater" calls the
> > LogService
> > > with the Configuration instance, which leads to Log4J locks on its own
> > > RootLogger and  calls ClassLoader.loadClass() on something found in the
> > > configuration. This leads to trying to acquire the ModuleFactoryImpl
> > lock
> > > either in R4SearchPolicyCore.resolve() or in the provided stack trace
> > > R4SearchPolicyCore.getInUseCandidates() due to a ClassNotFound in the
> > > previous step.
> > >
> > > 3. The "FelixStartLevel" thread reaches
> > > m_logger.log(Logger.LOG_DEBUG, "WIRE: " + wires[wireIdx]);
> > > in R4SearchPolicyCore.createWires() and he log() method will try to
> > acquire
> > > the RootLogger lock.
> > >
> > > 4. DEADLOCK.
> > >
> > >
> > > I agree this is very special to the LogService, since Felix binds to it
> > and
> > > uses it for its internal use, and the responsibility is across two
> > different
> > > systems. Suggestions are welcome.
> > >
> > >
> > > Cheers
> >
> >
>
>
> --
> Cheers, Stuart
>



-- 
Karl Pauls
karlpauls@gmail.com

Re: Deadlock in Felix

Posted by Karl Pauls <ka...@gmail.com>.
> On Monday 11 February 2008 16:17, Karl Pauls wrote:
> > However, do you (and others) think
> > it is acceptable to deliver log events asynchronously or at least
> > later then the actual point they occur?
>
> If you are only worried about Pax Logging case, then you need to do nothing.
> The locking in Log4J is severe enough for us to consider it "our problem" and
> I will probably later in the week do the "buffering" for you, which will be
> in proper order.
>
> I am not sure if there are any other Log Services out there, which will have
> similar problems.
>
> The broader concern probably still remains; Should the framework have any
> locks in place when calling outside itself, whether being log message, events
> or callbacks? The spec is pretty silent about it, and I have no clue how
> Felix measure up against the others on this point. But I think it is a valid
> question to ask both here and to the spec guys.

Well, we only hold locks in the case of log messages. For the other calls we
should be lock free .

regards,

Karl

> Cheers
> --
> Niclas Hedhman, Software Developer
>
> I  live here; http://tinyurl.com/2qq9er
> I  work here; http://tinyurl.com/2ymelc
> I relax here; http://tinyurl.com/2cgsug
>



-- 
Karl Pauls
karlpauls@gmail.com

Re: Deadlock in Felix

Posted by "Richard S. Hall" <he...@ungoverned.org>.
Niclas Hedhman wrote:
> The broader concern probably still remains; Should the framework have any 
> locks in place when calling outside itself, whether being log message, events 
> or callbacks? The spec is pretty silent about it, and I have no clue how 
> Felix measure up against the others on this point. But I think it is a valid 
> question to ask both here and to the spec guys.

No, it shouldn't. And I do think the spec mentions this somewhere. 
However, it might not always be possible to achieve. Still, we should 
review each callback point and verify its correctness.

-> richard

Re: Deadlock in Felix

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Monday 11 February 2008 16:17, Karl Pauls wrote:
> However, do you (and others) think
> it is acceptable to deliver log events asynchronously or at least
> later then the actual point they occur?

If you are only worried about Pax Logging case, then you need to do nothing. 
The locking in Log4J is severe enough for us to consider it "our problem" and 
I will probably later in the week do the "buffering" for you, which will be 
in proper order.

I am not sure if there are any other Log Services out there, which will have 
similar problems.

The broader concern probably still remains; Should the framework have any 
locks in place when calling outside itself, whether being log message, events 
or callbacks? The spec is pretty silent about it, and I have no clue how 
Felix measure up against the others on this point. But I think it is a valid 
question to ask both here and to the spec guys.


Cheers
-- 
Niclas Hedhman, Software Developer

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug

Re: Deadlock in Felix

Posted by Karl Pauls <ka...@gmail.com>.
On Feb 11, 2008 9:29 AM, Felix Meschberger <fm...@gmail.com> wrote:
> Hi,
>
> Am Montag, den 11.02.2008, 09:17 +0100 schrieb Karl Pauls:
> > > Later on, I think I'll change Pax Logging to move all incoming requests (log
> > > and config) to its own thread to prevent the whole issue all together.
> >
> > Good - so we don't have to hurry. However, do you (and others) think
> > it is acceptable to deliver log events asynchronously or at least
> > later then the actual point they occur? We have the eventdispatching
> > thread around anyways and I think it should be doable to use it to
> > deliver events. But I will only spend time looking into this if the
> > general opinion is that it is worthwhile ...
>
> I am not sure, whether this would be a good solution, because you would
> then have log messages from the application and log messages from the
> framework being out of think with regard to actuall order of processing.
> This could cause some headaches when trying to interpret the log files.
>
> In addition, I tend to think, that the way Log4J locks itself and by
> loading classes inside that lock potentially the complete system is kind
> of problematic. Not sure how NLog4J and Logback act in this respect.
>
> On the Felix side, we might come up with a solution where we may
> actively delay logging when inside some synchronized/locked code.
> Probably the only really problematic location for logging is inside the
> ClassLoader support. Could we gather messages there to be logged after
> releasing any locks ?

Sure, we could do that. Let's see whether there are other opinions
else I'll give it a shot.

regards,

Karl

> Regards
> Felix
>
>



-- 
Karl Pauls
karlpauls@gmail.com

Re: Deadlock in Felix

Posted by Felix Meschberger <fm...@gmail.com>.
Hi,

Am Montag, den 11.02.2008, 09:17 +0100 schrieb Karl Pauls:
> > Later on, I think I'll change Pax Logging to move all incoming requests (log
> > and config) to its own thread to prevent the whole issue all together.
> 
> Good - so we don't have to hurry. However, do you (and others) think
> it is acceptable to deliver log events asynchronously or at least
> later then the actual point they occur? We have the eventdispatching
> thread around anyways and I think it should be doable to use it to
> deliver events. But I will only spend time looking into this if the
> general opinion is that it is worthwhile ...

I am not sure, whether this would be a good solution, because you would
then have log messages from the application and log messages from the
framework being out of think with regard to actuall order of processing.
This could cause some headaches when trying to interpret the log files.

In addition, I tend to think, that the way Log4J locks itself and by
loading classes inside that lock potentially the complete system is kind
of problematic. Not sure how NLog4J and Logback act in this respect.

On the Felix side, we might come up with a solution where we may
actively delay logging when inside some synchronized/locked code.
Probably the only really problematic location for logging is inside the
ClassLoader support. Could we gather messages there to be logged after
releasing any locks ?

Regards
Felix


Re: Deadlock in Felix

Posted by Karl Pauls <ka...@gmail.com>.
> On Saturday 02 February 2008 08:20, Karl Pauls wrote:
>
> > So would it be acceptable to deliver log calls asynchronously? If so I
> > can probably make that change quickly ...
>
> For now, the immediate problem solved by changing framework loglevel to INFO.

Ok.

> Later on, I think I'll change Pax Logging to move all incoming requests (log
> and config) to its own thread to prevent the whole issue all together.

Good - so we don't have to hurry. However, do you (and others) think
it is acceptable to deliver log events asynchronously or at least
later then the actual point they occur? We have the eventdispatching
thread around anyways and I think it should be doable to use it to
deliver events. But I will only spend time looking into this if the
general opinion is that it is worthwhile ...

regards,

Karl

> Cheers
> --
> Niclas Hedhman, Software Developer
>
> I  live here; http://tinyurl.com/2qq9er
> I  work here; http://tinyurl.com/2ymelc
> I relax here; http://tinyurl.com/2cgsug
>



-- 
Karl Pauls
karlpauls@gmail.com

Re: Deadlock in Felix

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Saturday 02 February 2008 08:20, Karl Pauls wrote:

> So would it be acceptable to deliver log calls asynchronously? If so I
> can probably make that change quickly ...

For now, the immediate problem solved by changing framework loglevel to INFO.

Later on, I think I'll change Pax Logging to move all incoming requests (log 
and config) to its own thread to prevent the whole issue all together.


Cheers
-- 
Niclas Hedhman, Software Developer

I  live here; http://tinyurl.com/2qq9er
I  work here; http://tinyurl.com/2ymelc
I relax here; http://tinyurl.com/2cgsug