You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@logging.apache.org by Carter Kozak <ck...@ckozak.net> on 2020/12/06 23:16:09 UTC

LOG4J2-2965 deadlock brainstorming

Hi Friends,

I've been thinking about ways that we can solve LOG4J2-2965 which is a deadlock between JUL init and log4j init, both of which use synchronization. The crux of the issue is that JUL LogManager uses a synchronized block to initialize, which requests log4j-core initialization and log4j-core attempts to initialize the JUL LogManager if Disruptor is used. If One thread attempts to initialize JUL directly while another begins to initialize log4j via a non-jul path, we may deadlock. I have an example stack trace in the ticket.

This problem is specific to JUL due to the synchronization incurred while initializing our LogManager via java.util.logging.LogManager, so we could update our JUL implementation to lazily connect jul logger instances to the log4j framework. Instead of reaching out to the loggercontext when JUL loggers are requested, we can initialize the delegate log4j components upon the first interaction rather than when the logger is requested, decoupling JUL LogManager initialization from log4j init so it behaves like other bindings. We would need to be careful to track the original caller data to avoid associating JUL loggers with incorrect log4j contexts.

What do you think? Any alternative ideas or suggestions?

Thanks,
Carter

Re: LOG4J2-2965 deadlock brainstorming

Posted by Carter Kozak <ck...@ckozak.net>.
Hi Remko,

I had thought exactly the same thing until I drafted my email, but in doing so I came to the realization (hopefully correctly) that the problem is the direct result of jul.LogManager synchronizing initialization, so if we can avoid log4j initialization within the locked region and defer to a later point, the concurrency model will match our other API bindings and should be safe.
I don't think we can provide a mechanism to wait for initialization because JUL may be bootstrapped by log4j via disruptor, or it may be the component which results in log4j initialization, so I think the solutions which involve waiting can also deadlock. The tricky case is when thread logs using JUL an acquires the lock while another thread is initializing log4j, but blocks attempting to initialize Disruptor classes with static JUL loggers. I have a consistent minimal reproducer here: 
https://github.com/apache/logging-log4j2/pull/447

Carter

On Tue, Dec 8, 2020, at 18:44, Remko Popma wrote:
> Hi Carter,
> 
> I’m not sure if I understand the problem and your proposed solution fully
> but it sounds like it may reduce the probability of a deadlock occurring
> but not fully eliminate the possibility.
> 
> (Without looking at the code, away from pc),
> would it be an idea to expose a status (or perhaps we already have a
> life-cycle status) that components like JUL can query, perhaps with a
> notification (a countdown latch?) so that these components can wait and
> continue initialization when the core is completely initialized?
> 
> Remko
> 
> 
> On Mon, Dec 7, 2020 at 8:16 Carter Kozak <ck...@ckozak.net> wrote:
> 
> > Hi Friends,
> >
> > I've been thinking about ways that we can solve LOG4J2-2965 which is a
> > deadlock between JUL init and log4j init, both of which use
> > synchronization. The crux of the issue is that JUL LogManager uses a
> > synchronized block to initialize, which requests log4j-core initialization
> > and log4j-core attempts to initialize the JUL LogManager if Disruptor is
> > used. If One thread attempts to initialize JUL directly while another
> > begins to initialize log4j via a non-jul path, we may deadlock. I have an
> > example stack trace in the ticket.
> >
> > This problem is specific to JUL due to the synchronization incurred while
> > initializing our LogManager via java.util.logging.LogManager, so we could
> > update our JUL implementation to lazily connect jul logger instances to the
> > log4j framework. Instead of reaching out to the loggercontext when JUL
> > loggers are requested, we can initialize the delegate log4j components upon
> > the first interaction rather than when the logger is requested, decoupling
> > JUL LogManager initialization from log4j init so it behaves like other
> > bindings. We would need to be careful to track the original caller data to
> > avoid associating JUL loggers with incorrect log4j contexts.
> >
> > What do you think? Any alternative ideas or suggestions?
> >
> > Thanks,
> > Carter
> >
> 

Re: LOG4J2-2965 deadlock brainstorming

Posted by Remko Popma <re...@gmail.com>.
Hi Carter,

I’m not sure if I understand the problem and your proposed solution fully
but it sounds like it may reduce the probability of a deadlock occurring
but not fully eliminate the possibility.

(Without looking at the code, away from pc),
would it be an idea to expose a status (or perhaps we already have a
life-cycle status) that components like JUL can query, perhaps with a
notification (a countdown latch?) so that these components can wait and
continue initialization when the core is completely initialized?

Remko


On Mon, Dec 7, 2020 at 8:16 Carter Kozak <ck...@ckozak.net> wrote:

> Hi Friends,
>
> I've been thinking about ways that we can solve LOG4J2-2965 which is a
> deadlock between JUL init and log4j init, both of which use
> synchronization. The crux of the issue is that JUL LogManager uses a
> synchronized block to initialize, which requests log4j-core initialization
> and log4j-core attempts to initialize the JUL LogManager if Disruptor is
> used. If One thread attempts to initialize JUL directly while another
> begins to initialize log4j via a non-jul path, we may deadlock. I have an
> example stack trace in the ticket.
>
> This problem is specific to JUL due to the synchronization incurred while
> initializing our LogManager via java.util.logging.LogManager, so we could
> update our JUL implementation to lazily connect jul logger instances to the
> log4j framework. Instead of reaching out to the loggercontext when JUL
> loggers are requested, we can initialize the delegate log4j components upon
> the first interaction rather than when the logger is requested, decoupling
> JUL LogManager initialization from log4j init so it behaves like other
> bindings. We would need to be careful to track the original caller data to
> avoid associating JUL loggers with incorrect log4j contexts.
>
> What do you think? Any alternative ideas or suggestions?
>
> Thanks,
> Carter
>

Re: LOG4J2-2965 deadlock brainstorming

Posted by Matt Sicker <bo...@gmail.com>.
As one of the biggest tinkerers in log4j-jul myself, I have to say
that sounds like a decent idea. I haven't revisited my assumptions in
that module in a long time as I haven't had to integrate with JUL
lately (at least outside Jenkins, but there I'm using log4j-audit in a
plugin which doesn't interact with the JUL config), so I have no
objections to that.

On Sun, 6 Dec 2020 at 17:16, Carter Kozak <ck...@ckozak.net> wrote:
>
> Hi Friends,
>
> I've been thinking about ways that we can solve LOG4J2-2965 which is a deadlock between JUL init and log4j init, both of which use synchronization. The crux of the issue is that JUL LogManager uses a synchronized block to initialize, which requests log4j-core initialization and log4j-core attempts to initialize the JUL LogManager if Disruptor is used. If One thread attempts to initialize JUL directly while another begins to initialize log4j via a non-jul path, we may deadlock. I have an example stack trace in the ticket.
>
> This problem is specific to JUL due to the synchronization incurred while initializing our LogManager via java.util.logging.LogManager, so we could update our JUL implementation to lazily connect jul logger instances to the log4j framework. Instead of reaching out to the loggercontext when JUL loggers are requested, we can initialize the delegate log4j components upon the first interaction rather than when the logger is requested, decoupling JUL LogManager initialization from log4j init so it behaves like other bindings. We would need to be careful to track the original caller data to avoid associating JUL loggers with incorrect log4j contexts.
>
> What do you think? Any alternative ideas or suggestions?
>
> Thanks,
> Carter