You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@logging.apache.org by Volkan Yazıcı <vo...@gmail.com> on 2019/12/28 21:10:39 UTC

The convention for thread-local allocations (TLAs)

Hello,

I see that many Log4j components rely on TLA for efficiency reasons.
There I have two main concerns:

1. TLAs, for obvious reasons, entail an extra memory cost on
   threads, particularly, even when they are not used. Consider
   a thread invoking a Log4j flow where various TLAs are
   performed. That storage will keep on occupying the memory
   until the next time it gets used.

2. JEE servers are known to accommodate quite some threads,
   e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
   will will certainly have an impact on context switching costs too.

Is there a convention recommended in this domain? In
log4j2-logstash-layout, I favor object pools over TLAs. Object pools
offer better memory utilization with the cost of synchronization. I
see that many text-based Log4j 2.0 layouts rely on
ThreadLocal<StringBuilder>s. While it performs pretty good, I am
struggling with the idea of associating a large(?) text chunk next to
each thread who has at least once attempted to log something.

Best.

[1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes

Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
In the PR that I am working on JsonTemplateLayout, I have updated
ThreadLocalVsPoolBenchmark. I am struggling to explain the results[1].
I will appreciate some feedback here.

[1] https://github.com/apache/logging-log4j2/pull/335#issuecomment-579835151

On Sat, Dec 28, 2019 at 11:27 PM Ralph Goers <ra...@dslextreme.com> wrote:
>
> Many of the ThreadLocals were added in the effort to make much of Log4j be garbage free. As I recall Remko did some performance tests on various alternatives and what we have is a result of that. But that is just my recollection.
>
> Ralph
>
> > On Dec 28, 2019, at 2:39 PM, Volkan Yazıcı <vo...@gmail.com> wrote:
> >
> > Regarding log4j2.enableThreadlocals configuration knob, it feels more
> > like a hint rather than a hard restriction. See how
> > AbstractStringLayout#getStringBuilder() doesn't take that flag into
> > account.
> >
> > On Sat, Dec 28, 2019 at 10:10 PM Volkan Yazıcı <vo...@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> I see that many Log4j components rely on TLA for efficiency reasons.
> >> There I have two main concerns:
> >>
> >> 1. TLAs, for obvious reasons, entail an extra memory cost on
> >>   threads, particularly, even when they are not used. Consider
> >>   a thread invoking a Log4j flow where various TLAs are
> >>   performed. That storage will keep on occupying the memory
> >>   until the next time it gets used.
> >>
> >> 2. JEE servers are known to accommodate quite some threads,
> >>   e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
> >>   will will certainly have an impact on context switching costs too.
> >>
> >> Is there a convention recommended in this domain? In
> >> log4j2-logstash-layout, I favor object pools over TLAs. Object pools
> >> offer better memory utilization with the cost of synchronization. I
> >> see that many text-based Log4j 2.0 layouts rely on
> >> ThreadLocal<StringBuilder>s. While it performs pretty good, I am
> >> struggling with the idea of associating a large(?) text chunk next to
> >> each thread who has at least once attempted to log something.
> >>
> >> Best.
> >>
> >> [1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes
> >
>
>

Re: The convention for thread-local allocations (TLAs)

Posted by Ralph Goers <ra...@dslextreme.com>.
The requirement for core is, and will always be, that it can work without any additional libraries. In the places we use Thread Locals we would require that the Object Pooling logic be part of Log4j, not in another jar. 

Ralph


> On Jan 8, 2020, at 4:32 PM, Gary Gregory <ga...@gmail.com> wrote:
> 
> Can we please not invent yet another pooling mechanism at Apache? ;-) OTOH
> bringing in Apache Commons Pool might be too heavy handed.
> 
> For the two main server stacks I work on at my day job, it's not an issue
> since I already use Commons Pool. Actually, now that I think of it: Commons
> DBCP (which depend on Commons Pool) is already an optional dependency for
> the JDBC Appender...
> 
> Discuss! :-)
> 
> Gary
> 
> On Wed, Jan 8, 2020 at 5:17 PM Volkan Yazıcı <vo...@gmail.com>
> wrote:
> 
>> On Mon, Jan 6, 2020 at 7:09 PM Matt Sicker <bo...@gmail.com> wrote:
>>> Would it be useful to implement some sort of buffer pool for
>>> StringBuilders and ByteBuffers? Could likely copy code from netty's
>>> util library (ByteBuf et al.) or reuse stuff from commons-pool if
>>> needed. This would work properly in applications, servlets, and even
>>> reactive streams and lightweight threads later on.
>> 
>> Would you mind elaborating a little bit more on the idea you have in
>> mind, please? To get the discussion going, I have the following draft:
>> 
>>    public interface ObjectPool<V> {
>>        V acquire();
>>        void release(V instance);
>>    }
>> 
>>    public interface ObjectPoolFactory {
>>        // For char[], byte[], ByteBuffer, etc. (Types that cannot get
>> extended implicitly.)
>>        ObjectPool<V> create(Class<V> clazz, Supplier<V> objectFactory);
>>        // For StringBuilder, etc. (Types that can get extended
>> implicitly.)
>>        ObjectPool<V> share(Class<V> clazz);
>>    }
>> 
>> One can pretty efficiently implement such an ObjectPool<V>, e.g., via
>> a JCTools[1] MPMC queue. Though note that object pools necessitate
>> manual object release, which is not the case for TLA-provided objects.
>> 
>> Providing a shared object pool factory to the rest of the code base
>> might need a little bit more thinking.
>> 
>> Exposing configuration knobs to enforce certain limits on the pool and
>> its factory is also a question that needs to be addressed.
>> 
>> [1] https://github.com/JCTools/JCTools/
>> 



Re: The convention for thread-local allocations (TLAs)

Posted by Matt Sicker <bo...@gmail.com>.
I haven't thought too much about an interface yet, though having it
work in such a way as to still allow thread local values to be used as
one strategy would be a good design goal. Allowing for a more advanced
object pool implementation as some sort of optional dependency or
module could work there.

Your idea about using an MPMC queue for the object pool was
essentially what I was thinking, though we can likely just reuse
existing object pool implementations to avoid reinventing the wheel
here.

As for what the pool needs to allocate and manage, I'd expect that to
either be Java-level buffer classes like StringBuilder, ByteBuffer,
and potentially CharBuffer, or some sort of abstraction around those
classes (e.g., we have ByteBufferDestination and
StringBuilderFormattable from what I recall).

A nice bonus of the resource pooling idea is that asynchronous
appenders could still work in garbage free mode without overloading
the system with OOM errors or having overly complicated book-keeping
code.

On Thu, 9 Jan 2020 at 07:22, Volkan Yazıcı <vo...@gmail.com> wrote:
>
> On Thu, Jan 9, 2020 at 12:33 AM Gary Gregory <ga...@gmail.com> wrote:
> > Can we please not invent yet another pooling mechanism at Apache?
>
> That definitely was not my intention. I was just suggesting borrowing
> code (not dependency!) from certain projects to get us going, since,
> to the best of my knowledge, log4j-core is thriving to stay dependency
> free. (@Ralph has made the very same remark as well.)
>
> In any case, I still did not hear anything about the *interface* we
> want to make accessible within the Log4j code base. (@Matt?) How am I
> gonna claim a char[] of certain length? What about a StringBuilder?
> How are we gonna introduce control knobs to tune pool size(s)?
>
> In LogstashLayout, I dealt with these issues in an unpleasant way:
> each pool had its own configuration knob. While this worked okayish
> for LogstashLayout -- since I needed ~3 pools in total -- I am pretty
> sure it will turn into a configuration nightmare for bigger code
> bases, e.g., log4j-core.



-- 
Matt Sicker <bo...@gmail.com>

Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
On Thu, Jan 9, 2020 at 12:33 AM Gary Gregory <ga...@gmail.com> wrote:
> Can we please not invent yet another pooling mechanism at Apache?

That definitely was not my intention. I was just suggesting borrowing
code (not dependency!) from certain projects to get us going, since,
to the best of my knowledge, log4j-core is thriving to stay dependency
free. (@Ralph has made the very same remark as well.)

In any case, I still did not hear anything about the *interface* we
want to make accessible within the Log4j code base. (@Matt?) How am I
gonna claim a char[] of certain length? What about a StringBuilder?
How are we gonna introduce control knobs to tune pool size(s)?

In LogstashLayout, I dealt with these issues in an unpleasant way:
each pool had its own configuration knob. While this worked okayish
for LogstashLayout -- since I needed ~3 pools in total -- I am pretty
sure it will turn into a configuration nightmare for bigger code
bases, e.g., log4j-core.

Re: The convention for thread-local allocations (TLAs)

Posted by Gary Gregory <ga...@gmail.com>.
Can we please not invent yet another pooling mechanism at Apache? ;-) OTOH
bringing in Apache Commons Pool might be too heavy handed.

For the two main server stacks I work on at my day job, it's not an issue
since I already use Commons Pool. Actually, now that I think of it: Commons
DBCP (which depend on Commons Pool) is already an optional dependency for
the JDBC Appender...

Discuss! :-)

Gary

On Wed, Jan 8, 2020 at 5:17 PM Volkan Yazıcı <vo...@gmail.com>
wrote:

> On Mon, Jan 6, 2020 at 7:09 PM Matt Sicker <bo...@gmail.com> wrote:
> > Would it be useful to implement some sort of buffer pool for
> > StringBuilders and ByteBuffers? Could likely copy code from netty's
> > util library (ByteBuf et al.) or reuse stuff from commons-pool if
> > needed. This would work properly in applications, servlets, and even
> > reactive streams and lightweight threads later on.
>
> Would you mind elaborating a little bit more on the idea you have in
> mind, please? To get the discussion going, I have the following draft:
>
>     public interface ObjectPool<V> {
>         V acquire();
>         void release(V instance);
>     }
>
>     public interface ObjectPoolFactory {
>         // For char[], byte[], ByteBuffer, etc. (Types that cannot get
> extended implicitly.)
>         ObjectPool<V> create(Class<V> clazz, Supplier<V> objectFactory);
>         // For StringBuilder, etc. (Types that can get extended
> implicitly.)
>         ObjectPool<V> share(Class<V> clazz);
>     }
>
> One can pretty efficiently implement such an ObjectPool<V>, e.g., via
> a JCTools[1] MPMC queue. Though note that object pools necessitate
> manual object release, which is not the case for TLA-provided objects.
>
> Providing a shared object pool factory to the rest of the code base
> might need a little bit more thinking.
>
> Exposing configuration knobs to enforce certain limits on the pool and
> its factory is also a question that needs to be addressed.
>
> [1] https://github.com/JCTools/JCTools/
>

Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
On Mon, Jan 6, 2020 at 7:09 PM Matt Sicker <bo...@gmail.com> wrote:
> Would it be useful to implement some sort of buffer pool for
> StringBuilders and ByteBuffers? Could likely copy code from netty's
> util library (ByteBuf et al.) or reuse stuff from commons-pool if
> needed. This would work properly in applications, servlets, and even
> reactive streams and lightweight threads later on.

Would you mind elaborating a little bit more on the idea you have in
mind, please? To get the discussion going, I have the following draft:

    public interface ObjectPool<V> {
        V acquire();
        void release(V instance);
    }

    public interface ObjectPoolFactory {
        // For char[], byte[], ByteBuffer, etc. (Types that cannot get
extended implicitly.)
        ObjectPool<V> create(Class<V> clazz, Supplier<V> objectFactory);
        // For StringBuilder, etc. (Types that can get extended implicitly.)
        ObjectPool<V> share(Class<V> clazz);
    }

One can pretty efficiently implement such an ObjectPool<V>, e.g., via
a JCTools[1] MPMC queue. Though note that object pools necessitate
manual object release, which is not the case for TLA-provided objects.

Providing a shared object pool factory to the rest of the code base
might need a little bit more thinking.

Exposing configuration knobs to enforce certain limits on the pool and
its factory is also a question that needs to be addressed.

[1] https://github.com/JCTools/JCTools/

Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
On Tue, Jan 7, 2020 at 12:36 AM Ralph Goers <ra...@dslextreme.com> wrote:
> I am not against making a change but I would need to see performance results before it gets merged.

Unless my knowledge on the JVM is fundamentally wrong, no shared data
structure can beat a TLA in terms of access costs. But at which point
synchronization costs of a shared data structure outweigh its
advantage -- predictable (and user configurable) memory footprint? I
don't think anybody would be willing to realize such a change without
convincing benchmark figures, hence agreeing with your comment.

Re: The convention for thread-local allocations (TLAs)

Posted by Ralph Goers <ra...@dslextreme.com>.
I am not against making a change but I would need to see performance results before it gets merged.

Ralph

> On Jan 6, 2020, at 2:20 PM, Carter Kozak <ck...@ckozak.net> wrote:
> 
> Let's give it a try and see how we like it :-) I probably won't have time to prototype it for a month or two, but I think we could gain performance by passing top level objects around (including between threads) instead of copying data between buffer and objects on heap.
> 
> -ck
> 
> On Mon, Jan 6, 2020, at 14:17, Matt Sicker wrote:
>> It seems like it'd be more flexible to tune, too. The current
>> ThreadLocal approach scales uncontrollably with the number of threads
>> as it is.
>> 
>> On Mon, 6 Jan 2020 at 12:20, Gary Gregory <ga...@gmail.com> wrote:
>>> 
>>> I like the idea of centrally controlling these objects. This should make
>>> resource monitoring easier as well.
>>> 
>>> Gary
>>> 
>>> On Mon, Jan 6, 2020, 13:09 Matt Sicker <bo...@gmail.com> wrote:
>>> 
>>>> Would it be useful to implement some sort of buffer pool for
>>>> StringBuilders and ByteBuffers? Could likely copy code from netty's
>>>> util library (ByteBuf et al.) or reuse stuff from commons-pool if
>>>> needed. This would work properly in applications, servlets, and even
>>>> reactive streams and lightweight threads later on.
>>>> 
>>>> On Tue, 31 Dec 2019 at 03:22, Volkan Yazıcı <vo...@gmail.com>
>>>> wrote:
>>>>> 
>>>>> On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
>>>>>> Beyond StringBuilder instances, we attempt to clear references
>>>>>> from all thread local references to avoid substantial overhead. In
>>>>>> practice this works nicely because it reinforces java performance
>>>>>> characteristics. Java threads are fairly memory expensive (not to
>>>>>> mention the cost of initialization) so the threadlocal object overhead
>>>>>> from log4j tends to be inconsequential by comparison. Applications
>>>>>> in memory constrained environments already have relatively few
>>>>>> threads, and applications which constantly create and destroy threads
>>>>>> tend not to worry about the performance of creating log events
>>>>>> because it's inexpensive compared to thread initialization.
>>>>>> 
>>>>>> Have you observed a problem? We've found and resolved a few issues
>>>>>> over the last year or so where references were held longer than
>>>>>> expected. If you're aware of places we're using more memory than we
>>>>>> should, please file a ticket :-)
>>>>> 
>>>>> AFAIC, the only TLA in Log4j 2.0 core violating
>>>>> log4j2.enableThreadlocals flag is
>>>>> AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
>>>>> used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
>>>>> and external (ECS) layouts, the fix will incur a significant
>>>>> performance penalty. I wouldn't be surprised if we start receiving
>>>>> performance regression bug reports from users after releasing such a
>>>>> fix, since a notable amount of Log4j 2.0 users, to the best of my
>>>>> knowledge, are using it in JEE context (e.g., Spring WebMvc with
>>>>> Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
>>>>> present IS_WEB_APP condition:
>>>>> 
>>>>> o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
>>>>> !IS_WEB_APP &&
>>>>> PropertiesUtil
>>>>> .getProperties()
>>>>> .getBooleanProperty("log4j2.enable.threadlocals", true);
>>>>> 
>>>>> Created LOG4J2-2753[1] for this issue.
>>>>> 
>>>>> The reason I started the discussion is, in log4j2-logstash-layout, I
>>>>> am aiming for the fastest approach, always. The performance comparison
>>>>> is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
>>>>> JsonLayout, etc.) are fine tuned for fairness. There I try to play
>>>>> fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
>>>>> that; they perform TLA without taking log4j2.enableThreadlocals flag
>>>>> into account. This leads me to questionthe presence of such a flag in
>>>>> the first place. Why don't we just remove the
>>>>> log4j2.enableThreadlocals flag? What are its use cases?
>>>>> 
>>>>> [1] https://issues.apache.org/jira/browse/LOG4J2-2753
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Matt Sicker <bo...@gmail.com>
>>>> 
>> 
>> 
>> 
>> -- 
>> Matt Sicker <bo...@gmail.com>
>> 



Re: The convention for thread-local allocations (TLAs)

Posted by Carter Kozak <ck...@ckozak.net>.
Let's give it a try and see how we like it :-) I probably won't have time to prototype it for a month or two, but I think we could gain performance by passing top level objects around (including between threads) instead of copying data between buffer and objects on heap.

-ck

On Mon, Jan 6, 2020, at 14:17, Matt Sicker wrote:
> It seems like it'd be more flexible to tune, too. The current
> ThreadLocal approach scales uncontrollably with the number of threads
> as it is.
> 
> On Mon, 6 Jan 2020 at 12:20, Gary Gregory <ga...@gmail.com> wrote:
> >
> > I like the idea of centrally controlling these objects. This should make
> > resource monitoring easier as well.
> >
> > Gary
> >
> > On Mon, Jan 6, 2020, 13:09 Matt Sicker <bo...@gmail.com> wrote:
> >
> > > Would it be useful to implement some sort of buffer pool for
> > > StringBuilders and ByteBuffers? Could likely copy code from netty's
> > > util library (ByteBuf et al.) or reuse stuff from commons-pool if
> > > needed. This would work properly in applications, servlets, and even
> > > reactive streams and lightweight threads later on.
> > >
> > > On Tue, 31 Dec 2019 at 03:22, Volkan Yazıcı <vo...@gmail.com>
> > > wrote:
> > > >
> > > > On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
> > > > > Beyond StringBuilder instances, we attempt to clear references
> > > > > from all thread local references to avoid substantial overhead. In
> > > > > practice this works nicely because it reinforces java performance
> > > > > characteristics. Java threads are fairly memory expensive (not to
> > > > > mention the cost of initialization) so the threadlocal object overhead
> > > > > from log4j tends to be inconsequential by comparison. Applications
> > > > > in memory constrained environments already have relatively few
> > > > > threads, and applications which constantly create and destroy threads
> > > > > tend not to worry about the performance of creating log events
> > > > > because it's inexpensive compared to thread initialization.
> > > > >
> > > > > Have you observed a problem? We've found and resolved a few issues
> > > > > over the last year or so where references were held longer than
> > > > > expected. If you're aware of places we're using more memory than we
> > > > > should, please file a ticket :-)
> > > >
> > > > AFAIC, the only TLA in Log4j 2.0 core violating
> > > > log4j2.enableThreadlocals flag is
> > > > AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
> > > > used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
> > > > and external (ECS) layouts, the fix will incur a significant
> > > > performance penalty. I wouldn't be surprised if we start receiving
> > > > performance regression bug reports from users after releasing such a
> > > > fix, since a notable amount of Log4j 2.0 users, to the best of my
> > > > knowledge, are using it in JEE context (e.g., Spring WebMvc with
> > > > Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
> > > > present IS_WEB_APP condition:
> > > >
> > > > o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
> > > > !IS_WEB_APP &&
> > > > PropertiesUtil
> > > > .getProperties()
> > > > .getBooleanProperty("log4j2.enable.threadlocals", true);
> > > >
> > > > Created LOG4J2-2753[1] for this issue.
> > > >
> > > > The reason I started the discussion is, in log4j2-logstash-layout, I
> > > > am aiming for the fastest approach, always. The performance comparison
> > > > is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
> > > > JsonLayout, etc.) are fine tuned for fairness. There I try to play
> > > > fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
> > > > that; they perform TLA without taking log4j2.enableThreadlocals flag
> > > > into account. This leads me to questionthe presence of such a flag in
> > > > the first place. Why don't we just remove the
> > > > log4j2.enableThreadlocals flag? What are its use cases?
> > > >
> > > > [1] https://issues.apache.org/jira/browse/LOG4J2-2753
> > >
> > >
> > >
> > > --
> > > Matt Sicker <bo...@gmail.com>
> > >
> 
> 
> 
> -- 
> Matt Sicker <bo...@gmail.com>
> 

Re: The convention for thread-local allocations (TLAs)

Posted by Matt Sicker <bo...@gmail.com>.
It seems like it'd be more flexible to tune, too. The current
ThreadLocal approach scales uncontrollably with the number of threads
as it is.

On Mon, 6 Jan 2020 at 12:20, Gary Gregory <ga...@gmail.com> wrote:
>
> I like the idea of centrally controlling these objects. This should make
> resource monitoring easier as well.
>
> Gary
>
> On Mon, Jan 6, 2020, 13:09 Matt Sicker <bo...@gmail.com> wrote:
>
> > Would it be useful to implement some sort of buffer pool for
> > StringBuilders and ByteBuffers? Could likely copy code from netty's
> > util library (ByteBuf et al.) or reuse stuff from commons-pool if
> > needed. This would work properly in applications, servlets, and even
> > reactive streams and lightweight threads later on.
> >
> > On Tue, 31 Dec 2019 at 03:22, Volkan Yazıcı <vo...@gmail.com>
> > wrote:
> > >
> > > On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
> > > > Beyond StringBuilder instances, we attempt to clear references
> > > > from all thread local references to avoid substantial overhead. In
> > > > practice this works nicely because it reinforces java performance
> > > > characteristics. Java threads are fairly memory expensive (not to
> > > > mention the cost of initialization) so the threadlocal object overhead
> > > > from log4j tends to be inconsequential by comparison. Applications
> > > > in memory constrained environments already have relatively few
> > > > threads, and applications which constantly create and destroy threads
> > > > tend not to worry about the performance of creating log events
> > > > because it's inexpensive compared to thread initialization.
> > > >
> > > > Have you observed a problem? We've found and resolved a few issues
> > > > over the last year or so where references were held longer than
> > > > expected. If you're aware of places we're using more memory than we
> > > > should, please file a ticket :-)
> > >
> > > AFAIC, the only TLA in Log4j 2.0 core violating
> > > log4j2.enableThreadlocals flag is
> > > AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
> > > used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
> > > and external (ECS) layouts, the fix will incur a significant
> > > performance penalty. I wouldn't be surprised if we start receiving
> > > performance regression bug reports from users after releasing such a
> > > fix, since a notable amount of Log4j 2.0 users, to the best of my
> > > knowledge, are using it in JEE context (e.g., Spring WebMvc with
> > > Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
> > > present IS_WEB_APP condition:
> > >
> > > o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
> > >         !IS_WEB_APP &&
> > >         PropertiesUtil
> > >                 .getProperties()
> > >                 .getBooleanProperty("log4j2.enable.threadlocals", true);
> > >
> > > Created LOG4J2-2753[1] for this issue.
> > >
> > > The reason I started the discussion is, in log4j2-logstash-layout, I
> > > am aiming for the fastest approach, always. The performance comparison
> > > is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
> > > JsonLayout, etc.) are fine tuned for fairness. There I try to play
> > > fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
> > > that; they perform TLA without taking log4j2.enableThreadlocals flag
> > > into account. This leads me to questionthe presence of such a flag in
> > > the first place. Why don't we just remove the
> > > log4j2.enableThreadlocals flag? What are its use cases?
> > >
> > > [1] https://issues.apache.org/jira/browse/LOG4J2-2753
> >
> >
> >
> > --
> > Matt Sicker <bo...@gmail.com>
> >



-- 
Matt Sicker <bo...@gmail.com>

Re: The convention for thread-local allocations (TLAs)

Posted by Gary Gregory <ga...@gmail.com>.
I like the idea of centrally controlling these objects. This should make
resource monitoring easier as well.

Gary

On Mon, Jan 6, 2020, 13:09 Matt Sicker <bo...@gmail.com> wrote:

> Would it be useful to implement some sort of buffer pool for
> StringBuilders and ByteBuffers? Could likely copy code from netty's
> util library (ByteBuf et al.) or reuse stuff from commons-pool if
> needed. This would work properly in applications, servlets, and even
> reactive streams and lightweight threads later on.
>
> On Tue, 31 Dec 2019 at 03:22, Volkan Yazıcı <vo...@gmail.com>
> wrote:
> >
> > On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
> > > Beyond StringBuilder instances, we attempt to clear references
> > > from all thread local references to avoid substantial overhead. In
> > > practice this works nicely because it reinforces java performance
> > > characteristics. Java threads are fairly memory expensive (not to
> > > mention the cost of initialization) so the threadlocal object overhead
> > > from log4j tends to be inconsequential by comparison. Applications
> > > in memory constrained environments already have relatively few
> > > threads, and applications which constantly create and destroy threads
> > > tend not to worry about the performance of creating log events
> > > because it's inexpensive compared to thread initialization.
> > >
> > > Have you observed a problem? We've found and resolved a few issues
> > > over the last year or so where references were held longer than
> > > expected. If you're aware of places we're using more memory than we
> > > should, please file a ticket :-)
> >
> > AFAIC, the only TLA in Log4j 2.0 core violating
> > log4j2.enableThreadlocals flag is
> > AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
> > used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
> > and external (ECS) layouts, the fix will incur a significant
> > performance penalty. I wouldn't be surprised if we start receiving
> > performance regression bug reports from users after releasing such a
> > fix, since a notable amount of Log4j 2.0 users, to the best of my
> > knowledge, are using it in JEE context (e.g., Spring WebMvc with
> > Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
> > present IS_WEB_APP condition:
> >
> > o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
> >         !IS_WEB_APP &&
> >         PropertiesUtil
> >                 .getProperties()
> >                 .getBooleanProperty("log4j2.enable.threadlocals", true);
> >
> > Created LOG4J2-2753[1] for this issue.
> >
> > The reason I started the discussion is, in log4j2-logstash-layout, I
> > am aiming for the fastest approach, always. The performance comparison
> > is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
> > JsonLayout, etc.) are fine tuned for fairness. There I try to play
> > fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
> > that; they perform TLA without taking log4j2.enableThreadlocals flag
> > into account. This leads me to questionthe presence of such a flag in
> > the first place. Why don't we just remove the
> > log4j2.enableThreadlocals flag? What are its use cases?
> >
> > [1] https://issues.apache.org/jira/browse/LOG4J2-2753
>
>
>
> --
> Matt Sicker <bo...@gmail.com>
>

Re: The convention for thread-local allocations (TLAs)

Posted by Matt Sicker <bo...@gmail.com>.
Would it be useful to implement some sort of buffer pool for
StringBuilders and ByteBuffers? Could likely copy code from netty's
util library (ByteBuf et al.) or reuse stuff from commons-pool if
needed. This would work properly in applications, servlets, and even
reactive streams and lightweight threads later on.

On Tue, 31 Dec 2019 at 03:22, Volkan Yazıcı <vo...@gmail.com> wrote:
>
> On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
> > Beyond StringBuilder instances, we attempt to clear references
> > from all thread local references to avoid substantial overhead. In
> > practice this works nicely because it reinforces java performance
> > characteristics. Java threads are fairly memory expensive (not to
> > mention the cost of initialization) so the threadlocal object overhead
> > from log4j tends to be inconsequential by comparison. Applications
> > in memory constrained environments already have relatively few
> > threads, and applications which constantly create and destroy threads
> > tend not to worry about the performance of creating log events
> > because it's inexpensive compared to thread initialization.
> >
> > Have you observed a problem? We've found and resolved a few issues
> > over the last year or so where references were held longer than
> > expected. If you're aware of places we're using more memory than we
> > should, please file a ticket :-)
>
> AFAIC, the only TLA in Log4j 2.0 core violating
> log4j2.enableThreadlocals flag is
> AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
> used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
> and external (ECS) layouts, the fix will incur a significant
> performance penalty. I wouldn't be surprised if we start receiving
> performance regression bug reports from users after releasing such a
> fix, since a notable amount of Log4j 2.0 users, to the best of my
> knowledge, are using it in JEE context (e.g., Spring WebMvc with
> Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
> present IS_WEB_APP condition:
>
> o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
>         !IS_WEB_APP &&
>         PropertiesUtil
>                 .getProperties()
>                 .getBooleanProperty("log4j2.enable.threadlocals", true);
>
> Created LOG4J2-2753[1] for this issue.
>
> The reason I started the discussion is, in log4j2-logstash-layout, I
> am aiming for the fastest approach, always. The performance comparison
> is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
> JsonLayout, etc.) are fine tuned for fairness. There I try to play
> fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
> that; they perform TLA without taking log4j2.enableThreadlocals flag
> into account. This leads me to questionthe presence of such a flag in
> the first place. Why don't we just remove the
> log4j2.enableThreadlocals flag? What are its use cases?
>
> [1] https://issues.apache.org/jira/browse/LOG4J2-2753



-- 
Matt Sicker <bo...@gmail.com>

Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
On Mon, Dec 30, 2019 at 10:15 PM Carter Kozak <ck...@ckozak.net> wrote:
> Beyond StringBuilder instances, we attempt to clear references
> from all thread local references to avoid substantial overhead. In
> practice this works nicely because it reinforces java performance
> characteristics. Java threads are fairly memory expensive (not to
> mention the cost of initialization) so the threadlocal object overhead
> from log4j tends to be inconsequential by comparison. Applications
> in memory constrained environments already have relatively few
> threads, and applications which constantly create and destroy threads
> tend not to worry about the performance of creating log events
> because it's inexpensive compared to thread initialization.
>
> Have you observed a problem? We've found and resolved a few issues
> over the last year or so where references were held longer than
> expected. If you're aware of places we're using more memory than we
> should, please file a ticket :-)

AFAIC, the only TLA in Log4j 2.0 core violating
log4j2.enableThreadlocals flag is
AbstractStringLayout#getStringBuilder(). Given AbstractStringLayout is
used by many internal (HTML, XML, JSON, YAML, Pattern, Gelf, Syslog)
and external (ECS) layouts, the fix will incur a significant
performance penalty. I wouldn't be surprised if we start receiving
performance regression bug reports from users after releasing such a
fix, since a notable amount of Log4j 2.0 users, to the best of my
knowledge, are using it in JEE context (e.g., Spring WebMvc with
Tomcat backend) where ENABLE_THREADLOCALS are disabled due to the
present IS_WEB_APP condition:

o.a.l.l.u.Constants.ENABLE_THREADLOCALS =
        !IS_WEB_APP &&
        PropertiesUtil
                .getProperties()
                .getBooleanProperty("log4j2.enable.threadlocals", true);

Created LOG4J2-2753[1] for this issue.

The reason I started the discussion is, in log4j2-logstash-layout, I
am aiming for the fastest approach, always. The performance comparison
is JMH-driven, where all competitors (LogstashLayout, EcsLayout,
JsonLayout, etc.) are fine tuned for fairness. There I try to play
fair, but neither Log4j 2.0 JsonLayout, nor EcsLayout of Elastic does
that; they perform TLA without taking log4j2.enableThreadlocals flag
into account. This leads me to questionthe presence of such a flag in
the first place. Why don't we just remove the
log4j2.enableThreadlocals flag? What are its use cases?

[1] https://issues.apache.org/jira/browse/LOG4J2-2753

Re: The convention for thread-local allocations (TLAs)

Posted by Carter Kozak <ck...@ckozak.net>.
Hi Volkan,

Beyond StringBuilder instances, we attempt to clear references from all thread local references to avoid substantial overhead. In practice this works nicely because it reinforces java performance characteristics. Java threads are fairly memory expensive (not to mention the cost of initialization) so the threadlocal object overhead from log4j tends to be inconsequential by comparison. Applications in memory constrained environments already have relatively few threads, and applications which constantly create and destroy threads tend not to worry about the performance of creating log events because it's inexpensive compared to thread initialization.
Have you observed a problem? We've found and resolved a few issues over the last year or so where references were held longer than expected. If you're aware of places we're using more memory than we should, please file a ticket :-)

-ck

On Sat, Dec 28, 2019, at 18:13, Remko Popma wrote:
> Ralph is correct.
> When running in a servlet environment (J2EE), Log4j2 automatically
> does not use ThreadLocals because of the problems associated with
> threadpools in those environments.
> This is documented in the manual:
> https://logging.apache.org/log4j/2.x/manual/garbagefree.html#Config
> 
> In addition, we did some work to ensure that the StringBuilders in
> ThreadLocals do not grow beyond some configurable maximum size to
> ensure that memory consumption remains reasonable.
> 
> I don't remember the reason why AbstractStringLayout#getStringBuilder
> does not check log4j2.enableThreadlocals. This may be an oversight, I
> am not sure. I will take another look when I have time.
> 
> Remko.
> 
> On Sun, Dec 29, 2019 at 7:27 AM Ralph Goers <ra...@dslextreme.com> wrote:
> >
> > Many of the ThreadLocals were added in the effort to make much of Log4j be garbage free. As I recall Remko did some performance tests on various alternatives and what we have is a result of that. But that is just my recollection.
> >
> > Ralph
> >
> > > On Dec 28, 2019, at 2:39 PM, Volkan Yazıcı <vo...@gmail.com> wrote:
> > >
> > > Regarding log4j2.enableThreadlocals configuration knob, it feels more
> > > like a hint rather than a hard restriction. See how
> > > AbstractStringLayout#getStringBuilder() doesn't take that flag into
> > > account.
> > >
> > > On Sat, Dec 28, 2019 at 10:10 PM Volkan Yazıcı <vo...@gmail.com> wrote:
> > >>
> > >> Hello,
> > >>
> > >> I see that many Log4j components rely on TLA for efficiency reasons.
> > >> There I have two main concerns:
> > >>
> > >> 1. TLAs, for obvious reasons, entail an extra memory cost on
> > >> threads, particularly, even when they are not used. Consider
> > >> a thread invoking a Log4j flow where various TLAs are
> > >> performed. That storage will keep on occupying the memory
> > >> until the next time it gets used.
> > >>
> > >> 2. JEE servers are known to accommodate quite some threads,
> > >> e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
> > >> will will certainly have an impact on context switching costs too.
> > >>
> > >> Is there a convention recommended in this domain? In
> > >> log4j2-logstash-layout, I favor object pools over TLAs. Object pools
> > >> offer better memory utilization with the cost of synchronization. I
> > >> see that many text-based Log4j 2.0 layouts rely on
> > >> ThreadLocal<StringBuilder>s. While it performs pretty good, I am
> > >> struggling with the idea of associating a large(?) text chunk next to
> > >> each thread who has at least once attempted to log something.
> > >>
> > >> Best.
> > >>
> > >> [1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes
> > >
> >
> >
> 

Re: The convention for thread-local allocations (TLAs)

Posted by Remko Popma <re...@gmail.com>.
Ralph is correct.
When running in a servlet environment (J2EE), Log4j2 automatically
does not use ThreadLocals because of the problems associated with
threadpools in those environments.
This is documented in the manual:
https://logging.apache.org/log4j/2.x/manual/garbagefree.html#Config

In addition, we did some work to ensure that the StringBuilders in
ThreadLocals do not grow beyond some configurable maximum size to
ensure that memory consumption remains reasonable.

I don't remember the reason why AbstractStringLayout#getStringBuilder
does not check log4j2.enableThreadlocals. This may be an oversight, I
am not sure. I will take another look when I have time.

Remko.

On Sun, Dec 29, 2019 at 7:27 AM Ralph Goers <ra...@dslextreme.com> wrote:
>
> Many of the ThreadLocals were added in the effort to make much of Log4j be garbage free. As I recall Remko did some performance tests on various alternatives and what we have is a result of that. But that is just my recollection.
>
> Ralph
>
> > On Dec 28, 2019, at 2:39 PM, Volkan Yazıcı <vo...@gmail.com> wrote:
> >
> > Regarding log4j2.enableThreadlocals configuration knob, it feels more
> > like a hint rather than a hard restriction. See how
> > AbstractStringLayout#getStringBuilder() doesn't take that flag into
> > account.
> >
> > On Sat, Dec 28, 2019 at 10:10 PM Volkan Yazıcı <vo...@gmail.com> wrote:
> >>
> >> Hello,
> >>
> >> I see that many Log4j components rely on TLA for efficiency reasons.
> >> There I have two main concerns:
> >>
> >> 1. TLAs, for obvious reasons, entail an extra memory cost on
> >>   threads, particularly, even when they are not used. Consider
> >>   a thread invoking a Log4j flow where various TLAs are
> >>   performed. That storage will keep on occupying the memory
> >>   until the next time it gets used.
> >>
> >> 2. JEE servers are known to accommodate quite some threads,
> >>   e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
> >>   will will certainly have an impact on context switching costs too.
> >>
> >> Is there a convention recommended in this domain? In
> >> log4j2-logstash-layout, I favor object pools over TLAs. Object pools
> >> offer better memory utilization with the cost of synchronization. I
> >> see that many text-based Log4j 2.0 layouts rely on
> >> ThreadLocal<StringBuilder>s. While it performs pretty good, I am
> >> struggling with the idea of associating a large(?) text chunk next to
> >> each thread who has at least once attempted to log something.
> >>
> >> Best.
> >>
> >> [1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes
> >
>
>

Re: The convention for thread-local allocations (TLAs)

Posted by Ralph Goers <ra...@dslextreme.com>.
Many of the ThreadLocals were added in the effort to make much of Log4j be garbage free. As I recall Remko did some performance tests on various alternatives and what we have is a result of that. But that is just my recollection.  

Ralph

> On Dec 28, 2019, at 2:39 PM, Volkan Yazıcı <vo...@gmail.com> wrote:
> 
> Regarding log4j2.enableThreadlocals configuration knob, it feels more
> like a hint rather than a hard restriction. See how
> AbstractStringLayout#getStringBuilder() doesn't take that flag into
> account.
> 
> On Sat, Dec 28, 2019 at 10:10 PM Volkan Yazıcı <vo...@gmail.com> wrote:
>> 
>> Hello,
>> 
>> I see that many Log4j components rely on TLA for efficiency reasons.
>> There I have two main concerns:
>> 
>> 1. TLAs, for obvious reasons, entail an extra memory cost on
>>   threads, particularly, even when they are not used. Consider
>>   a thread invoking a Log4j flow where various TLAs are
>>   performed. That storage will keep on occupying the memory
>>   until the next time it gets used.
>> 
>> 2. JEE servers are known to accommodate quite some threads,
>>   e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
>>   will will certainly have an impact on context switching costs too.
>> 
>> Is there a convention recommended in this domain? In
>> log4j2-logstash-layout, I favor object pools over TLAs. Object pools
>> offer better memory utilization with the cost of synchronization. I
>> see that many text-based Log4j 2.0 layouts rely on
>> ThreadLocal<StringBuilder>s. While it performs pretty good, I am
>> struggling with the idea of associating a large(?) text chunk next to
>> each thread who has at least once attempted to log something.
>> 
>> Best.
>> 
>> [1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes
> 



Re: The convention for thread-local allocations (TLAs)

Posted by Volkan Yazıcı <vo...@gmail.com>.
Regarding log4j2.enableThreadlocals configuration knob, it feels more
like a hint rather than a hard restriction. See how
AbstractStringLayout#getStringBuilder() doesn't take that flag into
account.

On Sat, Dec 28, 2019 at 10:10 PM Volkan Yazıcı <vo...@gmail.com> wrote:
>
> Hello,
>
> I see that many Log4j components rely on TLA for efficiency reasons.
> There I have two main concerns:
>
> 1. TLAs, for obvious reasons, entail an extra memory cost on
>    threads, particularly, even when they are not used. Consider
>    a thread invoking a Log4j flow where various TLAs are
>    performed. That storage will keep on occupying the memory
>    until the next time it gets used.
>
> 2. JEE servers are known to accommodate quite some threads,
>    e.g., the the default thread count is 200 for Tomcat[1]. TLAs at
>    will will certainly have an impact on context switching costs too.
>
> Is there a convention recommended in this domain? In
> log4j2-logstash-layout, I favor object pools over TLAs. Object pools
> offer better memory utilization with the cost of synchronization. I
> see that many text-based Log4j 2.0 layouts rely on
> ThreadLocal<StringBuilder>s. While it performs pretty good, I am
> struggling with the idea of associating a large(?) text chunk next to
> each thread who has at least once attempted to log something.
>
> Best.
>
> [1] http://tomcat.apache.org/tomcat-5.5-doc/config/http.html#Common_Attributes