Posted to users@solr.apache.org by Tim Funk <fu...@apache.org> on 2023/08/22 17:06:57 UTC

solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

I've tried to switch from the 8.x to the 9.3 SolrJ client library. At the
same time, I switched to Http2SolrClient since HttpSolrClient was marked
deprecated. We use the client in the pattern ...

try (SolrClient client =  createSolrClient()) {
  response = client.query(solrQuery);
  // do stuff with response
}

This should auto-close the client to clean things up. But on Tomcat
shutdown we've noticed this error in the logs ...

22-Aug-2023 11:22:04.645 SEVERE [Catalina-utility-1]
org.apache.catalina.loader.WebappClassLoaderBase.checkThreadLocalMapForLeaks
The web application [foo] created a ThreadLocal with key of type
[java.lang.ThreadLocal] (value [java.lang.ThreadLocal@727c756]) and a value
of type [org.eclipse.jetty.util.Pool.MonoEntry] (value [MonoEntry@2447031
{IDLE,pooled=RetainableByteBuffer@447004d6{DirectByteBuffer@6554e973[p=0,l=0,c=16384,r=0]={<<<>>>/16\\"","p...ce":"15},r=0}}])
but failed to remove it when the web application was stopped. Threads are
going to be renewed over time to try and avoid a probable memory leak.
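
For readers unfamiliar with this Tomcat warning, here is a minimal sketch of the general mechanism it reports. It is not Jetty or SolrJ code: the class name ThreadLocalLeakDemo and the CACHE field are hypothetical stand-ins, and the 16384-byte buffer only mirrors the capacity shown in the log above. It illustrates that a value stored in a ThreadLocal on a pooled thread stays attached to that thread across tasks until something calls remove().

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalLeakDemo {
  // Hypothetical stand-in for a library-owned, per-thread buffer cache.
  static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

  /**
   * Runs three tasks on the same pooled thread.
   * Result[0]: was the value still present in a later task?
   * Result[1]: was the value still present after remove()?
   */
  static boolean[] run() throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      // Task 1: populate the ThreadLocal and "forget" to clean it up.
      pool.submit(() -> CACHE.set(new byte[16384])).get();

      // Task 2: same worker thread, so the buffer is still attached to it.
      boolean stillThere = pool.submit(() -> CACHE.get() != null).get();

      // Task 3: explicit cleanup on the owning thread, then check again.
      boolean afterRemove = pool.submit(() -> {
        CACHE.remove();
        return CACHE.get() != null;
      }).get();

      return new boolean[] {stillThere, afterRemove};
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling ThreadLocalLeakDemo.run() should show the value surviving between tasks and disappearing only after remove(). In a servlet container the pooled threads belong to the container and outlive the webapp, which is why leftover entries are reported at shutdown.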

Even worse, in another environment left up long enough with lots of
queries, we eventually noticed an OutOfMemory error due to all the
ThreadLocals, which do not seem to be cleaned up.

I guess the question I have is
1) Is there a leak?
2) Is my usage wrong? Should I instead create a Singleton(ish) instance
used/shared by all the concurrent requests?
3) Other?

-Tim

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Tim Funk <fu...@apache.org>.
The getSolrClient() method is at the end of this message. (We have a
different core per language.) We are also using HTTP/1.1, since we have
an Apache (httpd) reverse proxy in front of Solr in non-prod tiers and
HAProxy load balancing in production.

As I was testing earlier (after the initial question), there doesn't
seem to be a leak in close(). The problem seems to be deeper in the
core. I reached that conclusion by hitting my app instance with
`ab -n 5000000 -c 200` and other random hits. Then, on app
restart, I saw just 3 ThreadLocal leaks in the logs attributed
to that webapp, with the MonoEntry signature. I also refactored
a different app (since it was smaller) to share a single
SolrClient, pounded it with the same amount of traffic, and didn't
see the leak error message for that app.

public SolrClient getSolrClient(String locale) {
  Http2SolrClient.Builder builder = null;
  try {
    String baseUrl = "http://" + solrHost + ":" + solrPort + "/solr/"
        + getSolrCore(locale);
    builder = new Http2SolrClient.Builder(baseUrl);

    if (StringUtil.hasText(solrUser)) {
      builder = builder.withBasicAuthCredentials(solrUser, solrPassword);
    }

    builder.useHttp1_1(true);

    return builder.build();
  } catch (Exception e) {
    log_.fatal("error getting client", e);
  }

  return null;
}

On Tue, Aug 22, 2023 at 2:45 PM Chris Hostetter <ho...@fucit.org>
wrote:

>
> I suspect the source of this problem is either something low level in the
> jetty HttpClient cleanup code ( which Solr should already be
> correctly cleaning up on Http2SolrClient.close() ) or it's some nuance of
> how your 'createSolrClient()' method is implemented that creates an edge
> case preventing 'Http2SolrClient.close()' from doing a full cleanup of the
> underlying HttpClient.
>
> can you please share the details of createSolrClient() ?
>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Chris Hostetter <ho...@fucit.org>.
I suspect the source of this problem is either something low level in the
jetty HttpClient cleanup code ( which Solr should already be
correctly cleaning up on Http2SolrClient.close() ) or it's some nuance of
how your 'createSolrClient()' method is implemented that creates an edge
case preventing 'Http2SolrClient.close()' from doing a full cleanup of the
underlying HttpClient.

can you please share the details of createSolrClient() ?

To answer your followup question: SolrClient instances are designed to be 
threadsafe and re-used by multiple concurrent threads making concurrent 
requests ... but that doesn't mean there should be memory leaks if you use 
it the way you are.


-Hoss
http://www.lucidworks.com/


: time - I switched to Http2SolrClient since the other was marked deprecated.
: We use the client in the pattern ...
: 
: try (SolrClient client =  createSolrClient()) {
:   response = client.query(solrQuery);
:   // do stuff with response
: }
: 
: Which should  auto close the client to clean things up. But we've noticed
: on tomcat shutdown this error in the logs ...
	...
: Even worse, in another environment when left up long enough with lots or
: queries ...  we eventually noticed an OutOfMemory error due to all the
: ThreadLocal's which do not seem to be cleaned up.

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Tim, thanks for letting me know, I experienced the same problem, my
application became unstable and crashed.
My first implementation was very similar to yours and relied heavily
on try-with-resources java statements with CloudSolrClient.
As said in my previous email, I ended up using the solr clients as
singletons reusing one instance per solr instance/collection.


On Mon, Aug 28, 2023 at 1:48 PM Tim Funk <fu...@apache.org> wrote:

> I reverted to HttpSolrClient. That seems to have plugged the leak.
> As for root cause, I haven't had time to dig farther. Since this happens
> regardless of reusing SolrClient vs instantiating a new one, I'm hoping
> that's a data point of interest. But as for constructing a "simple" test
> to reproduce, I'm not sure if I'll find the time in the near future to do
> other $work priorities.
>
> As for future triage, I'd try the any of the following
> - Change my endpoint and use Http2 ( disable: builder.useHttp1_1(true))
> - Revert to Http2Client and add a timer / logger in existing apps servers
> counting threadlocals and look for patterns
> - Write a standalone client, single thread. See if I can count the
> threadlocals over time.
> - Write a standalone client - Make all executions in new different threads
> with occasional reuse of thread
>
> -Tim
>
>
> On Mon, Aug 28, 2023 at 7:17 AM Vincenzo D'Amore <v....@gmail.com>
> wrote:
>
> > Hi Tim, have you figured out the problem? Just curious to know what you
> > have done at the end.
> >
> > On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore <v....@gmail.com>
> > wrote:
> >
> > > Just my 2 cents: I have always used Solr clients as singletons. You
> > > have to instantiate them only once and reuse them forever.
> > >
> >
> >
>


-- 
Vincenzo D'Amore

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Tim Funk <fu...@apache.org>.
SolrJ 9.3 was built on Jetty 10.0.15. From some searching, there
appears to have been a similar leak around 10.0.11 that was fixed.
**But** there is also another leak fix in 10.0.16, which appears to have
been tagged 5 days ago. Once that is released, I may try swapping in that
implementation to see if it fixes the problem before doing more triage.

-Tim

On Mon, Aug 28, 2023 at 7:47 AM Tim Funk <fu...@apache.org> wrote:

>
> As for future triage, I'd try the any of the following
> - Change my endpoint and use Http2 ( disable: builder.useHttp1_1(true))
> - Revert to Http2Client and add a timer / logger in existing apps servers
> counting threadlocals and look for patterns
> - Write a standalone client, single thread. See if I can count the
> threadlocals over time.
> - Write a standalone client - Make all executions in new different threads
> with occasional reuse of thread
>
> On Mon, Aug 28, 2023 at 7:17 AM Vincenzo D'Amore <v....@gmail.com>
> wrote:
>
>> Hi Tim, have you figured out the problem? Just curious to know what you
>> have done at the end.
>>
>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Tim Funk <fu...@apache.org>.
I reverted to HttpSolrClient. That seems to have plugged the leak.
As for root cause, I haven't had time to dig further. Since this happens
regardless of reusing a SolrClient vs. instantiating a new one, I'm hoping
that's a data point of interest. But as for constructing a "simple" test
to reproduce, I'm not sure I'll find the time in the near future given
other $work priorities.

As for future triage, I'd try any of the following:
- Change my endpoint and use HTTP/2 (disable builder.useHttp1_1(true))
- Revert to Http2SolrClient and add a timer / logger in existing app
servers, counting ThreadLocals and looking for patterns
- Write a standalone client, single thread. See if I can count the
ThreadLocals over time.
- Write a standalone client that makes all executions in new, different
threads with occasional reuse of a thread

-Tim


On Mon, Aug 28, 2023 at 7:17 AM Vincenzo D'Amore <v....@gmail.com> wrote:

> Hi Tim, have you figured out the problem? Just curious to know what you
> have done at the end.
>
> On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore <v....@gmail.com>
> wrote:
>
> > Just my 2 cents: I have always used Solr clients as singletons. You have
> > to instantiate them only once and reuse them forever.
> >
>
>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Tim, have you figured out the problem? Just curious to know what you
have done at the end.

On Fri, Aug 25, 2023 at 4:48 PM Vincenzo D'Amore <v....@gmail.com> wrote:

> Just my 2 cents: I have always used Solr clients as singletons. You have
> to instantiate them only once and reuse them forever.
>
> On Fri, 25 Aug 2023 at 15:35, Tim Funk <fu...@apache.org> wrote:
>
>> Update - It looks like the ThreadLocal leak is different and unrelated to
>> creating / closing a new Http2SolrClient every request. Even using a
>> shared Http2SolrClient for my webapp - I noticed the same issue in a QA
>> environment of leaking ThreadLocals. Falling back to HttpSolrClient
>> optimistically is the fix so far.
>>
>> Client is OpenJDK 11.0.17
>>
>> -Tim
>>
>> On Wed, Aug 23, 2023 at 9:46 AM Tim Funk <fu...@apache.org> wrote:
>>
>> > Cool - For now I'll either revert to HttpSolrClient or use a single
>> > client (depending on what I have to refactor)
>> >
>> > My only concern with a shared client is if one calls close()
>> > "accidentally", I don't see an easy way to query the client to see if
>> > it was closed so I can destroy it and create a new one. (Without
>> > resorting to a webapp restart)
>> >
>> > -Tim
>> >
>> > On Tue, Aug 22, 2023 at 6:42 PM Shawn Heisey <ap...@elyograg.org>
>> > wrote:
>> >
>> >>
>> >> That kind of try-with-resources approach should take care of the
>> >> problem, because it would run the close() method on the SolrClient
>> >> object.
>> >>
>> >> The classes in the error are Jetty classes.  This probably means that
>> >> the problem is in Jetty, but I couldn't guarantee that.
>> >>
>> >> You do not need multiple client objects just because you have multiple
>> >> cores.  You only need one Http2SolrClient object per hostname:port
>> >> combination used to access Solr, and you should only need to create
>> >> them when the application starts and close them when the application
>> >> ends.
>> >>
>> >>
>>
>

-- 
Vincenzo D'Amore

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Vincenzo D'Amore <v....@gmail.com>.
Just my 2 cents: I have always used Solr clients as singletons. You have to
instantiate them only once and reuse them forever.

On Fri, 25 Aug 2023 at 15:35, Tim Funk <fu...@apache.org> wrote:

> Update - It looks like the ThreadLocal leak is different and unrelated to
> creating / closing a new Http2SolrClient every request. Even using a
> shared Http2SolrClient for my webapp - I noticed the same issue in a QA
> environment of leaking ThreadLocals. Falling back to HttpSolrClient
> optimistically is the fix so far.
>
> Client is OpenJDK 11.0.17
>
> -Tim
>
> On Wed, Aug 23, 2023 at 9:46 AM Tim Funk <fu...@apache.org> wrote:
>
> > Cool - For now I'll either revert to HttpSolrClient or use a single
> > client (depending on what I have to refactor)
> >
> > My only concern with a shared client is if one calls close()
> > "accidentally", I don't see an easy way to query the client to see if
> > it was closed so I can destroy it and create a new one. (Without
> > resorting to a webapp restart)
> >
> > -Tim
> >
> > On Tue, Aug 22, 2023 at 6:42 PM Shawn Heisey <ap...@elyograg.org>
> > wrote:
> >
> >>
> >> That kind of try-with-resources approach should take care of the
> >> problem, because it would run the close() method on the SolrClient
> >> object.
> >>
> >> The classes in the error are Jetty classes.  This probably means that
> >> the problem is in Jetty, but I couldn't guarantee that.
> >>
> >> You do not need multiple client objects just because you have multiple
> >> cores.  You only need one Http2SolrClient object per hostname:port
> >> combination used to access Solr, and you should only need to create them
> >> when the application starts and close them when the application ends.
> >>
> >>
>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Tim Funk <fu...@apache.org>.
Update: the ThreadLocal leak looks unrelated to creating and closing a
new Http2SolrClient on every request. Even with a single shared
Http2SolrClient for my webapp, I saw the same leaking ThreadLocals in a
QA environment. Falling back to HttpSolrClient is, optimistically, the
fix so far.

Client is OpenJDK 11.0.17

-Tim

On Wed, Aug 23, 2023 at 9:46 AM Tim Funk <fu...@apache.org> wrote:

> Cool - For now I'll either revert to HttpSolrClient or use a single client
> (depending on what I have to refactor)
>
> My only concern with a shared client is if one calls close()
> "accidentally", I don't see an easy way to query the client to see if it
> was closed so I can destroy it and create a new one. (Without resorting
> to a webapp restart)
>
> -Tim
>
> On Tue, Aug 22, 2023 at 6:42 PM Shawn Heisey <ap...@elyograg.org> wrote:
>
>>
>> That kind of try-with-resources approach should take care of the
>> problem, because it would run the close() method on the SolrClient object.
>>
>> The classes in the error are Jetty classes.  This probably means that
>> the problem is in Jetty, but I couldn't guarantee that.
>>
>> You do not need multiple client objects just because you have multiple
>> cores.  You only need one Http2SolrClient object per hostname:port
>> combination used to access Solr, and you should only need to create them
>> when the application starts and close them when the application ends.
>>
>>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Tim Funk <fu...@apache.org>.
Cool. For now I'll either revert to HttpSolrClient or use a single client
(depending on what I have to refactor).

My only concern with a shared client is that if someone calls close()
"accidentally", I don't see an easy way to query the client to see whether
it was closed, so that I can destroy it and create a new one (without
resorting to a webapp restart).
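
One possible way to address that concern, sketched here under stated assumptions: keep the shared client behind a small holder that can discard and rebuild it on request. SharedClientHolder and its methods are hypothetical names, not part of SolrJ, and the holder does not detect a closed client by itself. The caller would decide when to call reset(), for example after catching an exception that suggests the client is unusable.

```java
import java.util.function.Supplier;

/** Hypothetical holder that owns one shared client and can rebuild it. */
final class SharedClientHolder<C extends AutoCloseable> {
  private final Supplier<C> factory;
  private C current; // guarded by "this"

  SharedClientHolder(Supplier<C> factory) {
    this.factory = factory;
  }

  /** Returns the shared client, creating it lazily on first use. */
  synchronized C get() {
    if (current == null) {
      current = factory.get();
    }
    return current;
  }

  /** Closes and forgets the current client; the next get() builds a new one. */
  synchronized void reset() {
    C old = current;
    current = null;
    if (old != null) {
      try {
        old.close();
      } catch (Exception e) {
        // Best effort: the old client is being discarded anyway.
      }
    }
  }
}
```

In an application the supplier might wrap a method such as the getSolrClient(locale) shown earlier in the thread. Note that reset() closes the old client even if other threads still have requests in flight on it, so this is only a starting point.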

-Tim

On Tue, Aug 22, 2023 at 6:42 PM Shawn Heisey <ap...@elyograg.org> wrote:

>
> That kind of try-with-resources approach should take care of the
> problem, because it would run the close() method on the SolrClient object.
>
> The classes in the error are Jetty classes.  This probably means that
> the problem is in Jetty, but I couldn't guarantee that.
>
> You do not need multiple client objects just because you have multiple
> cores.  You only need one Http2SolrClient object per hostname:port
> combination used to access Solr, and you should only need to create them
> when the application starts and close them when the application ends.
>
> One thing I found about Http2SolrClient compared to HttpSolrClient:  The
> latter creates the inner http client threads as Daemon threads, so they
> are automatically cleaned up by Java.  The former doesn't.  Here's some
> code to change the Http2SolrClient creation so that it creates the
> internal jetty http client threads as Daemon threads:
>
>

Re: solrj client memory leak via ThreadLocal in solrj.impl.Http2SolrClient?

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/22/23 11:06, Tim Funk wrote:
> I've tried to switch from the 8.X to 9.3 solrj client library. At the same
> time - I switched to Http2SolrClient since the other was marked deprecated.
> We use the client in the pattern ...
> 
> try (SolrClient client =  createSolrClient()) {
>    response = client.query(solrQuery);
>    // do stuff with response
> }
> 
> Which should  auto close the client to clean things up. But we've noticed
> on tomcat shutdown this error in the logs ...

That kind of try-with-resources approach should take care of the 
problem, because it would run the close() method on the SolrClient object.

The classes in the error are Jetty classes.  This probably means that 
the problem is in Jetty, but I couldn't guarantee that.

You do not need multiple client objects just because you have multiple 
cores.  You only need one Http2SolrClient object per hostname:port 
combination used to access Solr, and you should only need to create them 
when the application starts and close them when the application ends.
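
As a rough illustration of the "one client per hostname:port" advice, the following sketch caches one long-lived client per base URL and closes them all at shutdown. ClientRegistry is a hypothetical helper, not a SolrJ class, and it is written against AutoCloseable so that it stays independent of any particular client type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper: caches one long-lived client per base URL. */
final class ClientRegistry<C extends AutoCloseable> implements AutoCloseable {
  private final Map<String, C> clients = new ConcurrentHashMap<>();
  private final Function<String, C> factory;

  ClientRegistry(Function<String, C> factory) {
    this.factory = factory;
  }

  /** Returns the cached client for baseUrl, creating it on first use. */
  C get(String baseUrl) {
    return clients.computeIfAbsent(baseUrl, factory);
  }

  /** Closes every cached client; intended to run once at application shutdown. */
  @Override
  public void close() throws Exception {
    Exception first = null;
    for (C client : clients.values()) {
      try {
        client.close();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    clients.clear();
    if (first != null) {
      throw first;
    }
  }
}
```

With SolrJ the factory could be something like url -> new Http2SolrClient.Builder(url).build(), with url pointing at the root such as http://localhost:8983/solr, and each request naming its core through client.query(collection, params). Whether that fits depends on how the application resolves cores.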

One thing I found about Http2SolrClient compared to HttpSolrClient:  The 
latter creates the inner http client threads as Daemon threads, so they 
are automatically cleaned up by Java.  The former doesn't.  Here's some 
code to change the Http2SolrClient creation so that it creates the 
internal jetty http client threads as Daemon threads:

//-------------------------

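import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.solr.client.solrj.impl.Http2SolrClient;

// Counter used to give each client thread a unique name.
final AtomicInteger scThreadCounter = new AtomicInteger();

// <snip>

// Fixed-size pool whose threads are all created as daemon threads.
final ExecutorService executorService =
    Executors.newFixedThreadPool(256, runnable -> {
      final Thread thread =
          Executors.defaultThreadFactory().newThread(runnable);
      thread.setDaemon(true); // Mark the thread as a daemon
      thread.setName("h2sc-" + scThreadCounter.incrementAndGet());
      return thread;
    });

// Hand the daemon-thread executor to the client builder.
final Http2SolrClient.Builder clientBuilder = new Http2SolrClient.Builder(
    "http://localhost:8983/solr").withExecutor(executorService);
final Http2SolrClient client = clientBuilder.build();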

//-------------------------

One final note:  You can still use HttpSolrClient.  It will be removed
in 10.0, but will still be there for all 9.x releases.

Thanks,
Shawn