Posted to solr-user@lucene.apache.org by Samuel Garcia Martinez <sa...@inditex.com> on 2020/05/02 14:26:44 UTC
Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

I created two different issues: one for the Content-Type issue in the server and another one for the reliability issue in the SolrClient for unexpected/malformed responses.

ContentType issue: https://issues.apache.org/jira/browse/SOLR-14456
Client issue: https://issues.apache.org/jira/browse/SOLR-14457
________________________________
From: Jason Gerlowski <ge...@gmail.com>
Sent: Wednesday, April 22, 2020 4:43 PM
To: solr-user@lucene.apache.org <so...@lucene.apache.org>
Subject: Re: SolrJ connection leak with SolrCloud and Jetty Gzip compression enabled

Hi Samuel,

Thanks for the very detailed description of the problem here.  Very
thorough!  I don't think you're missing anything obvious. Please file the
Jira tickets if you haven't already.

Best,

Jason

On Mon, Apr 13, 2020 at 6:12 PM Samuel Garcia Martinez <
samuelgma@inditex.com> wrote:

> Rereading the last two paragraphs, I realized that they in particular are
> very poorly worded (grammar 😓). I have tried to rephrase them and correct
> some of the errors below.
>
> Here I can see three different problems:
>
> * HttpSolrCall should not use HttpServletResponse#setCharacterEncoding to
> set the Content-Encoding header. This is obviously a mistake.
> * HttpSolrClient, specifically HttpClientUtil, should be modified so that
> the connection is not leaked forever when the Content-Encoding header lies
> about the actual content. It should still throw the exception, though.
> * HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, so that the application is not blocked forever
> waiting for a connection to become available. This way, the application
> could still respond to requests that don't use Solr instead of rejecting
> every incoming request because all threads are blocked waiting for a
> connection that will never become available.
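
To make the third point concrete, here is a minimal sketch in plain Java of why a bounded wait matters. It models the connection pool as a Semaphore rather than using HttpClient itself; the class BoundedPoolDemo and its methods are hypothetical names for illustration only, and the timeout argument plays the role that connectionRequestTimeout would play in HttpClient.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a connection pool with a fixed number of slots.
class BoundedPoolDemo {
    private final Semaphore permits;

    BoundedPoolDemo(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    // Waits at most timeoutMillis for a free slot.
    // Returns false on timeout instead of blocking the caller forever.
    boolean lease(long timeoutMillis) {
        try {
            return permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Returns a slot to the pool.
    void release() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPoolDemo pool = new BoundedPoolDemo(2);
        pool.lease(50); // leaked: never released
        pool.lease(50); // leaked: never released
        // The pool is exhausted, but the caller gets an answer after the timeout.
        System.out.println("third lease succeeded: " + pool.lease(50));
    }
}
```

With both slots leaked, the third lease gives up after the timeout and the calling thread is free to serve requests that do not need Solr. An untimed acquire in the same situation would block that thread indefinitely.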
>
> I think the first two points are bugs that should be fixed.  The third one
> is a feature improvement to me.
>
> Unless I missed something, I'll file the two bugs and provide a patch for
> them. The same goes for the feature improvement.
>
> ________________________________
> From: Samuel Garcia Martinez <sa...@inditex.com>
> Sent: Monday, April 13, 2020 10:08:36 PM
> To: solr-user@lucene.apache.org <so...@lucene.apache.org>
> Subject: SolrJ connection leak with SolrCloud and Jetty Gzip compression
> enabled
>
> Hi!
>
> Today I saw a weird issue in production workloads when gzip compression
> was enabled. After a few minutes, the client app ran out of connections
> and stopped responding.
>
> The cluster setup is pretty simple:
> Solr version: 7.7.2
> Solr cloud enabled
> Cluster topology: 6 nodes, 1 single collection, 10 shards and 3 replicas.
> 1 HTTP LB using Round Robin over all nodes
> All cluster nodes have gzip enabled for all paths, all HTTP verbs and all
> MIME types.
> Solr client: HttpSolrClient targeting the HTTP LB
>
> Problem description: when the Solr node that receives the request has to
> forward it to a Solr node that can actually perform the query, the response
> headers are copied incorrectly into the client response, causing the SolrJ
> client to fail and never release the connection back to the pool.
>
> To simplify the case, let's try to start from the following repro scenario:
>
>   *   Start one node with cloud mode and port 8983
>   *   Create one single collection (1 shard, 1 replica)
>   *   Start another node on port 8984, pointing at the previously started
> ZooKeeper (-z localhost:9983)
>   *   Start a java application and query the cluster using the node on
> port 8984 (the one that doesn't host the collection)
>
> The steps then occur as follows:
>
>   *   The application queries node:8984 with compression enabled
> ("Accept-Encoding: gzip") and wt=javabin
>   *   Node:8984 can't perform the query and creates an HTTP request behind
> the scenes to node:8983
>   *   Node:8983 returns a gzipped response with "Content-Encoding: gzip"
> and "Content-Type: application/octet-stream"
>   *   Node:8984 copies the "gzip" value of the "Content-Encoding" header into
> its response as the character encoding (it should be forwarded as a
> "Content-Encoding" header, not as the character encoding)
>   *   HttpSolrClient receives a "Content-Type:
> application/octet-stream;charset=gzip", causing an exception.
>   *   HttpSolrClient tries to quietly close the connection, but since the
> stream is broken, the Utils.consumeFully fails to actually consume the
> entity (it throws another exception in GzipDecompressingEntity#getContent()
> with "not in GZIP format")
>
> The exception thrown by HttpSolrClient is:
> java.nio.charset.UnsupportedCharsetException: gzip
>                at java.nio.charset.Charset.forName(Charset.java:531)
>                at org.apache.http.entity.ContentType.create(ContentType.java:271)
>                at org.apache.http.entity.ContentType.create(ContentType.java:261)
>                at org.apache.http.entity.ContentType.parse(ContentType.java:319)
>                at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:591)
>                at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
>                at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
>                at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
>                at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1015)
>                at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1031)
>                at org.apache.solr.client.solrj.SolrClient$$FastClassBySpringCGLIB$$7fcf36a0.invoke(<generated>)
>                at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
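
Both failures in the steps above can be reproduced with the JDK alone, which may help when writing a regression test. The sketch below does not use SolrJ or HttpClient; EncodingMismatchDemo, charsetOf and isGzip are hypothetical names, and the header parsing is deliberately simplified compared to what ContentType.parse does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.zip.GZIPInputStream;

class EncodingMismatchDemo {

    // Hypothetical helper: resolves the charset parameter of a Content-Type
    // value, roughly as an HTTP client would. Returns null if none is present.
    static Charset charsetOf(String contentType) {
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                // Throws UnsupportedCharsetException for "gzip".
                return Charset.forName(p.substring(8));
            }
        }
        return null;
    }

    // Hypothetical helper: true if the body starts with a valid gzip header.
    static boolean isGzip(byte[] body) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return true;
        } catch (IOException e) {
            // Non-gzip input fails here, e.g. with "Not in GZIP format".
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            charsetOf("application/octet-stream;charset=gzip");
        } catch (UnsupportedCharsetException e) {
            System.out.println("Unsupported charset: " + e.getCharsetName());
        }
        byte[] plain = "plain body".getBytes(StandardCharsets.UTF_8);
        System.out.println("gzip body? " + isGzip(plain));
    }
}
```

The first call fails in the same Charset.forName frame that appears at the top of the stack trace. The second shows the kind of error that a gzip-decoding wrapper raises when the headers promise gzip but the body is something else.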
>
> Here I can see three different problems:
>
>   *   HttpSolrCall should not use HttpServletResponse#setCharacterEncoding
> to set the Content-Encoding header. This is obviously a typo.
>   *   HttpSolrClient, specifically HttpClientUtil, should be modified so
> that, if the Content-Encoding header lies about the actual content, an
> exception is thrown but the connection is not leaked forever.
>   *   HttpSolrClient should allow clients to customize HttpClient's
> connectionRequestTimeout, so that the application can still respond to
> other incoming requests instead of having every request thread blocked
> forever waiting for a connection that will never be free.
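
The second point boils down to releasing the connection in a finally block while letting the error propagate. The following is only a sketch of that pattern in plain Java, not the actual HttpSolrClient code; ReleaseOnFailureDemo, Connection, drainAndRelease and brokenStream are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.InputStream;

class ReleaseOnFailureDemo {

    // Hypothetical stand-in for a pooled connection.
    static final class Connection {
        boolean released = false;

        void release() {
            released = true;
        }
    }

    // Reads the body to the end. The connection is released in the finally
    // block, so a failure while reading still frees it, and the exception
    // is rethrown to the caller unchanged.
    static void drainAndRelease(InputStream body, Connection conn) throws IOException {
        try {
            byte[] buf = new byte[1024];
            while (body.read(buf) != -1) {
                // discard the bytes
            }
        } finally {
            conn.release();
        }
    }

    // A stream that fails on the first read, like a body that is not really gzip.
    static InputStream brokenStream() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("Not in GZIP format");
            }
        };
    }

    public static void main(String[] args) {
        Connection conn = new Connection();
        try {
            drainAndRelease(brokenStream(), conn);
        } catch (IOException e) {
            System.out.println("error propagated: " + e.getMessage());
        }
        System.out.println("connection released: " + conn.released);
    }
}
```

In a real client, a connection whose body could not be fully consumed would probably have to be closed rather than handed back for reuse, but either way it stops counting against the pool limit.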
>
> I think the first two points are bugs and the third one is a feature
> improvement. Unless I missed something, I'll file the two bugs and provide
> a patch for them. The same goes for the feature improvement.