Posted to solr-user@lucene.apache.org by Johannes Siegert <jo...@offerista.com> on 2020/05/05 08:32:50 UTC

Re: gzip compression solr 8.4.1

Hi,

We ran further tests to pin down exactly where the problem is. These are our
findings:

The Content-Length header is calculated correctly; a quick test with curl
confirmed this.
The actual problem is that the stream carrying the gzip data is not fully
consumed and is not closed afterwards.
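
To illustrate what "fully consumed and closed" means for a gzip-wrapped
stream, here is a minimal sketch that does not use Solr or SolrJ at all. It
relies only on java.util.zip; the class GzipDrainSketch, the wrapper
CloseTrackingStream and the method drainAndClose are hypothetical names made up
for this example. It reads the decompressed data until read() returns -1 and
then closes the decompressing stream, which in turn closes the raw stream
underneath it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipDrainSketch {

    /** Hypothetical helper: records whether close() reached the raw stream. */
    static final class CloseTrackingStream extends FilterInputStream {
        boolean closed;

        CloseTrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Compresses a payload so the example has gzip data to read. */
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(plain);
        }
        return bytes.toByteArray();
    }

    /** Reads the decompressed data to the end, closes the stream, returns the byte count. */
    static long drainAndClose(InputStream raw) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        // Closing the GZIPInputStream also closes the raw stream it wraps.
        try (InputStream in = new GZIPInputStream(raw)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "{\"responseHeader\":{\"status\":0}}".getBytes(StandardCharsets.UTF_8);
        CloseTrackingStream raw = new CloseTrackingStream(new ByteArrayInputStream(gzip(plain)));
        long read = drainAndClose(raw);
        System.out.println("decompressed bytes: " + read + ", raw stream closed: " + raw.closed);
    }
}
```

In a pooled HTTP client, the close of the raw stream is typically the step
that hands the connection back to the pool, so skipping either the drain or
the close can leave the connection checked out.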

Using the debugger with a breakpoint at
org/apache/solr/common/util/Utils.java:575 shows that execution never enters
readFully(entity.getContent()), most likely because of how the gzip stream
content is wrapped and extracted beforehand.

At org/apache/solr/common/util/Utils.java:582, consumeQuietly(entity) should
close the stream, but it does not because an exception is silently
swallowed.
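
The general pattern at stake here, draining a stream quietly while still
guaranteeing that it gets closed, can be sketched as follows. This is not the
code from Utils.java or from Apache HttpClient; QuietConsumeSketch,
FailingStream and drainAndCloseQuietly are hypothetical names, and the read
failure is simulated. The point is only that a swallowed exception during
reading should not prevent close() from running.

```java
import java.io.IOException;
import java.io.InputStream;

final class QuietConsumeSketch {

    /** Hypothetical stream whose reads always fail, to simulate the silent exception. */
    static final class FailingStream extends InputStream {
        boolean closed;

        @Override
        public int read() throws IOException {
            throw new IOException("simulated read failure");
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /**
     * Drains the stream and always closes it.
     * Returns true only if the stream was read to the end and closed without error.
     */
    static boolean drainAndCloseQuietly(InputStream in) {
        boolean drained = false;
        // try-with-resources closes the stream even when a read throws.
        try (InputStream stream = in) {
            byte[] buffer = new byte[8192];
            while (stream.read(buffer) != -1) {
                // discard the data; only reaching the end matters here
            }
            drained = true;
        } catch (IOException e) {
            // Swallowed on purpose: the caller only learns about it via the return value.
            drained = false;
        }
        return drained;
    }

    public static void main(String[] args) {
        FailingStream stream = new FailingStream();
        boolean drained = drainAndCloseQuietly(stream);
        System.out.println("drained: " + drained + ", closed: " + stream.closed);
    }
}
```

Returning a flag instead of nothing is a design choice for the sketch: it
keeps the "quiet" behaviour while still letting the caller notice that the
drain failed and decide whether the connection should be discarded.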

This seems to be the same problem as the one described in
https://issues.apache.org/jira/browse/SOLR-14457

We saw that the problem also occurs with correct gzip responses from Jetty,
not only with non-gzip responses as described in the Jira issue.

Best,

Johannes

On Thu, 23 Apr 2020 at 09:55, Johannes Siegert <johannes.siegert@offerista.com> wrote:

> Hi,
>
> We want to use gzip compression between our application and the Solr
> server.
>
> We use a standalone Solr server, version 8.4.1, with the prepackaged Jetty
> as application server.
>
> We have enabled the jetty gzip module by adding these two files:
>
> {path_to_solr}/server/modules/gzip.mod (see below the question)
> {path_to_solr}/server/etc/jetty-gzip.xml (see below the question)
>
> Within the application we use an HttpSolrServer that is configured with
> allowCompression=true.
>
> After we released our application, we saw the number of connections in the
> TCP state CLOSE_WAIT rise until the application was no longer able to open
> new connections.
>
> After a long debugging session we think the problem is that the
> "Content-Length" header returned by Jetty is sometimes wrong when gzip
> compression is enabled.
>
> The SolrJ client uses a ContentLengthInputStream, which relies on the
> "Content-Length" header to detect whether all data has been received. But
> the InputStream cannot be fully consumed because the value of the
> "Content-Length" header is higher than the actual content length.
>
> Usually the method PoolingHttpClientConnectionManager.releaseConnection is
> called after the InputStream has been fully consumed. This frees the
> connection to be reused or closed by the application.
>
> Due to the incorrect "Content-Length" header, the
> PoolingHttpClientConnectionManager.releaseConnection method is never called
> and the connection stays active. After Jetty's connection timeout is
> reached, Jetty closes the connection from the server side and the TCP state
> on the client switches to CLOSE_WAIT. The client never closes the
> connection, so the number of connections in use keeps rising.
>
>
> We are currently trying to configure the Jetty gzip module to return no
> "Content-Length" header when gzip compression is used. We hope that in this
> case another InputStream implementation is used, one that relies on the
> NULL terminator to detect when the InputStream has been fully consumed.
>
> Do you have any experience with this problem or any suggestions for us?
>
> Thanks,
>
> Johannes
>
>
> gzip.mod
>
> -----
>
> DO NOT EDIT - See:
> https://www.eclipse.org/jetty/documentation/current/startup-modules.html
>
>         [description]
>         Enable GzipHandler for dynamic gzip compression
>         for the entire server.
>
>         [tags]
>         handler
>
>         [depend]
>         server
>
>         [xml]
>         etc/jetty-gzip.xml
>
>         [ini-template]
>         ## Minimum content length after which gzip is enabled
>         jetty.gzip.minGzipSize=2048
>
>         ## Check whether a file with *.gz extension exists
>         jetty.gzip.checkGzExists=false
>
>         ## Gzip compression level (-1 for default)
>         jetty.gzip.compressionLevel=-1
>
>         ## User agents for which gzip is disabled
>         jetty.gzip.excludedUserAgent=.*MSIE.6\.0.*
>
> -----
>
> jetty-gzip.xml
>
> -----
>
> <?xml version="1.0"?>
> <!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">
>
> <!-- =============================================================== -->
> <!-- Mixin the GZIP Handler                                          -->
> <!-- This applies the GZIP Handler to the entire server              -->
> <!-- If a GZIP handler is required for an individual context, then   -->
> <!-- use a context XML (see test.xml example in distribution)        -->
> <!-- =============================================================== -->
>
> <Configure id="Server" class="org.eclipse.jetty.server.Server">
>     <Call name="insertHandler">
>         <Arg>
>             <New id="GzipHandler" class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
>                 <Set name="minGzipSize">
>                     <Property name="jetty.gzip.minGzipSize" deprecated="gzip.minGzipSize" default="2048" />
>                 </Set>
>                 <Set name="checkGzExists">
>                     <Property name="jetty.gzip.checkGzExists" deprecated="gzip.checkGzExists" default="false" />
>                 </Set>
>                 <Set name="compressionLevel">
>                     <Property name="jetty.gzip.compressionLevel" deprecated="gzip.compressionLevel" default="-1" />
>                 </Set>
>                 <Set name="inflateBufferSize">
>                     <Property name="jetty.gzip.inflateBufferSize" default="0" />
>                 </Set>
>                 <Set name="deflaterPoolCapacity">
>                     <Property name="jetty.gzip.deflaterPoolCapacity" default="-1" />
>                 </Set>
>                 <Set name="syncFlush">
>                     <Property name="jetty.gzip.syncFlush" default="false" />
>                 </Set>
>
>                 <Set name="excludedAgentPatterns">
>                     <Array type="String">
>                         <Item>
>                             <Property name="jetty.gzip.excludedUserAgent" deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*" />
>                         </Item>
>                     </Array>
>                 </Set>
>
>                 <Set name="includedMethodList">
>                     <Property name="jetty.gzip.includedMethodList" default="GET,POST" />
>                 </Set>
>                 <Set name="excludedMethodList">
>                     <Property name="jetty.gzip.excludedMethodList" default="" />
>                 </Set>
>             </New>
>         </Arg>
>     </Call>
> </Configure>
>
> -----
>

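A note on the idea in the quoted message of dropping "Content-Length": in
HTTP/1.1, a response body without that header on a persistent connection is
normally sent with "Transfer-Encoding: chunked", and its end is marked by a
zero-length chunk rather than by a NULL terminator. The sketch below shows
that framing with a deliberately small decoder. ChunkedBodySketch, readLine and
decode are hypothetical names for this example, chunk extensions and trailers
are ignored, and real clients such as Apache HttpClient ship their own
implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedBodySketch {

    /** Reads one CRLF-terminated line (without the CRLF) as ASCII text. */
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                if (in.read() != '\n') {
                    throw new IOException("expected LF after CR");
                }
                return line.toString("US-ASCII");
            }
            line.write(b);
        }
        throw new EOFException("stream ended before CRLF");
    }

    /** Decodes a chunked body; returns once the zero-length last chunk is seen. */
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            // The chunk size is written as a hexadecimal number on its own line.
            int size = Integer.parseInt(readLine(in).trim(), 16);
            if (size == 0) {
                readLine(in); // empty line that closes the body (no trailers assumed)
                return body.toByteArray();
            }
            byte[] chunk = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(chunk, off, size - off);
                if (n == -1) {
                    throw new EOFException("stream ended inside a chunk");
                }
                off += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // CRLF that follows the chunk data
        }
    }

    public static void main(String[] args) throws IOException {
        String wire = "5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
        byte[] body = decode(new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(new String(body, StandardCharsets.US_ASCII)); // prints "hello world"
    }
}
```

With this framing the reader knows the body is complete as soon as the "0"
chunk and its closing empty line arrive, independent of any length header.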