Posted to dev@solr.apache.org by Alex Deparvu <st...@apache.org> on 2023/04/20 16:41:08 UTC

Http2SolrClient response compression

Hi,

I am trying to figure out how to enable compression on the Http2SolrClient
to understand if it can help or hurt our workflows.

This question is more about the client support for it, or why it's missing
so far. The server side seems to support compression, I tested with curl
'--compressed' and 'Accept-Encoding: gzip, deflate' and it seems to work ok.
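For anyone who wants to repeat the curl check from Java, here is a minimal sketch using the JDK's own java.net.http client rather than SolrJ. The class name, the default URL and the 'techproducts' collection are placeholders, not anything from this thread. It asks only for gzip (not deflate) and decodes the body by hand, since the JDK client does not decompress responses on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.zip.GZIPInputStream;

class CompressionProbe {
    // Build a GET that advertises gzip support, similar to curl --compressed.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept-Encoding", "gzip")
                .GET()
                .build();
    }

    // Return the body unchanged unless the server says it is gzip-encoded.
    static byte[] decodeBody(String contentEncoding, byte[] body) throws IOException {
        if (!"gzip".equalsIgnoreCase(contentEncoding)) {
            return body;
        }
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws Exception {
        // Placeholder URL; pass your own Solr query URL as the first argument.
        String url = args.length > 0
                ? args[0]
                : "http://localhost:8983/solr/techproducts/select?q=*:*";
        HttpResponse<byte[]> rsp = HttpClient.newHttpClient()
                .send(buildRequest(url), HttpResponse.BodyHandlers.ofByteArray());
        String encoding = rsp.headers().firstValue("Content-Encoding").orElse("identity");
        byte[] decoded = decodeBody(encoding, rsp.body());
        System.out.println("Content-Encoding: " + encoding);
        System.out.println("bytes on the wire: " + rsp.body().length
                + ", decoded: " + decoded.length);
    }
}
```

Comparing the two byte counts gives a rough idea of how much a given response shrinks; it says nothing yet about the CPU cost on either side.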


First off, the old and deprecated HttpSolrClient has a flag to enable
compression which seems to inject the correct headers into the request
(Accept-Encoding: gzip, deflate), but I am looking for a non-deprecated way
using the new http2 client.

Also, my observation based on very little benchmarking is that even with
HttpSolrClient and compression enabled, the SolrCloud inter-node query
requests use the http2 client, which does not propagate the request
headers to 'ask' the server for compression. So it seems the only
compressed part is the 'final leg' of the distributed query (final query
result -> client). Is this by design? I can imagine inter-node requests
being very fast and so not needing compression, while the final response
going over the wire to the client is slower, so compression could
potentially bring more benefit there.
As for the benchmark results, they showed no improvement, but I don't want
to spend more time on a deprecated option.
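
The fast-link versus slow-link intuition above can be written down as a small break-even check: compression pays off only when the transfer time saved is larger than the time spent compressing and decompressing. The sketch below is a back-of-the-envelope model, not a benchmark. The class and method names are hypothetical, and every number in main (payload sizes, codec time, link speeds) is made up for illustration.

```java
class CompressionBreakEven {
    // Seconds needed to ship `bytes` over a link that moves `bytesPerSecond`.
    static double transferSeconds(long bytes, double bytesPerSecond) {
        return bytes / bytesPerSecond;
    }

    // True when the transfer time saved exceeds the total codec time
    // (compression on the server plus decompression on the client).
    static boolean compressionPays(long rawBytes, long compressedBytes,
                                   double bytesPerSecond, double codecSeconds) {
        double saved = transferSeconds(rawBytes, bytesPerSecond)
                - transferSeconds(compressedBytes, bytesPerSecond);
        return saved > codecSeconds;
    }

    public static void main(String[] args) {
        long raw = 1_000_000;        // made-up response size in bytes
        long compressed = 400_000;   // made-up size after compression
        double codec = 0.010;        // made-up codec cost: 10 ms in total

        // Roughly a 10 Gbit/s inter-node link: saves about 0.5 ms, costs 10 ms.
        System.out.println("fast link: " + compressionPays(raw, compressed, 1.25e9, codec));
        // Roughly a 10 Mbit/s client link: saves about 480 ms, costs 10 ms.
        System.out.println("slow link: " + compressionPays(raw, compressed, 1.25e6, codec));
    }
}
```

With these invented numbers the fast link prints false and the slow link prints true. Real payload sizes and codec times would have to be measured before drawing any conclusion.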

Now for the main question. The Http2SolrClient has no compression flag
option on the builder and it explicitly sets
req.header(HttpHeader.ACCEPT_ENCODING, null); [0]. Is this by design? Is
compression explicitly disabled for any reason? I tried Jira searches but
could not find anything meaningful.

I would also appreciate any feedback on enabling compression from the wider
community. Is it worth it? Has anyone done some benchmarks to find that
sweet spot where enabling compression makes sense?

best,
alex


[0]
https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/impl/Http2SolrClient.java#L588

Re: Http2SolrClient response compression

Posted by Alex Deparvu <st...@apache.org>.
Thank you Ishan and Mark for chiming in.

From the feedback I received so far it looks like there is very little gain
to be had from enabling compression.
I tried running some tests locally and this fits: I don't see any gains.
Whatever you might save on transfer time, you waste on compression.
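
A rough way to see both sides of that trade-off is to time gzip on a payload and compare the sizes. The sketch below uses only java.util.zip. The class name and the synthetic JSON-like payload are assumptions for illustration; it is not JavaBin and not real Solr output, so the ratio it reports says little about actual query responses. A single timed run after one warm-up is also far from a proper benchmark.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class GzipCost {
    // Compress a byte array with gzip at the default compression level.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    // Synthetic, highly repetitive payload standing in for a response body.
    static byte[] samplePayload(int docs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < docs; i++) {
            sb.append("{\"id\":\"doc-").append(i)
              .append("\",\"title\":\"example title ").append(i).append("\"}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = samplePayload(10_000);
        gzip(raw); // one warm-up pass so the timed run is less dominated by JIT
        long start = System.nanoTime();
        byte[] packed = gzip(raw);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("raw bytes:  " + raw.length);
        System.out.println("gzip bytes: " + packed.length);
        System.out.println("gzip time:  " + micros + " us");
    }
}
```

The bytes saved only matter if the link is slow enough for them to outweigh the measured gzip time, plus the decompression time on the client.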

Together with the question of "is compression worth it", I also had the
question of "why does the http2 client not support this". It looks like one
informs the other: if compression is not useful, then it makes sense not to
expose it as a flag on the client. At least that is how I am reading the
current state.
And to answer my own question, it looks like the http2 client might set
ACCEPT_ENCODING to null because the Jetty client auto-adds the
'accept-encoding:gzip' header by default [0]. I am reading this as someone
having caught this and manually disabled the auto gzip behavior by setting
the header to null.

best,
alex

[0] https://github.com/eclipse/jetty.project/issues/7681



On Fri, Apr 21, 2023 at 7:52 PM Mark Miller <ma...@gmail.com> wrote:

> If you are looking for performance, you probably want to do some tests to
> verify you will get it.
>
> Most of the binary protocols seem to avoid compression beyond what Solr's
> JavaBin does, which is very simple numerical compression, with the idea the
> cost should be small enough to maintain a performance win. The guy that did
> protobufs stopped doing even that in his latest project in the name of
> performance. I think maybe that’s fixed size structures where you can
> almost directly load native types from it though.
>
> My assumption is that it’s going to depend on the protocol and the typical
> data going over that protocol whether it would make any sense or not, so
> would have to test it out.
>

Re: Http2SolrClient response compression

Posted by Mark Miller <ma...@gmail.com>.
If you are looking for performance, you probably want to do some tests to
verify you will get it.

Most of the binary protocols seem to avoid compression beyond what Solr's
JavaBin does, which is very simple numerical compression, with the idea the
cost should be small enough to maintain a performance win. The guy that did
protobufs stopped doing even that in his latest project in the name of
performance. I think maybe that’s fixed size structures where you can
almost directly load native types from it though.
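
To make "very simple numerical compression" concrete, here is a sketch of a variable-length integer encoding, where small values take fewer bytes than a fixed 4-byte int. This is a generic illustration of the technique, written from scratch under the assumption that it resembles what JavaBin does. It is not copied from Solr's JavaBinCodec and the class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;

class VarInt {
    // Write 7 bits per byte, low bits first; the high bit marks "more bytes follow".
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Rebuild the int by reading 7-bit groups until a byte without the high bit.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int v : new int[] {5, 300, 1_000_000}) {
            byte[] enc = encode(v);
            System.out.println(v + " -> " + enc.length + " byte(s), decoded " + decode(enc));
        }
    }
}
```

The values 5, 300 and 1,000,000 encode to 1, 2 and 3 bytes respectively. The cost is a few shifts and masks per number, which is why this kind of encoding is cheap compared to a general-purpose compressor.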

My assumption is that it’s going to depend on the protocol and the typical
data going over that protocol whether it would make any sense or not, so
would have to test it out.

Re: Http2SolrClient response compression

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
My guess is that javabin is already fairly compact, hence doesn't benefit
much from compression. But, would be good to test out the hypothesis, and
would love to be proven wrong here.
