Posted to dev@subversion.apache.org by Ivan Zhakov <iv...@visualsvn.com> on 2012/10/17 16:40:01 UTC

Re: [serf-dev] Double compression over HTTPS

On Fri, Oct 5, 2012 at 1:51 PM, Lieven Govaerts <lg...@apache.org> wrote:
> Hi,
>
> when OpenSSL is built with zlib, it will automatically compress all data
> sent over an SSL connection. You can see this in the initial handshake
> "Client Hello" and "Server Hello" where client and server agree on the
> compression mechanism to be used.
>
> If the data being sent or received is already compressed, OpenSSL will
> compress it a second time. This can happen when already compressed binaries
> like .gif or .zip are sent, or when the server uses gzip encoding for an HTTP
> response. This can have an impact on performance and memory usage. See Paul
> Querna's blog post about this topic in [1]. This has also been mentioned
> before on svn-dev by Justin in [2].
>
> Since OpenSSL 1.0 this automatic compression can be disabled at runtime.
>
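
A minimal sketch of that runtime switch, in Python rather than serf's C and
only as an illustration: Python's ssl module exposes OpenSSL's
SSL_OP_NO_COMPRESSION flag as ssl.OP_NO_COMPRESSION (in C the same flag is
passed to SSL_CTX_set_options). The helper name is made up for this example.

```python
import ssl


def make_context_without_tls_compression():
    """Return a client-side TLS context with TLS-level compression disabled.

    Hypothetical helper for illustration; only the ssl module calls are real.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # OP_NO_COMPRESSION maps to OpenSSL's SSL_OP_NO_COMPRESSION flag.
    ctx.options |= ssl.OP_NO_COMPRESSION
    return ctx


ctx = make_context_without_tls_compression()
print(bool(ctx.options & ssl.OP_NO_COMPRESSION))
```
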
> Compression by OpenSSL has some advantages and disadvantages:
> + OpenSSL compresses the full data stream, so for https that includes all
> headers plus all the small requests and responses that mod_deflate skips.
> + OpenSSL compression is stateful: it does not reset its dictionary between
> responses the way gzip/deflate encoding does, so it reaches a better
> compression ratio when the contents of consecutive requests or responses
> are similar within a 32KB window. Side note: I have tested using a preset
> zlib dictionary for http(s) responses and found it can yield up to 50%
> extra compression.
> - Where content is already compressed by the application layer (e.g. gzip
> encoding or transferring already-compressed binary files), OpenSSL will
> compress it again.
>
>
> I have been doing some small-scale testing to see what difference this all
> makes. My test case was using svn to check out a copy of the Subversion
> trunk from the ASF repository.
>
> I have tested 4 different scenarios:
> 1. As-is setup, OpenSSL compression enabled + gzip encoding enabled. (double
> compression)
> 2. OpenSSL compression disabled + gzip encoding enabled. (compression
> handled by the application)
> 3. OpenSSL compression disabled + gzip encoding disabled. (no compression at
> all)
> 4. OpenSSL compression enabled + gzip encoding disabled (compression handled
> by OpenSSL)
>
> I found this particular scenario too small to see a measurable difference in
> memory or cpu usage, although this is interesting to test further.
>
> The differences in total time are more interesting:
>    | bytes read | bytes written | total time
> 1: | 17.50MB    | 233-284KB     | 59s
> 2: | 18.67MB    | 2.13-2.43MB   | 69s-78s
> 3: | 50.35MB    | 2.34MB        | 103s-108s
> 4: | 15.27MB    | 235-260KB     | 50s-56s
>
> You can see from the above reasoning and my test results that it would be
> beneficial to disable gzip encoding when using https if OpenSSL was built
> with zlib. However, in the scenario where large compressed binary files are
> stored in an svn repository, I suppose disabling both OpenSSL compression
> and gzip encoding will provide the best results.
>
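
The two effects described in the quoted mail can be reproduced with plain
zlib. This is only a rough sketch: the sample responses are made up,
pseudo-random bytes stand in for an already-compressed file, and it does not
measure OpenSSL or mod_deflate themselves.

```python
import random
import zlib

# Made-up stand-ins for a run of small, similar HTTP responses.
responses = [
    ("HTTP/1.1 200 OK\r\nContent-Type: text/xml\r\n\r\n"
     "<D:href>/example/trunk/file%d.c</D:href>" % i).encode("ascii")
    for i in range(200)
]


def per_response_size(chunks):
    """Total size when each chunk is compressed alone (dictionary reset)."""
    return sum(len(zlib.compress(c)) for c in chunks)


def shared_stream_size(chunks):
    """Total size when all chunks pass through one stateful deflate stream."""
    comp = zlib.compressobj()
    total = 0
    for c in chunks:
        # Z_SYNC_FLUSH pushes the chunk out but keeps the 32KB window.
        total += len(comp.compress(c)) + len(comp.flush(zlib.Z_SYNC_FLUSH))
    return total + len(comp.flush())


# Pseudo-random bytes stand in for an already-compressed payload (.zip, .gif).
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

print("per-response:  ", per_response_size(responses))
print("shared stream: ", shared_stream_size(responses))
print("payload:       ", len(payload))
print("payload again: ", len(zlib.compress(payload)))
```

The shared stream should come out well below the per-response total, while
the second pass over the incompressible payload only adds overhead.
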
Also note that Subversion deltas (svndiff) are already compressed. Delta
compression can be disabled using the "SVNCompressionLevel 0" directive
in the Apache configuration file. What I'm thinking about is adding code
to mod_dav_svn to detect a compressed OpenSSL connection and disable
svndiff compression regardless of SVNCompressionLevel. Does that make sense?
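
Roughly, the per-connection decision would look like the sketch below. It is
written in Python only to show the logic, not as mod_dav_svn code; the
function name and arguments are hypothetical. The first argument has the
same shape as what Python's ssl.SSLSocket.compression() returns: a method
name, or None when the connection is not compressed.

```python
def effective_svndiff_level(tls_compression, configured_level):
    """Pick the svndiff compression level for one connection.

    tls_compression: name of the negotiated TLS compression method, or None
        for plain http:// or for TLS without compression.
    configured_level: the value configured via SVNCompressionLevel.
    """
    if tls_compression is not None:
        # The transport already compresses, so skip svndiff compression.
        return 0
    return configured_level


# The method name string here is only an example value.
print(effective_svndiff_level("zlib compression", 5))
print(effective_svndiff_level(None, 5))
```

Because the decision is made per connection, an uncompressed connection to
the same server keeps the configured level.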

-- 
Ivan Zhakov

Re: [serf-dev] Double compression over HTTPS

Posted by Stefan Fuhrmann <st...@wandisco.com>.
On Wed, Oct 17, 2012 at 4:40 PM, Ivan Zhakov <iv...@visualsvn.com> wrote:

> Also note that Subversion deltas (svndiff) are already compressed. Delta
> compression can be disabled using the "SVNCompressionLevel 0" directive
> in the Apache configuration file. What I'm thinking about is adding code
> to mod_dav_svn to detect a compressed OpenSSL connection and disable
> svndiff compression regardless of SVNCompressionLevel. Does that make sense?
>

Yes, it does. Maybe test that compression is still used over
http:// while another client uses https:// on the same
repo and server.

--Stefan^2.
