Posted to batik-dev@xmlgraphics.apache.org by David Vernet <dc...@gmail.com> on 2016/11/14 00:44:17 UTC

Patch suggestion for 4.8x throughput improvement for PNG encoding

Hello,

I am a researcher in Java code optimization, and in my research I have
found an optimization for Batik (version 1.7) that provides a 4.8x
throughput improvement according to the DaCapo Benchmark Suite
<http://dacapobench.org/>.

The optimization is obtained by changing this line
<http://svn.apache.org/viewvc/xmlgraphics/batik/trunk/batik-codec/src/main/java/org/apache/batik/ext/awt/image/codec/png/PNGImageEncoder.java?view=markup#l451>
in the PNGImageEncoder class:
private void writeIDAT() throws IOException {
        IDATOutputStream ios = new IDATOutputStream(dataOutput, 8192);
        // This line is the bottleneck.
        DeflaterOutputStream dos =
            new DeflaterOutputStream(ios, new Deflater(9));

By specifying a deflation/compression level of 1 instead of 9, throughput
is improved by 4.8x when run on the DaCapo benchmark suite (on my six-core
Xeon E5-2620 v3 machine with a 15 MB L3 cache and 16 GB of RAM). It appears
that deflating the image data at the maximum compression level was a major
bottleneck in the system; my own profiler, as well as Xprof and Hprof,
identified this bottleneck.
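
Concretely, the proposed patch is just that one line (sketched against 1.7;
Deflater.BEST_SPEED is simply the named constant for level 1):

private void writeIDAT() throws IOException {
        IDATOutputStream ios = new IDATOutputStream(dataOutput, 8192);
        // Deflater.BEST_SPEED == 1; the current code passes 9 (Deflater.BEST_COMPRESSION).
        DeflaterOutputStream dos =
            new DeflaterOutputStream(ios, new Deflater(Deflater.BEST_SPEED));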

I have not tried running the benchmark on the latest code release (I ran
it on 1.7), but I see that the compression level is still set to 9 even on
the latest branch.

I have no experience using Batik itself, but I thought you would be
interested in hearing about this very simple and potentially significant
performance improvement for what seems like an important feature in the
library. Please let me know if you would like me to re-run the benchmark
against the latest codepath, and I will report the results.

Regards,

-- 
David Vernet
Master of Science in Computer Science Student
Carnegie Mellon University
Class of 2016

Re: Patch suggestion for 4.8x throughput improvement for PNG encoding

Posted by David Vernet <dc...@gmail.com>.
> I think that if we are generating PNG images for later use then using
> the value 9 makes sense.

Sounds good. Just so you're aware, however, there is an "inflection point":
a compression level of 7 still gives you a speedup of about 3.33x, while
level 8 gives you roughly a 2x speedup. I don't know exactly what the
compression ratio differences are between levels 7, 8, and 9, but if those
differences are small, the benefit of the lower latency may outweigh the
benefit of the extra compression.
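
If it would help the decision, here is a rough, standalone sketch (not
Batik-specific; the input below is synthetic and highly compressible, so
the absolute numbers will not match the DaCapo results) of how one might
measure the time/size tradeoff at each level:

import java.util.zip.Deflater;

public class DeflateLevelSweep {
    public static void main(String[] args) {
        // Synthetic sample data; real PNG scanline data will behave differently.
        byte[] input = new byte[16 * 1024 * 1024];
        for (int i = 0; i < input.length; i++) {
            input[i] = (byte) (i % 251);
        }
        byte[] buffer = new byte[64 * 1024];

        for (int level = 1; level <= 9; level++) {
            Deflater deflater = new Deflater(level);
            deflater.setInput(input);
            deflater.finish();
            long start = System.nanoTime();
            long compressedSize = 0;
            while (!deflater.finished()) {
                compressedSize += deflater.deflate(buffer);
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            deflater.end();
            System.out.printf("level %d: %d ms, %d bytes%n",
                              level, elapsedMs, compressedSize);
        }
    }
}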


Re: Patch suggestion for 4.8x throughput improvement for PNG encoding

Posted by Glenn Adams <gl...@skynav.com>.
On Mon, Nov 14, 2016 at 1:29 AM, Luis Bernardo <lm...@gmail.com>
wrote:

> Compression is preferred for archival and packaging purposes (for instance
> Debian packages require info documents to use compression 9). I think that
> if we are generating PNG images for later use then using the value 9 makes
> sense. Probably this should be configurable with the current value as the
> default.
>

+1



Re: Patch suggestion for 4.8x throughput improvement for PNG encoding

Posted by Luis Bernardo <lm...@gmail.com>.
Compression is preferred for archival and packaging purposes (for instance
Debian packages require info documents to use compression 9). I think that
if we are generating PNG images for later use then using the value 9 makes
sense. Probably this should be configurable with the current value as the
default.
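
As a rough illustration only (the property name and helper are hypothetical,
not an existing Batik setting), configurability could look something like
this inside PNGImageEncoder:

// Hypothetical: PNGImageEncoder does not expose this today; the property
// name and the default of 9 are illustrative only.
private static int getDeflateLevel() {
    String value = System.getProperty("org.apache.batik.png.deflateLevel", "9");
    try {
        int level = Integer.parseInt(value);
        return (level >= 0 && level <= 9) ? level : 9;
    } catch (NumberFormatException e) {
        return 9;
    }
}

private void writeIDAT() throws IOException {
    IDATOutputStream ios = new IDATOutputStream(dataOutput, 8192);
    DeflaterOutputStream dos =
        new DeflaterOutputStream(ios, new Deflater(getDeflateLevel()));
    // ... rest of writeIDAT unchanged ...
}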


Re: Patch suggestion for 4.8x throughput improvement for PNG encoding

Posted by David Vernet <dc...@gmail.com>.
The tradeoff between lower and higher compression levels is compression
time versus file size: the higher the level, the smaller the file but the
longer it takes to compress, and vice versa. In zlib land (which is what
the Java Deflater class uses), the compression levels are:

#define Z_NO_COMPRESSION         0
#define Z_BEST_SPEED             1
#define Z_BEST_COMPRESSION       9
#define Z_DEFAULT_COMPRESSION  (-1)

so changing the compression level to 1 optimizes for speed over file size.
I suppose which level to use depends on the size of the file being encoded
and the memory bandwidth of the system.
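
For reference, java.util.zip.Deflater exposes the same levels as named
constants, so a patch could use the constant rather than a bare literal:

import java.util.zip.Deflater;

public class DeflaterLevels {
    public static void main(String[] args) {
        // The Java constants mirror the zlib #defines above.
        System.out.println(Deflater.NO_COMPRESSION);      // 0
        System.out.println(Deflater.BEST_SPEED);          // 1
        System.out.println(Deflater.BEST_COMPRESSION);    // 9
        System.out.println(Deflater.DEFAULT_COMPRESSION); // -1
    }
}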

I cannot speak to the reason it was originally set to 9, though I imagine
it was either because optimizing for speed wasn't tried, or because, when
that patch was first pushed, memory I/O was still the bottleneck in the
system. Unfortunately, the commit associated with that change is enormous
and has no comments:
https://github.com/apache/batik/commit/aab0f40ddfab5b5dea926b5b88a8b11803dfa4b6.
I can't imagine any scenario where compression would be preferred over
speed unless you're shipping the image over a slow network (though, to be
honest, I don't know enough about Batik to know if that's even possible or
relevant -- my experience is limited to the DaCapo benchmark).

On Sun, Nov 13, 2016 at 11:51 PM Glenn Adams <gl...@skynav.com> wrote:

> Can you comment on the functional differences (if any) between using a
> deflation metric of 1 or 9? Care to speculate why it is set to 9 at present?
>

Re: Patch suggestion for 4.8x throughput improvement for PNG encoding

Posted by Glenn Adams <gl...@skynav.com>.
Can you comment on the functional differences (if any) between using a
deflation metric of 1 or 9? Care to speculate why it is set to 9 at present?
