You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@maven.apache.org by STEFAN REICH <sr...@mac.com.INVALID> on 2020/08/16 04:40:11 UTC

Why is old IO API used in maven resolver?

Hi there!

I am working on a very large code base, and build performance issues made me look at the maven-resolver source code. In terms of File usages, there are a lot of InputStreams being copied around using ByteBuffer, instead of using FileChannel.transferTo. Affected classes are DefaultFileProcessor, ChecksumCalculator, WagonTransporter, AbstractTransporter and potentially more. Was it a conscious decision to use this pattern over the more efficient transferTo? Would you accept a PR with more modern NIO API that still works with JDK 7?

Here are the throughput results from a JMH benchmark, copying a 22MB file around using the pattern currently used in maven, and transferTo, as measured on macOS with JDK 11 on an SSD.

Result "MyBenchmark.resolverCopy":
  291.362 ±(99.9%) 5.443 ops/s [Average]
  (min, avg, max) = (276.923, 291.362, 302.911), stdev = 7.266
  CI (99.9%): [285.919, 296.804] (assumes normal distribution)

Result "MyBenchmark.transferTo":
  325.188 ±(99.9%) 8.838 ops/s [Average]
  (min, avg, max) = (306.978, 325.188, 355.784), stdev = 11.799
  CI (99.9%): [316.350, 334.026] (assumes normal distribution)


Thanks!
Stefan

Re: Why is old IO API used in maven resolver?

Posted by Michael Osipov <mi...@apache.org>.
Am 2020-08-16 um 06:40 schrieb STEFAN REICH:
> Hi there!
> 
> I am working on a very large code base, and build performance issues made me look at the maven-resolver source code. In terms of File usages, there are a lot of InputStreams being copied around using ByteBuffer, instead of using FileChannel.transferTo. Affected classes are DefaultFileProcessor, ChecksumCalculator, WagonTransporter, AbstractTransporter and potentially more. Was it a conscious decision to use this pattern over the more efficient transferTo? Would you accept a PR with more modern NIO API that still works with JDK 7?

Hi Stefan,

I have been working on Resolver for the last couple of releases. Also on 
Wagon -- which goes hand in hand with Resolver. I don't think that there 
was a decision to go a suboptimal route. No one just stood up to make it 
better.

Please start with Wagon first, because Resolver sits on top of Wagon 
which is low level.

> Here are the throughput results from a JMH benchmark, copying a 22MB file around using the pattern currently used in maven, and transferTo, as measured on macOS with JDK 11 on an SSD.
> 
> Result "MyBenchmark.resolverCopy":
>    291.362 ±(99.9%) 5.443 ops/s [Average]
>    (min, avg, max) = (276.923, 291.362, 302.911), stdev = 7.266
>    CI (99.9%): [285.919, 296.804] (assumes normal distribution)
> 
> Result "MyBenchmark.transferTo":
>    325.188 ±(99.9%) 8.838 ops/s [Average]
>    (min, avg, max) = (306.978, 325.188, 355.784), stdev = 11.799
>    CI (99.9%): [316.350, 334.026] (assumes normal distribution)

Please provide your clean room tests, I have four different operating 
systems and JDKs to test.

I will happily accept every PR which makes Resolver and Wagon better as 
long as the PRs are logically decoupled and wellreasoned.

Michael


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


AW: Why is old IO API used in maven resolver?

Posted by Markus KARG <ma...@headcrashing.eu>.
I used Files.copy a lot in the real world with different sizes, and it was always superior to old-school solutions, but I did not benchmark it with multi-GiB files.
-Markus

-----Ursprüngliche Nachricht-----
Von: Romain Manni-Bucau [mailto:rmannibucau@gmail.com] 
Gesendet: Mittwoch, 19. August 2020 15:08
An: Maven Developers List
Betreff: Re: Why is old IO API used in maven resolver?

Did you test the newer Files.copy?

It seems to use a plain char[] forwarding impl too ([1]) vs the sendfile64
of the channel impl ([2]).
Also did you try with some Gigs files? IIRC sendfile had a limitation of a
few gigas - and we sadly have artifacts of > 4G even if rare.
That said if the size is available it can be tested and the optimization
used quite easily I guess.

[1]
https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/unix/native/libnio/fs/UnixCopyFile.c#L53
[2]
https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/unix/native/libnio/ch/FileChannelImpl.c#L135

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


Le mer. 19 août 2020 à 14:35, Markus KARG <ma...@headcrashing.eu> a écrit :

> I do not support your judgement. That bug you mention was fixed nine years
> ago, so even the most exotic platforms should have adopted the fix
> meanwhile. This IS reliable and not more nor less buggy than any other part
> of the JRE. What we should do is give intensive tensting of Stefan's PR,
> but not overhastily abstain from it due to FUD. As a long time user of NIO
> API on Windows and Linux (speaking of years here) I strongly believe that
> the performance benefits will definitively outweigh the potential risks.
> -Markus
>
> -----Ursprüngliche Nachricht-----
> Von: Jochen Wiedmann [mailto:jochen.wiedmann@gmail.com]
> Gesendet: Mittwoch, 19. August 2020 13:05
> An: Maven Developers List
> Betreff: Re: Why is old IO API used in maven resolver?
>
> On Wed, Aug 19, 2020 at 11:49 AM Michael Osipov <mi...@apache.org>
> wrote:
> >
> > Am 2020-08-19 um 11:27 schrieb Jochen Wiedmann:
> > > Looking at
> > >
> > >      https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107
> > >
> > > I would very much like us to abstain from every attempt to go into the
> > > direction, that Stefan suggests. The consequence could be, that we
> > > need to bother with exotic corner cases. Let's stay simple, but
> > > reliable.
> >
> >
> > You proposal is not to change the code at all?
>
> If it's working fine: Yes.
>
> --
>
> Look, that's why there's rules, understand? So that you think before
> you break 'em.
>
>     -- (Terry Pratchett, Thief of Time)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Why is old IO API used in maven resolver?

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Did you test the newer Files.copy?

It seems to use a plain char[] forwarding impl too ([1]) vs the sendfile64
of the channel impl ([2]).
Also did you try with some Gigs files? IIRC sendfile had a limitation of a
few gigas - and we sadly have artifacts of > 4G even if rare.
That said if the size is available it can be tested and the optimization
used quite easily I guess.

[1]
https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/unix/native/libnio/fs/UnixCopyFile.c#L53
[2]
https://github.com/AdoptOpenJDK/openjdk-jdk11/blob/master/src/java.base/unix/native/libnio/ch/FileChannelImpl.c#L135

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


Le mer. 19 août 2020 à 14:35, Markus KARG <ma...@headcrashing.eu> a écrit :

> I do not support your judgement. That bug you mention was fixed nine years
> ago, so even the most exotic platforms should have adopted the fix
> meanwhile. This IS reliable and not more nor less buggy than any other part
> of the JRE. What we should do is give intensive tensting of Stefan's PR,
> but not overhastily abstain from it due to FUD. As a long time user of NIO
> API on Windows and Linux (speaking of years here) I strongly believe that
> the performance benefits will definitively outweigh the potential risks.
> -Markus
>
> -----Ursprüngliche Nachricht-----
> Von: Jochen Wiedmann [mailto:jochen.wiedmann@gmail.com]
> Gesendet: Mittwoch, 19. August 2020 13:05
> An: Maven Developers List
> Betreff: Re: Why is old IO API used in maven resolver?
>
> On Wed, Aug 19, 2020 at 11:49 AM Michael Osipov <mi...@apache.org>
> wrote:
> >
> > Am 2020-08-19 um 11:27 schrieb Jochen Wiedmann:
> > > Looking at
> > >
> > >      https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107
> > >
> > > I would very much like us to abstain from every attempt to go into the
> > > direction, that Stefan suggests. The consequence could be, that we
> > > need to bother with exotic corner cases. Let's stay simple, but
> > > reliable.
> >
> >
> > You proposal is not to change the code at all?
>
> If it's working fine: Yes.
>
> --
>
> Look, that's why there's rules, understand? So that you think before
> you break 'em.
>
>     -- (Terry Pratchett, Thief of Time)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
> For additional commands, e-mail: dev-help@maven.apache.org
>
>

AW: Why is old IO API used in maven resolver?

Posted by Markus KARG <ma...@headcrashing.eu>.
I do not support your judgement. That bug you mention was fixed nine years ago, so even the most exotic platforms should have adopted the fix meanwhile. This IS reliable and not more nor less buggy than any other part of the JRE. What we should do is give intensive tensting of Stefan's PR, but not overhastily abstain from it due to FUD. As a long time user of NIO API on Windows and Linux (speaking of years here) I strongly believe that the performance benefits will definitively outweigh the potential risks.
-Markus

-----Ursprüngliche Nachricht-----
Von: Jochen Wiedmann [mailto:jochen.wiedmann@gmail.com] 
Gesendet: Mittwoch, 19. August 2020 13:05
An: Maven Developers List
Betreff: Re: Why is old IO API used in maven resolver?

On Wed, Aug 19, 2020 at 11:49 AM Michael Osipov <mi...@apache.org> wrote:
>
> Am 2020-08-19 um 11:27 schrieb Jochen Wiedmann:
> > Looking at
> >
> >      https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107
> >
> > I would very much like us to abstain from every attempt to go into the
> > direction, that Stefan suggests. The consequence could be, that we
> > need to bother with exotic corner cases. Let's stay simple, but
> > reliable.
>
>
> You proposal is not to change the code at all?

If it's working fine: Yes.

-- 

Look, that's why there's rules, understand? So that you think before
you break 'em.

    -- (Terry Pratchett, Thief of Time)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Why is old IO API used in maven resolver?

Posted by Jochen Wiedmann <jo...@gmail.com>.
On Wed, Aug 19, 2020 at 11:49 AM Michael Osipov <mi...@apache.org> wrote:
>
> Am 2020-08-19 um 11:27 schrieb Jochen Wiedmann:
> > Looking at
> >
> >      https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107
> >
> > I would very much like us to abstain from every attempt to go into the
> > direction, that Stefan suggests. The consequence could be, that we
> > need to bother with exotic corner cases. Let's stay simple, but
> > reliable.
>
>
> You proposal is not to change the code at all?

If it's working fine: Yes.

-- 

Look, that's why there's rules, understand? So that you think before
you break 'em.

    -- (Terry Pratchett, Thief of Time)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Why is old IO API used in maven resolver?

Posted by Michael Osipov <mi...@apache.org>.
Am 2020-08-19 um 11:27 schrieb Jochen Wiedmann:
> Looking at
> 
>      https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107
> 
> I would very much like us to abstain from every attempt to go into the
> direction, that Stefan suggests. The consequence could be, that we
> need to bother with exotic corner cases. Let's stay simple, but
> reliable.


You proposal is not to change the code at all?



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Why is old IO API used in maven resolver?

Posted by Jochen Wiedmann <jo...@gmail.com>.
Looking at

    https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6822107

I would very much like us to abstain from every attempt to go into the
direction, that Stefan suggests. The consequence could be, that we
need to bother with exotic corner cases. Let's stay simple, but
reliable.

Jochen



On Sun, Aug 16, 2020 at 10:07 AM STEFAN REICH <sr...@mac.com.invalid> wrote:
>
> Hi there!
>
> I am working on a very large code base, and build performance issues made me look at the maven-resolver source code. In terms of File usages, there are a lot of InputStreams being copied around using ByteBuffer, instead of using FileChannel.transferTo. Affected classes are DefaultFileProcessor, ChecksumCalculator, WagonTransporter, AbstractTransporter and potentially more. Was it a conscious decision to use this pattern over the more efficient transferTo? Would you accept a PR with more modern NIO API that still works with JDK 7?
>
> Here are the throughput results from a JMH benchmark, copying a 22MB file around using the pattern currently used in maven, and transferTo, as measured on macOS with JDK 11 on an SSD.
>
> Result "MyBenchmark.resolverCopy":
>   291.362 ±(99.9%) 5.443 ops/s [Average]
>   (min, avg, max) = (276.923, 291.362, 302.911), stdev = 7.266
>   CI (99.9%): [285.919, 296.804] (assumes normal distribution)
>
> Result "MyBenchmark.transferTo":
>   325.188 ±(99.9%) 8.838 ops/s [Average]
>   (min, avg, max) = (306.978, 325.188, 355.784), stdev = 11.799
>   CI (99.9%): [316.350, 334.026] (assumes normal distribution)
>
>
> Thanks!
> Stefan



-- 

Look, that's why there's rules, understand? So that you think before
you break 'em.

    -- (Terry Pratchett, Thief of Time)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: Why is old IO API used in maven resolver?

Posted by Xeno Amess <xe...@gmail.com>.
Hi.
I wanna know how much it improve.
could you please also run a corresponding jmh bencark for the old codes?

Olivier Lamy <ol...@apache.org> 于 2020年8月16日周日 下午4:36写道:

> On Sun, 16 Aug 2020 at 16:07, STEFAN REICH <sr...@mac.com.invalid> wrote:
>
> > Hi there!
> >
> > I am working on a very large code base, and build performance issues made
> > me look at the maven-resolver source code. In terms of File usages, there
> > are a lot of InputStreams being copied around using ByteBuffer, instead
> of
> > using FileChannel.transferTo. Affected classes are DefaultFileProcessor,
> > ChecksumCalculator, WagonTransporter, AbstractTransporter and potentially
> > more. Was it a conscious decision to use this pattern over the more
> > efficient transferTo? Would you accept a PR with more modern NIO API that
> > still works with JDK 7?
> >
>
> yes please.
>
>
> > Here are the throughput results from a JMH benchmark, copying a 22MB file
> > around using the pattern currently used in maven, and transferTo, as
> > measured on macOS with JDK 11 on an SSD.
> >
> > Result "MyBenchmark.resolverCopy":
> >   291.362 ±(99.9%) 5.443 ops/s [Average]
> >   (min, avg, max) = (276.923, 291.362, 302.911), stdev = 7.266
> >   CI (99.9%): [285.919, 296.804] (assumes normal distribution)
> >
> > Result "MyBenchmark.transferTo":
> >   325.188 ±(99.9%) 8.838 ops/s [Average]
> >   (min, avg, max) = (306.978, 325.188, 355.784), stdev = 11.799
> >   CI (99.9%): [316.350, 334.026] (assumes normal distribution)
> >
> >
> sounds like a good result.
> Will it be the same for all OS? (windows, linux. osx)
>
>
>
> >
> > Thanks!
> > Stefan
>
>
>
> --
> Olivier Lamy
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>

Re: Why is old IO API used in maven resolver?

Posted by Olivier Lamy <ol...@apache.org>.
On Sun, 16 Aug 2020 at 16:07, STEFAN REICH <sr...@mac.com.invalid> wrote:

> Hi there!
>
> I am working on a very large code base, and build performance issues made
> me look at the maven-resolver source code. In terms of File usages, there
> are a lot of InputStreams being copied around using ByteBuffer, instead of
> using FileChannel.transferTo. Affected classes are DefaultFileProcessor,
> ChecksumCalculator, WagonTransporter, AbstractTransporter and potentially
> more. Was it a conscious decision to use this pattern over the more
> efficient transferTo? Would you accept a PR with more modern NIO API that
> still works with JDK 7?
>

yes please.


> Here are the throughput results from a JMH benchmark, copying a 22MB file
> around using the pattern currently used in maven, and transferTo, as
> measured on macOS with JDK 11 on an SSD.
>
> Result "MyBenchmark.resolverCopy":
>   291.362 ±(99.9%) 5.443 ops/s [Average]
>   (min, avg, max) = (276.923, 291.362, 302.911), stdev = 7.266
>   CI (99.9%): [285.919, 296.804] (assumes normal distribution)
>
> Result "MyBenchmark.transferTo":
>   325.188 ±(99.9%) 8.838 ops/s [Average]
>   (min, avg, max) = (306.978, 325.188, 355.784), stdev = 11.799
>   CI (99.9%): [316.350, 334.026] (assumes normal distribution)
>
>
sounds like a good result.
Will it be the same for all OS? (windows, linux. osx)



>
> Thanks!
> Stefan



-- 
Olivier Lamy
http://twitter.com/olamy | http://linkedin.com/in/olamy