Posted to dev@nifi.apache.org by Andy LoPresto <al...@apache.org> on 2018/06/26 03:34:16 UTC

[DISCUSS] Tar + Gzip vs. Zip

Hi folks,

I do not want to start a long-running argument or entrenched battle. However, having just performed the RM duties for the latest release, I believe I have identified a resource inefficiency in the fact that we generate, upload, host, and distribute two compressed archives of the binary which are functionally equivalent. For 1.7.0, both the .tar.gz and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs. 1_224_392_000 bytes for zip). The time to build and sign these is substantial, but the true cost comes in uploading and hosting them. While the fabled extension registry will save all of us from this burden, it isn’t arriving tomorrow, and I think we could drastically improve this before the next release.

I have no personal preference between the two formats. In earlier days, there were platform inconsistencies and the tools weren’t available on all systems, but now they are pretty standard for all users. This [1] is an interesting article I found which had some good info on the origins, and here are some additional resources for anyone interested [2][3]. I don’t care which we pick, but I propose removing one of the options for the build going forward (toolkit as well).

That said, if someone has a good reason that both are necessary, I would love to hear it. I didn’t find anything in the Apache Release Policy [4] that states we must offer both, but maybe I missed it. Thanks.

[1] https://itsfoss.com/tar-vs-zip-vs-gz/
[2] https://superuser.com/a/1257441/40003
[3] https://superuser.com/a/173995/40003
[4] https://www.apache.org/legal/release-policy.html#artifacts


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69


Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Peter Wilcsinszky <pe...@gmail.com>.
Hi!

I've created a PR to include the toolkit in the NiFi Docker image and also
to add the changes discussed in this topic:
- use a multistage build to avoid doubling the image size
- use Zip instead of Tar+Gzip

https://issues.apache.org/jira/browse/NIFI-5468
https://github.com/apache/nifi/pull/2921

Cheers,
Peter

On Fri, Jun 29, 2018 at 1:41 PM Peter Wilcsinszky <
peterwilcsinszky@gmail.com> wrote:

> Yes, I mean with this (multistage build) we cannot get rid of the two
> separate modules (maven and dockerhub) but we can get rid of the ADD
> instruction which I think has the benefit of making the build clearer and
> more explicit as well.

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Peter Wilcsinszky <pe...@gmail.com>.
Yes. I mean that with a multistage build we cannot get rid of the two
separate modules (Maven and Docker Hub), but we can get rid of the ADD
instruction, which I think also has the benefit of making the build clearer
and more explicit.

On Fri, Jun 29, 2018 at 1:23 PM Aldrin Piri <al...@gmail.com> wrote:

> Hi Peter,
>
> I remember seeing this but the criteria about working only on Mac and
> Windows makes it a challenge, in my opinion.
>
> I also need to apologize as I certainly confused the Dockerfiles between
> the Maven plugin and the Docker Hub.  My prior email should have been
> directed toward the Maven scenario as that is using the ADD.  Docker Hub
> will just require an updating of the curl command to the .zip extension and
> we should be set.  Regardless, Andy, when you make the issue for this
> change feel free to create a subtask of that to update the Dockerfiles.
> Looks like Peter is up to the task but I am also happy to help make the
> adjustments and verify.  The first linked item you provided is the
> multistage approach mentioned.  Multistage builds allow you to effectively
> create throw away images only selecting specific artifacts from them to use
> in a new image.
>
> Thanks!
> --aldrin

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Aldrin Piri <al...@gmail.com>.
Hi Peter,

I remember seeing this, but the criterion about working only on Mac and
Windows makes it a challenge, in my opinion.

I also need to apologize, as I certainly confused the Dockerfiles between
the Maven plugin and Docker Hub.  My prior email should have been
directed toward the Maven scenario, as that is the one using ADD.  Docker
Hub will just require updating the curl command to the .zip extension and
we should be set.  Regardless, Andy, when you make the issue for this
change, feel free to create a subtask of it to update the Dockerfiles.
Looks like Peter is up to the task, but I am also happy to help make the
adjustments and verify.  The first linked item you provided is the
multistage approach mentioned.  Multistage builds allow you to effectively
create throwaway images, selecting only specific artifacts from them to use
in a new image.

Thanks!
--aldrin
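
A minimal sketch of the multistage approach described above, for
illustration only. The base images (alpine:3.8, openjdk:8-jre), the
download URL, the stage name "unpack", and the /opt/nifi paths are
assumptions and are not taken from the project's actual Dockerfiles. It
also assumes the zip unpacks to a top-level nifi-<version> directory.
Multistage builds require Docker 17.05 or later.

```dockerfile
# Stage 1: throwaway image that downloads and unpacks the zip.
# Nothing from this stage reaches the final image unless it is copied explicitly.
FROM alpine:3.8 AS unpack

# Assumed version; override with --build-arg NIFI_VERSION=...
ARG NIFI_VERSION=1.7.0

# The URL below is an assumed archive location, shown for illustration.
RUN apk add --no-cache curl unzip \
    && curl -fSL "https://archive.apache.org/dist/nifi/${NIFI_VERSION}/nifi-${NIFI_VERSION}-bin.zip" \
         -o /tmp/nifi-bin.zip \
    && unzip -q /tmp/nifi-bin.zip -d /opt \
    && rm /tmp/nifi-bin.zip

# Stage 2: the image that is actually published.
FROM openjdk:8-jre

# ARG values do not carry across stages, so the version is declared again.
ARG NIFI_VERSION=1.7.0

# Select only the unpacked directory from the first stage.
# Assumes the archive's top-level directory is nifi-<version>.
COPY --from=unpack /opt/nifi-${NIFI_VERSION} /opt/nifi
```

Only what COPY --from selects ends up in the final image; the downloaded
zip and the curl/unzip tooling stay behind in the discarded first stage,
so the archive is not stored alongside its unpacked contents.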

On Fri, Jun 29, 2018 at 7:11 AM Peter Wilcsinszky <
peterwilcsinszky@gmail.com> wrote:

> Hi,
>
> I wrote about a different solution for which I implemented a PoC for in
>
> https://lists.apache.org/thread.html/6122674030b8f99a63d586dcdbdaf6b31841572aed63fcc9dcfb5eea@%3Cdev.nifi.apache.org%3E
> but multistage build could be a better option and I'm happy to create an
> issue and fix it for the next release.

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Peter Wilcsinszky <pe...@gmail.com>.
Hi,

I wrote about a different solution, for which I implemented a PoC, in
https://lists.apache.org/thread.html/6122674030b8f99a63d586dcdbdaf6b31841572aed63fcc9dcfb5eea@%3Cdev.nifi.apache.org%3E
but a multistage build could be a better option, and I'm happy to create an
issue and fix it for the next release.

On Fri, Jun 29, 2018 at 3:42 AM Andy LoPresto <al...@apache.org> wrote:

> Thanks Aldrin. I am not knowledgeable on Docker — do either of these
> options help us? We could also use a RUN to curl the Zip resource and COPY
> the unzipped directory?
>
> [1] https://github.com/moby/moby/issues/15036#issuecomment-322177465
> [2] https://github.com/jlhawn/dockramp
>
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Andy LoPresto <al...@apache.org>.
Thanks, Aldrin. I am not knowledgeable about Docker. Does either of these options [1][2] help us? Could we also use a RUN to curl the Zip resource and then COPY the unzipped directory?

[1] https://github.com/moby/moby/issues/15036#issuecomment-322177465
[2] https://github.com/jlhawn/dockramp


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jun 28, 2018, at 6:22 PM, Aldrin Piri <al...@gmail.com> wrote:
> 
> Be mindful to also update the Dockerfile used for Docker Hub as this will
> require some adjustments.  Unfortunately, the ADD instruction does not
> support zip files.  This isn't a major inconvenience but will require a
> multi-stage build to help keep our image size svelte.  I believe we should
> be safe as we have been publishing both tarballs and zips for prior
> releases, so the Dockerfile should still work in that scenario.
> 
> On Wed, Jun 27, 2018 at 4:06 PM Andy LoPresto <al...@apache.org> wrote:
> 
>> Thanks for everyone’s input. It seems to be a clear consensus to eliminate
>> .tar.gz and only provide .zip moving forward. I’d like to keep this
>> discussion thread going for another day or two to field any objections.
>> After that time (Friday-ish), I’ll create a Jira to do this unless things
>> change.
>> 
>> I will probably keep the possibility to generate the .tar.gz through an
>> inactive profile to allow people who need that offering to use it. There
>> will be a subtask Jira to update the release guide moving forward as well.
>> 
>> 
>> Andy LoPresto
>> alopresto@apache.org
>> alopresto.apache@gmail.com
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>> 
>> On Jun 26, 2018, at 7:52 PM, James Wing <jv...@gmail.com> wrote:
>> 
>> It's a great idea, Andy, I strongly support just one format.  I think Zip
>> is a good choice.
>> 
>> On Tue, Jun 26, 2018 at 11:16 AM Otto Fowler <ot...@gmail.com>
>> wrote:
>> 
>> I end up using zip all the time.  zip +1
>> 
>> 
>> On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:
>> 
>> My preference is zip.
>> 
>> On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:
>> 
>> 
>> 
>> On 6/25/18 11:34 PM, Andy LoPresto wrote:
>> 
>> Hi folks,
>> 
>> I do not want to start a long-running argument or entrenched battle.
>> However, having just performed the RM duties for the latest release, I
>> believe I have identified a resource inefficiency in the fact that we
>> generate, upload, host, and distribute two compressed archives of the
>> binary which are functionally equivalent. For 1.7.0, both the .tar.gz
>> and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
>> 1_224_392_000 bytes for zip). The time to build and sign these is
>> substantial, but the true cost comes in uploading and hosting them.
>> While the fabled extension registry will save all of us from this
>> burden, it isn’t arriving tomorrow, and I think we could drastically
>> improve this before the next release.
>> 
>> I have no personal preference between the two formats. In earlier days,
>> there were platform inconsistencies and the tools weren’t available on
>> all systems, but now they are pretty standard for all users. This [1] is
>> an interesting article I found which had some good info on the origins,
>> and here are some additional resources for anyone interested [2][3]. I
>> don’t care which we pick, but I propose removing one of the options for
>> the build going forward (toolkit as well).
>> 
>> That said, if someone has a good reason that both are necessary, I would
>> love to hear it. I didn’t find anything on the Apache Release Policy
>> which stated we must offer both, but maybe I missed it. Thanks.
>> 
>> 
>> I'm not aware of any ASF policy. I think it mostly stems from default
>> convention you get out of the maven-assembly-plugin.
>> 
>> [1] https://itsfoss.com/tar-vs-zip-vs-gz/
>> [2] https://superuser.com/a/1257441/40003
>> [3] https://superuser.com/a/173995/40003
>> [4] https://www.apache.org/legal/release-policy.html#artifacts
>> 
>> 
>> Andy LoPresto
>> alopresto@apache.org
>> alopresto.apache@gmail.com
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>> 
>> 
>> 
>> 
>> 


Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Aldrin Piri <al...@gmail.com>.
Be mindful to also update the Dockerfile used for Docker Hub, as this will
require some adjustments.  Unfortunately, the ADD instruction does not
unpack zip files the way it unpacks local tar archives.  This isn't a major
inconvenience, but it will require a multi-stage build to help keep our
image size svelte.  I believe we should be safe, as we have been publishing
both tarballs and zips for prior releases, so the Dockerfile should still
work in that scenario.

On Wed, Jun 27, 2018 at 4:06 PM Andy LoPresto <al...@apache.org> wrote:

> Thanks for everyone’s input. It seems to be a clear consensus to eliminate
> .tar.gz and only provide .zip moving forward. I’d like to keep this
> discussion thread going for another day or two to field any objections.
> After that time (Friday-ish), I’ll create a Jira to do this unless things
> change.
>
> I will probably keep the possibility to generate the .tar.gz through an
> inactive profile to allow people who need that offering to use it. There
> will be a subtask Jira to update the release guide moving forward as well.
>
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Jun 26, 2018, at 7:52 PM, James Wing <jv...@gmail.com> wrote:
>
> It's a great idea, Andy, I strongly support just one format.  I think Zip
> is a good choice.
>
> On Tue, Jun 26, 2018 at 11:16 AM Otto Fowler <ot...@gmail.com>
> wrote:
>
> I end up using zip all the time.  zip +1
>
>
> On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:
>
> My preference is zip.
>
> On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:
>
>
>
> On 6/25/18 11:34 PM, Andy LoPresto wrote:
>
> Hi folks,
>
> I do not want to start a long-running argument or entrenched battle.
> However, having just performed the RM duties for the latest release, I
> believe I have identified a resource inefficiency in the fact that we
> generate, upload, host, and distribute two compressed archives of the
> binary which are functionally equivalent. For 1.7.0, both the .tar.gz
> and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
> 1_224_392_000 bytes for zip). The time to build and sign these is
> substantial, but the true cost comes in uploading and hosting them.
> While the fabled extension registry will save all of us from this
> burden, it isn’t arriving tomorrow, and I think we could drastically
> improve this before the next release.
>
> I have no personal preference between the two formats. In earlier days,
> there were platform inconsistencies and the tools weren’t available on
> all systems, but now they are pretty standard for all users. This [1] is
> an interesting article I found which had some good info on the origins,
> and here are some additional resources for anyone interested [2][3]. I
> don’t care which we pick, but I propose removing one of the options for
> the build going forward (toolkit as well).
>
> That said, if someone has a good reason that both are necessary, I would
> love to hear it. I didn’t find anything on the Apache Release Policy
> which stated we must offer both, but maybe I missed it. Thanks.
>
>
> I'm not aware of any ASF policy. I think it mostly stems from default
> convention you get out of the maven-assembly-plugin.
>
> [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> [2] https://superuser.com/a/1257441/40003
> [3] https://superuser.com/a/173995/40003
> [4] https://www.apache.org/legal/release-policy.html#artifacts
>
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>
>
>
>
>

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Andy LoPresto <al...@apache.org>.
Thanks for everyone’s input. There seems to be a clear consensus to eliminate .tar.gz and only provide .zip moving forward. I’d like to keep this discussion thread going for another day or two to field any objections. After that time (Friday-ish), I’ll create a Jira to do this unless things change.

I will probably keep the possibility to generate the .tar.gz through an inactive profile to allow people who need that offering to use it. There will be a subtask Jira to update the release guide moving forward as well.


Andy LoPresto
alopresto@apache.org
alopresto.apache@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jun 26, 2018, at 7:52 PM, James Wing <jv...@gmail.com> wrote:
> 
> It's a great idea, Andy, I strongly support just one format.  I think Zip
> is a good choice.
> 
> On Tue, Jun 26, 2018 at 11:16 AM Otto Fowler <ot...@gmail.com>
> wrote:
> 
>> I end up using zip all the time.  zip +1
>> 
>> 
>> On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:
>> 
>> My preference is zip.
>> 
>> On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:
>> 
>>> 
>>> 
>>> On 6/25/18 11:34 PM, Andy LoPresto wrote:
>>>> Hi folks,
>>>> 
>>>> I do not want to start a long-running argument or entrenched battle.
>>>> However, having just performed the RM duties for the latest release, I
>>>> believe I have identified a resource inefficiency in the fact that we
>>>> generate, upload, host, and distribute two compressed archives of the
>>>> binary which are functionally equivalent. For 1.7.0, both the .tar.gz
>>>> and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
>>>> 1_224_392_000 bytes for zip). The time to build and sign these is
>>>> substantial, but the true cost comes in uploading and hosting them.
>>>> While the fabled extension registry will save all of us from this
>>>> burden, it isn’t arriving tomorrow, and I think we could drastically
>>>> improve this before the next release.
>>>> 
>>>> I have no personal preference between the two formats. In earlier days,
>>>> there were platform inconsistencies and the tools weren’t available on
>>>> all systems, but now they are pretty standard for all users. This [1] is
>>>> an interesting article I found which had some good info on the origins,
>>>> and here are some additional resources for anyone interested [2][3]. I
>>>> don’t care which we pick, but I propose removing one of the options for
>>>> the build going forward (toolkit as well).
>>>> 
>>>> That said, if someone has a good reason that both are necessary, I would
>>>> love to hear it. I didn’t find anything on the Apache Release Policy
>>>> which stated we must offer both, but maybe I missed it. Thanks.
>>> 
>>> I'm not aware of any ASF policy. I think it mostly stems from default
>>> convention you get out of the maven-assembly-plugin.
>>> 
>>>> [1] https://itsfoss.com/tar-vs-zip-vs-gz/
>>>> [2] https://superuser.com/a/1257441/40003
>>>> [3] https://superuser.com/a/173995/40003
>>>> [4] https://www.apache.org/legal/release-policy.html#artifacts
>>>> 
>>>> 
>>>> Andy LoPresto
>>>> alopresto@apache.org
>>>> alopresto.apache@gmail.com
>>>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>>> 
>>> 
>> 


Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by James Wing <jv...@gmail.com>.
It's a great idea, Andy, I strongly support just one format.  I think Zip
is a good choice.

On Tue, Jun 26, 2018 at 11:16 AM Otto Fowler <ot...@gmail.com>
wrote:

> I end up using zip all the time.  zip +1
>
>
> On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:
>
> My preference is zip.
>
> On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:
>
> >
> >
> > On 6/25/18 11:34 PM, Andy LoPresto wrote:
> > > Hi folks,
> > >
> > > I do not want to start a long-running argument or entrenched battle.
> > > However, having just performed the RM duties for the latest release, I
> > > believe I have identified a resource inefficiency in the fact that we
> > > generate, upload, host, and distribute two compressed archives of the
> > > binary which are functionally equivalent. For 1.7.0, both the .tar.gz
> > > and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
> > > 1_224_392_000 bytes for zip). The time to build and sign these is
> > > substantial, but the true cost comes in uploading and hosting them.
> > > While the fabled extension registry will save all of us from this
> > > burden, it isn’t arriving tomorrow, and I think we could drastically
> > > improve this before the next release.
> > >
> > > I have no personal preference between the two formats. In earlier days,
> > > there were platform inconsistencies and the tools weren’t available on
> > > all systems, but now they are pretty standard for all users. This [1] is
> > > an interesting article I found which had some good info on the origins,
> > > and here are some additional resources for anyone interested [2][3]. I
> > > don’t care which we pick, but I propose removing one of the options for
> > > the build going forward (toolkit as well).
> > >
> > > That said, if someone has a good reason that both are necessary, I would
> > > love to hear it. I didn’t find anything on the Apache Release Policy
> > > which stated we must offer both, but maybe I missed it. Thanks.
> >
> > I'm not aware of any ASF policy. I think it mostly stems from default
> > convention you get out of the maven-assembly-plugin.
> >
> > > [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> > > [2] https://superuser.com/a/1257441/40003
> > > [3] https://superuser.com/a/173995/40003
> > > [4] https://www.apache.org/legal/release-policy.html#artifacts
> > >
> > >
> > > Andy LoPresto
> > > alopresto@apache.org
> > > alopresto.apache@gmail.com
> > > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> > >
> >
>

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Otto Fowler <ot...@gmail.com>.
I end up using zip all the time.  zip +1


On June 26, 2018 at 13:30:33, Tony Kurc (tkurc@apache.org) wrote:

My preference is zip.

On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:

>
>
> On 6/25/18 11:34 PM, Andy LoPresto wrote:
> > Hi folks,
> >
> > I do not want to start a long-running argument or entrenched battle.
> > However, having just performed the RM duties for the latest release, I
> > believe I have identified a resource inefficiency in the fact that we
> > generate, upload, host, and distribute two compressed archives of the
> > binary which are functionally equivalent. For 1.7.0, both the .tar.gz
> > and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
> > 1_224_392_000 bytes for zip). The time to build and sign these is
> > substantial, but the true cost comes in uploading and hosting them.
> > While the fabled extension registry will save all of us from this
> > burden, it isn’t arriving tomorrow, and I think we could drastically
> > improve this before the next release.
> >
> > I have no personal preference between the two formats. In earlier days,
> > there were platform inconsistencies and the tools weren’t available on
> > all systems, but now they are pretty standard for all users. This [1] is
> > an interesting article I found which had some good info on the origins,
> > and here are some additional resources for anyone interested [2][3]. I
> > don’t care which we pick, but I propose removing one of the options for
> > the build going forward (toolkit as well).
> >
> > That said, if someone has a good reason that both are necessary, I would
> > love to hear it. I didn’t find anything on the Apache Release Policy
> > which stated we must offer both, but maybe I missed it. Thanks.
>
> I'm not aware of any ASF policy. I think it mostly stems from default
> convention you get out of the maven-assembly-plugin.
>
> > [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> > [2] https://superuser.com/a/1257441/40003
> > [3] https://superuser.com/a/173995/40003
> > [4] https://www.apache.org/legal/release-policy.html#artifacts
> >
> >
> > Andy LoPresto
> > alopresto@apache.org
> > alopresto.apache@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
> >
>

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Tony Kurc <tk...@apache.org>.
My preference is zip.

On Tue, Jun 26, 2018, 9:21 AM Josh Elser <el...@apache.org> wrote:

>
>
> On 6/25/18 11:34 PM, Andy LoPresto wrote:
> > Hi folks,
> >
> > I do not want to start a long-running argument or entrenched battle.
> > However, having just performed the RM duties for the latest release, I
> > believe I have identified a resource inefficiency in the fact that we
> > generate, upload, host, and distribute two compressed archives of the
> > binary which are functionally equivalent. For 1.7.0, both the .tar.gz
> > and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs.
> > 1_224_392_000 bytes for zip). The time to build and sign these is
> > substantial, but the true cost comes in uploading and hosting them.
> > While the fabled extension registry will save all of us from this
> > burden, it isn’t arriving tomorrow, and I think we could drastically
> > improve this before the next release.
> >
> > I have no personal preference between the two formats. In earlier days,
> > there were platform inconsistencies and the tools weren’t available on
> > all systems, but now they are pretty standard for all users. This [1] is
> > an interesting article I found which had some good info on the origins,
> > and here are some additional resources for anyone interested [2][3]. I
> > don’t care which we pick, but I propose removing one of the options for
> > the build going forward (toolkit as well).
> >
> > That said, if someone has a good reason that both are necessary, I would
> > love to hear it. I didn’t find anything on the Apache Release Policy
> > which stated we must offer both, but maybe I missed it. Thanks.
>
> I'm not aware of any ASF policy. I think it mostly stems from default
> convention you get out of the maven-assembly-plugin.
>
> > [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> > [2] https://superuser.com/a/1257441/40003
> > [3] https://superuser.com/a/173995/40003
> > [4] https://www.apache.org/legal/release-policy.html#artifacts
> >
> >
> > Andy LoPresto
> > alopresto@apache.org
> > alopresto.apache@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >
>

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Josh Elser <el...@apache.org>.

On 6/25/18 11:34 PM, Andy LoPresto wrote:
> Hi folks,
> 
> I do not want to start a long-running argument or entrenched battle. 
> However, having just performed the RM duties for the latest release, I 
> believe I have identified a resource inefficiency in the fact that we 
> generate, upload, host, and distribute two compressed archives of the 
> binary which are functionally equivalent. For 1.7.0, both the .tar.gz 
> and .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs. 
> 1_224_392_000 bytes for zip). The time to build and sign these is 
> substantial, but the true cost comes in uploading and hosting them. 
> While the fabled extension registry will save all of us from this 
> burden, it isn’t arriving tomorrow, and I think we could drastically 
> improve this before the next release.
> 
> I have no personal preference between the two formats. In earlier days, 
> there were platform inconsistencies and the tools weren’t available on 
> all systems, but now they are pretty standard for all users. This [1] is 
> an interesting article I found which had some good info on the origins, 
> and here are some additional resources for anyone interested [2][3]. I 
> don’t care which we pick, but I propose removing one of the options for 
> the build going forward (toolkit as well).
> 
> That said, if someone has a good reason that both are necessary, I would 
> love to hear it. I didn’t find anything on the Apache Release Policy 
> which stated we must offer both, but maybe I missed it. Thanks.

I'm not aware of any ASF policy. I think it mostly stems from default 
convention you get out of the maven-assembly-plugin.

> [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> [2] https://superuser.com/a/1257441/40003
> [3] https://superuser.com/a/173995/40003
> [4] https://www.apache.org/legal/release-policy.html#artifacts
> 
> 
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Jeff Zemerick <jz...@apache.org>.
As a user I always download the zip file. Echoing Mike's reply, I work
across Linux, Windows, and OSX and my mouse always goes toward the zip.
I've never run into any file permission/attribute issues with the zip
distribution. Everything that should be executable always has been. So if
you axed one, my non-binding, FWIW vote would be to keep zip. :)

Jeff
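
For anyone curious why the executables survive in the zip: a zip entry can carry Unix mode bits in the upper 16 bits of its "external attributes" field, and tools such as Info-ZIP's unzip restore them on extraction. Below is a minimal Python sketch of that mechanism, not anything taken from the NiFi build. The entry names (demo/bin/run.sh, demo/conf/app.properties) and the helper functions are hypothetical and exist only for illustration. It also assumes that Python's own zipfile extraction may not re-apply the recorded modes, so the sketch does that step explicitly.

```python
import os
import stat
import tempfile
import zipfile


def build_zip(path):
    """Create a small zip holding one executable script and one plain file.

    The entry names are made up for this illustration.
    """
    with zipfile.ZipFile(path, "w") as zf:
        script = zipfile.ZipInfo("demo/bin/run.sh")
        # Unix mode bits live in the upper 16 bits of external_attr.
        script.external_attr = (stat.S_IFREG | 0o755) << 16
        zf.writestr(script, "#!/bin/sh\necho hello\n")

        conf = zipfile.ZipInfo("demo/conf/app.properties")
        conf.external_attr = (stat.S_IFREG | 0o644) << 16
        zf.writestr(conf, "key=value\n")


def stored_modes(path):
    """Return {entry name: permission bits} as recorded in the archive."""
    with zipfile.ZipFile(path) as zf:
        return {
            info.filename: stat.S_IMODE(info.external_attr >> 16)
            for info in zf.infolist()
        }


def extract_with_modes(path, dest):
    """Extract every entry, then re-apply the recorded permission bits."""
    with zipfile.ZipFile(path) as zf:
        for info in zf.infolist():
            target = zf.extract(info, dest)
            mode = stat.S_IMODE(info.external_attr >> 16)
            if mode:  # a mode of 0 means the archive recorded nothing useful
                os.chmod(target, mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        archive = os.path.join(workdir, "demo.zip")
        build_zip(archive)
        for name, mode in stored_modes(archive).items():
            print(name, oct(mode))
        extract_with_modes(archive, workdir)
```

Whether the bits are present at all depends on the tool that wrote the archive, so this only shows that the format can hold them, not that every zip does.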


On Tue, Jun 26, 2018 at 5:28 AM Mike Thomsen <mi...@gmail.com> wrote:

> I would lean toward Zip because it is the format that is supported by
> Windows, macOS and Linux out of the box. I think the ease of use for
> Windows users is particularly important.
>
> On Mon, Jun 25, 2018 at 11:34 PM Andy LoPresto <al...@apache.org>
> wrote:
>
> > Hi folks,
> >
> > I do not want to start a long-running argument or entrenched battle.
> > However, having just performed the RM duties for the latest release, I
> > believe I have identified a resource inefficiency in the fact that we
> > generate, upload, host, and distribute two compressed archives of the
> > binary which are functionally equivalent. For 1.7.0, both the .tar.gz and
> > .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs. 1_224_392_000
> > bytes for zip). The time to build and sign these is substantial, but the
> > true cost comes in uploading and hosting them. While the fabled extension
> > registry will save all of us from this burden, it isn’t arriving tomorrow,
> > and I think we could drastically improve this before the next release.
> >
> > I have no personal preference between the two formats. In earlier days,
> > there were platform inconsistencies and the tools weren’t available on all
> > systems, but now they are pretty standard for all users. This [1] is an
> > interesting article I found which had some good info on the origins, and
> > here are some additional resources for anyone interested [2][3]. I don’t
> > care which we pick, but I propose removing one of the options for the build
> > going forward (toolkit as well).
> >
> > That said, if someone has a good reason that both are necessary, I would
> > love to hear it. I didn’t find anything on the Apache Release Policy which
> > stated we must offer both, but maybe I missed it. Thanks.
> >
> > [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> > [2] https://superuser.com/a/1257441/40003
> > [3] https://superuser.com/a/173995/40003
> > [4] https://www.apache.org/legal/release-policy.html#artifacts
> >
> >
> > Andy LoPresto
> > alopresto@apache.org
> > alopresto.apache@gmail.com
> > PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> >
> >
>

Re: [DISCUSS] Tar + Gzip vs. Zip

Posted by Mike Thomsen <mi...@gmail.com>.
I would lean toward Zip because it is the format that is supported by
Windows, macOS and Linux out of the box. I think the ease of use for
Windows users is particularly important.

On Mon, Jun 25, 2018 at 11:34 PM Andy LoPresto <al...@apache.org> wrote:

> Hi folks,
>
> I do not want to start a long-running argument or entrenched battle.
> However, having just performed the RM duties for the latest release, I
> believe I have identified a resource inefficiency in the fact that we
> generate, upload, host, and distribute two compressed archives of the
> binary which are functionally equivalent. For 1.7.0, both the .tar.gz and
> .zip files are 1.2 GB (1_224_352_000 bytes for tar.gz vs. 1_224_392_000
> bytes for zip). The time to build and sign these is substantial, but the
> true cost comes in uploading and hosting them. While the fabled extension
> registry will save all of us from this burden, it isn’t arriving tomorrow,
> and I think we could drastically improve this before the next release.
>
> I have no personal preference between the two formats. In earlier days,
> there were platform inconsistencies and the tools weren’t available on all
> systems, but now they are pretty standard for all users. This [1] is an
> interesting article I found which had some good info on the origins, and
> here are some additional resources for anyone interested [2][3]. I don’t
> care which we pick, but I propose removing one of the options for the build
> going forward (toolkit as well).
>
> That said, if someone has a good reason that both are necessary, I would
> love to hear it. I didn’t find anything on the Apache Release Policy which
> stated we must offer both, but maybe I missed it. Thanks.
>
> [1] https://itsfoss.com/tar-vs-zip-vs-gz/
> [2] https://superuser.com/a/1257441/40003
> [3] https://superuser.com/a/173995/40003
> [4] https://www.apache.org/legal/release-policy.html#artifacts
>
>
> Andy LoPresto
> alopresto@apache.org
> alopresto.apache@gmail.com
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
>