Posted to dev@arrow.apache.org by Matthew Benedict de Detrich <ma...@aiven.io.INVALID> on 2023/02/08 16:47:20 UTC

Re: Reason behind using Apache Nightlies directory instead of Apache Nexus Snapshot Repository

> But as I am not a Java developer I would be open to changes if there is a
> clear benefit for the devs using the nightlies.

Here is a quick list of the benefits of using the Apache Nexus repository:

* Since the repo is managed by Sonatype Nexus software, it handles a lot of
things that Java developers take for granted (as I mentioned before, almost
all Java developers resolve artifacts, SNAPSHOT or not, via a repository).
This includes cases like setting up a proxy (work environment), redirects
and other less typical ways of resolving JVM artifacts.
* If there are any issues with resolving the SNAPSHOT artifacts, they would
be easier to sort out via the Apache Nexus repo, since there is an Apache
INFRA team that manages it specifically for JVM artifacts and the Sonatype
software itself is designed to handle this case.
* The documentation for actually resolving the nightlies would be a lot
clearer, since you just need to add the Apache Snapshots repo, which is
mentioned on the main Apache site at
https://infra.apache.org/repository-faq.html#basic
* The Apache Nexus repo handles expiry automatically for all artifacts,
which means you don't need to manually set up snapshot expiry as you have
done in Arrow's GitHub Action.
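
To illustrate the third point, here is a minimal sketch of what a consumer's
build would need, assuming an sbt build. The resolver URL is the snapshots
group URL quoted further down in this thread; the `org.example` coordinates
and the version are hypothetical placeholders, not real published artifacts.

```scala
// build.sbt (sketch; the coordinates below are hypothetical placeholders)

// Add the Apache Nexus snapshots group as an extra resolver.
resolvers += "Apache Snapshots" at "https://repository.apache.org/content/groups/snapshots/"

// Depend on a SNAPSHOT version; sbt resolves it through the resolver above.
libraryDependencies += "org.example" % "example-lib" % "1.0.0-SNAPSHOT"
```

Maven and Gradle users would add the same URL as a repository in their own
build files.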

I am not sure if this counts as a "clear benefit", but I would repeat my
earlier point that this setup is not typical at all and does look odd.
Almost every modern Java/JVM developer would have similar sentiments. The
INFRA team also recommends deploying JVM artifacts to this repo (with the
proviso of keeping snapshots to a minimum, as you note).

> Looking at your slack conversation with infra it seems that
> the nexus repository is a bit tight on space currently so it likely would
> not be the best to move our ~450MB/day nightlies there.

Is the 450MB you mention just for the Java/JVM jars or for your entire
nightlies? I would just like to clarify, because I only recommend moving
Java/JVM snapshots specifically to the Apache Nexus repository, not the
snapshots of the binaries for all the other languages.

If you are only talking about JVM artifacts being ~450MB then that seems
suspiciously large. However, I am not that familiar with Apache Arrow, so
this could be intended.



On Wed, Jan 25, 2023 at 8:20 PM Jacob Wujciak <ja...@voltrondata.com.invalid>
wrote:

> Hello Matthew,
>
> The main reason we don't use nexus is that we were unaware of it and as
> Raúl said the Java contributors adapted the R workflow that publishes to
> nightlies.a.o. Looking at your slack conversation with infra it seems that
> the nexus repository is a bit tight on space currently so it likely would
> not be the best to move our ~450MB/day nightlies there.
>
> But as I am not a Java developer I would be open to changes if there is a
> clear benefit for the devs using the nightlies.
>
> Best
> Jacob
>
> On Wed, Jan 25, 2023 at 11:56 AM Matthew Benedict de Detrich
> <ma...@aiven.io.invalid> wrote:
>
> > From my understanding as a general design perspective, if you are
> > uploading Java/JVM jars with the intention of them being resolved as a
> > library you should be putting them into the Apache Nexus repo (in this
> > case the snapshots repo). Of course nothing is stopping you from
> > uploading it to a remote folder via rsync (as you are doing) but the
> > whole idea behind Nexus is it gives tools for managing JVM
> > libraries/dependencies (mainly for the host of the repository, less so
> > for users uploading to it). In short there isn't a very strong technical
> > reason (if there was presumably you wouldn't be using it as a solution
> > right now), but as far as I can tell it is an expectation to use it for
> > JVM library jars judging by the response at
> > https://the-asf.slack.com/archives/CBX4TSBQ8/p1674497939153149 (I would
> > say that if you want more details then just ask for more detail in that
> > slack channel). One example is that you had to implement snapshot expiry
> > yourself, whereas Nexus automatically handles this for everyone (there
> > is a job in Nexus which does snapshot expiry so as a user you don't even
> > have to worry about it).
> >
> > As a Java/JVM developer, personally I would also say that putting library
> > jars via rsync into a remote folder is highly unorthodox given that an
> > Apache Nexus repository exists (if it didn't that would be another
> > story). The impression I have is that the nightlies folder (at least from
> > the other projects that use it) is mainly for the result of builds (e.g.
> > projects that are "run" which produce executables/fatjars i.e.
> > applications, databases), docs, microsites, etc etc.
> >
> > On Wed, Jan 25, 2023 at 11:03 AM Raúl Cumplido <ra...@gmail.com>
> > wrote:
> >
> > > Hi Matthew,
> > >
> > > Uploading java jars to nightlies.apache.org was implemented on the
> > > following PR: https://github.com/apache/arrow/pull/13328
> > >
> > > From what I remember the solution of uploading to nightlies.apache.org
> > > was to follow the same workflow as our R nightly builds. More info on
> > > when the R nightly builds were first implemented on this JIRA:
> > > https://issues.apache.org/jira/browse/ARROW-16400
> > >
> > > I don't remember investigating the Apache Nexus snapshot repository
> > > when we added the Java jars. We followed the same workflow we already
> > > had.
> > >
> > > Do you know if there is a technical reason for preferring the Nexus
> > > Snapshot Repository for JVM artifacts or is it for consistency with
> > > other projects?
> > >
> > > Thanks,
> > > Raúl
> > >
> > >
> > >
> > > On Wed, Jan 25, 2023 at 10:38, Matthew Benedict de Detrich
> > > (<ma...@aiven.io.invalid>) wrote:
> > >
> > > > Hello Arrow community!
> > > >
> > > > I am coming from the Apache Pekko community where we are trying to
> > > > bootstrap the newly approved incubator project. Currently we are
> > > > working on snapshots/nightlies, specifically where to upload them and
> > > > also the process of generating the snapshots (i.e. nightly, after
> > > > merge into main etc etc).
> > > > As of now we are uploading into Apache nightlies (i.e.
> > > > https://nightlies.apache.org/ with PR
> > > > https://github.com/apache/incubator-pekko/pull/60) and the method of
> > > > uploading was actually derived from Apache Arrow. However after
> > > > further cursory investigation I found out that Apache Arrow seems to
> > > > be the only Apache project that seems to upload snapshots into
> > > > https://nightlies.apache.org/. Every other project uploads snapshots
> > > > into Apache's Nexus snapshot repository (i.e.
> > > > https://repository.apache.org/content/groups/snapshots/) and after
> > > > discussion with Apache's #askinfra slack channel, they also recommend
> > > > that any JVM library snapshot artifacts be published to this Nexus
> > > > Snapshot Repository.
> > > >
> > > > So the question is, is there a reason why this project is uploading
> > > > snapshot jars into the nightlies? The only technical reason I can
> > > > come up with is that with nightlies you have control over the expiry
> > > > (i.e. setting it to 30 days), whereas with Apache's Snapshot Repo
> > > > there is a default global job that cleans snapshots. Pekko is
> > > > currently considering uploading to Apache's Nexus Snapshot Repo
> > > > instead of nightlies but I want to make sure we haven't missed
> > > > anything, for context of the discussion in pekko see
> > > > https://lists.apache.org/thread/p5s8ysypyd2l2slb3o9f2v5vrf8dgsx8.
> > > >
> > > > Many thanks
> > > >
> > > > --
> > > >
> > > > Matthew de Detrich
> > > >
> > > > *Aiven Deutschland GmbH*
> > > >
> > > > Immanuelkirchstraße 26, 10405 Berlin
> > > >
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > > >
> > > > Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
> > > >
> > > > *m:* +491603708037
> > > >
> > > > *w:* aiven.io *e:* matthew.dedetrich@aiven.io
> > > >
> > >
> >
> >
>


-- 

Matthew de Detrich

*Aiven Deutschland GmbH*

Immanuelkirchstraße 26, 10405 Berlin

Amtsgericht Charlottenburg, HRB 209739 B

Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen

*m:* +491603708037

*w:* aiven.io *e:* matthew.dedetrich@aiven.io