Posted to dev@arrow.apache.org by Krisztián Szűcs <sz...@gmail.com> on 2020/04/17 01:51:36 UTC

Thoughts and issues regarding the release procedure

Hi,

While our release scripts have improved a lot lately, cutting the first release
candidate still takes multiple days. I wouldn't consider the overall experience
bad - especially given the complexity of the project and the number of
artifacts we produce - but we definitely need to develop more automation and
tests supporting it.
I'm not sure what the right way to put together an action plan is, but having
more manpower here would be great.

If you don't mind, I'd like to especially thank Kou for maintaining most of the
release scripts and (when not being the RM) always helping out with the
issues that come up. I really appreciate it.

I tried to collect the problems and inconveniences I had with 0.17.0-RC0:

00-prepare.sh
-------------

*PREPARE_CHANGELOG* phase:

- need to set ARROW_HOME because changelog.py requires it
- changelog.py stopped working since adding support for parquet tickets [1],
because it requires the actual version to have a git tag, which is not yet
available during the release procedure (called from prepare.sh)

Extra *PREPARE_DEB_PACKAGE_NAMES* phase:

This is usually not required but the previous `so` versions were set to .100,
so I had to downgrade them:

```bash
PREPARE_DEFAULT=0 PREPARE_DEB_PACKAGE_NAMES=1 \
dev/release/00-prepare.sh 1.0.0 0.17.0
```

We should add this step to ensure that the so versions in the linux package
are properly set, and also consider removing the previous version from the
pattern.

*PREPARE_TAG* phase:

The outstanding issue was the JNI ORC crash we have discussed on the mailing
list, and I have a reproducer PR available for it [6]. I had to `@Ignore` the
crashing test to be able to release.

Minor issues:

- on OSX brew installed ORC doesn't work so need to use the bundled source,
passing `-DORC_SOURCE=BUNDLED` fixes it
- need to update the maven versions to match <version>-SNAPSHOT with command
`mvn versions:set -DnewVersion=0.17.0-SNAPSHOT`
- gandiva has deprecation warnings, which complicated the debugging of
the java jni orc problem [2]
- for me only the OpenJDK 8 and Maven 3.5 version combination works; with newer
versions the build fails, once with a javadoc error and another time with an
unknown error
- if I rerun the script without removing the version tag then the maven
process raises an error *after* compiling everything again, effectively
losing 20 minutes.
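
A fail-fast check along these lines could avoid the lost 20 minutes from the
last item. This is only a sketch: `tag_exists` and `ensure_tag_absent` are
hypothetical names, not functions from 00-prepare.sh, and it assumes the
release tag is named apache-arrow-<version>.

```bash
# Hypothetical pre-flight helpers; not part of dev/release/00-prepare.sh.

# Succeeds (exit status 0) if the given tag exists in the current repository.
tag_exists() {
  local tag="$1"
  git rev-parse --quiet --verify "refs/tags/${tag}" > /dev/null
}

# Returns non-zero with a hint on stderr if the tag is already present, so the
# caller can abort before any compilation starts.
ensure_tag_absent() {
  local tag="$1"
  if tag_exists "${tag}"; then
    echo "Tag ${tag} already exists; remove it with: git tag -d ${tag}" >&2
    return 1
  fi
}
```

Calling `ensure_tag_absent apache-arrow-0.17.0 || exit 1` before the maven
step would stop the script right away instead of after a full rebuild.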

I'm generally frustrated by the compile time required to iterate on the maven
build issues - I suppose there are better ways to invoke maven which I'm not
aware of. It would be nice to have a Java developer guide listing the
recommended commands for certain scenarios.

Another thing I dislike about the release procedure is that the source code
tagging is done by / centered around maven. Preferably I would like to fire the
`git tag` command explicitly rather than letting one of the many package
managers do it implicitly.
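
As a rough illustration of explicit tagging, something like the following
could work. `create_release_tag` is a hypothetical helper and the tag message
is made up; only the apache-arrow-<version> tag name matches what the release
currently produces.

```bash
# Hypothetical helper; the function name and the tag message are illustrative.

# Creates an annotated tag named apache-arrow-<version> on the current HEAD
# and prints the tag name on success.
create_release_tag() {
  local version="$1"
  local tag="apache-arrow-${version}"
  git tag -a "${tag}" -m "Apache Arrow ${version}" || return 1
  echo "${tag}"
}
```

With this, `create_release_tag 0.17.0` would tag the current commit, and the
tag could be pushed separately once the remaining steps have succeeded.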

02-source.sh
------------

The previous step produces a directory with the same name as the tag,
apache-arrow-0.17.0, which makes the script fail [3]
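
The exact failure is in the gist and is not reproduced here. Assuming it is
git refusing a name that is both a revision and a path in the working tree
("ambiguous argument"), one way around it is to pass the fully qualified ref
and end the revision list with `--`. `tag_commit` is a hypothetical helper,
not something from 02-source.sh.

```bash
# Hypothetical helper; assumes the problem is git's revision/path ambiguity.

# Prints the hash of the commit the tag points to, even when a file or
# directory with the same name as the tag exists in the working tree.
tag_commit() {
  local tag="$1"
  git rev-list --max-count=1 "refs/tags/${tag}" --
}
```

For example, `tag_commit apache-arrow-0.17.0` resolves the tag regardless of
whether the apache-arrow-0.17.0 directory is still around.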

Binary Packaging
----------------

I had to apply two patches to fix the linux packaging builds.

[Packaging][deb] Support RC version numbers for apache-arrow-archive-keyring [4]
The packaging scripts were not properly supporting the -RC0 postfixed version
number, which is a special case because the linux binaries are built against
the apache source release rather than a git tag. While I managed to fix it,
we probably need a follow-up after the rebase.
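
To illustrate the special case, a version string with an -RC<N> postfix can be
split with plain shell parameter expansion. This is a sketch only:
`base_version` and `rc_number` are hypothetical names and are not taken from
the packaging scripts or from the patch in [4].

```bash
# Hypothetical helpers; names are illustrative, not from the Arrow scripts.

# Prints the version without a trailing -RC<N> postfix; versions without the
# postfix are printed unchanged.
base_version() {
  local version="$1"
  echo "${version%-RC*}"
}

# Prints the RC number, or an empty line when there is no -RC<N> postfix.
rc_number() {
  local version="$1"
  case "${version}" in
    *-RC*) echo "${version##*-RC}" ;;
    *) echo "" ;;
  esac
}
```

Here `base_version 0.17.0-RC0` gives the plain release version that a source
archive name would use, while `rc_number 0.17.0-RC0` gives the candidate
number on its own.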

[Packaging][rpm] Fix CentOS 6 build [5]
This issue has surfaced today with the nightly builds as well. It seems like
devtoolset-6 is no longer available for CentOS 6, so I had to update it.

Building the 4 Windows wheels on Appveyor takes 4 hours, because we don't have
any parallelism there. We should port the Windows scripts to either Azure or
GitHub Actions.


Looking forward to the improvement ideas!

Thanks Everyone!


[1] https://github.com/apache/arrow/commit/636a912c4bef6803fe3fede8a050d82124b18136#diff-fc9c73b2cf4e254206ac116714cfdbf4
[2] https://gist.github.com/kszucs/08b1582ca60a86c8dd8a1ab50bb6faad
[3] https://gist.github.com/kszucs/3337e475ce751cfbf11ea45a5a8817d2
[4] https://github.com/kszucs/arrow/commit/2c4cb4576a04b930a295ce6838179a8cf5a16058
[5] https://github.com/kszucs/arrow/commit/0b245aa3404bf016488e36e22a0140813b661f40
[6] https://github.com/apache/arrow/pull/6953

Re: Thoughts and issues regarding the release procedure

Posted by Sutou Kouhei <ko...@clear-code.com>.
Hi,

Krisztián, thanks for collecting the problems!

> *PREPARE_CHANGELOG* phase:
> 
> - need to set ARROW_HOME because changelog.py requires it

https://github.com/apache/arrow/pull/6975

> - changelog.py stopped working since adding support for parquet tickets [1],
> because it requires the actual version to have a git tag, which is not yet
> available during the release procedure (called from prepare.sh)

We can test the PREPARE_CHANGELOG phase with
dev/release/00-prepare-test.rb if dev/release/changelog.py
doesn't require a JIRA account.

> - on OSX brew installed ORC doesn't work so need to use the bundled source,
> passing `-DORC_SOURCE=BUNDLED` fixes it

I may take a look at this later.

> - for me only OpenJDK 8 and Maven 3.5 version combination works, with newer
> versions the build fails once a javadoc another time with an unknown error

I opened https://issues.apache.org/jira/browse/ARROW-5764
for this about a year ago, but nobody has worked on it yet.
Could any Java developer work on this?

> [Packaging][deb] Support RC version numbers for apache-arrow-archive-keyring [4]

Sorry. I've fixed this:
https://github.com/apache/arrow/commit/3e1680ed2a0d62ab3d86c32f53a942af4fefcc5d

> [Packaging][rpm] Fix CentOS 6 build [5]

Sorry again. I've fixed this too:
https://github.com/apache/arrow/commit/37bf6e3a5b17da9c5328f173de2d6577921a429a


Thanks,
--
kou
