Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/02/11 19:12:08 UTC

[GitHub] [arrow-datafusion] alamb edited a comment on pull request #1807: Update CHANGELOG.md, update release scripts

alamb edited a comment on pull request #1807:
URL: https://github.com/apache/arrow-datafusion/pull/1807#issuecomment-1036531280


   > I'm interested in learning a bit more about the lifecycle of issues / tags / releases.
   > once i understand better maybe i could help to get things in order for the next release.
   
   🎉  that would be wonderful @matthewmturner 
   
   > @houqp @alamb could you provide a little more info? i didnt see anything mentioned in the developers guide.
   
   The basic release instructions (https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md#update-changelogmd) talk about:
   ```shell
   # create the changelog
   CHANGELOG_GITHUB_TOKEN=<TOKEN> ./dev/release/update_change_log-all.sh
   # review change log / edit issues and labels if needed, rerun until you are happy with the result
   git commit -a -m 'Create changelog for release'
   ```
   
   But the "until you are happy with the result" leaves a lot to the imagination 😆 
   
   Basically what I did was run that script (it updates the `CHANGELOG.md` file locally) and then looked at the output and tried to make something that looked coherent.
   
   Examples of things that I did:
   1. Found issues that did not have the `datafusion` label but were related to datafusion and added it (otherwise they don't show up in the release notes)
   2. Changed titles of PRs / issues so they were more specific (e.g. from `Fixed bug in select` to `Fixed bug when there are order by number`)
   3. Applied a liberal dose of judgement to which issues / PRs should have `enhancement`, `bug` or `api-change` on them.
   
   The most questionable things related to labels were:
   1. API changes get listed under the "Breaking Changes" heading, but some tickets had new APIs, etc. that I didn't feel were breaking changes, and in other cases there were several PRs that together made a "single" breaking change from the user's point of view (e.g. breaking LogicalPlan into enums)
   2. Changes with multiple labels got put under a single heading, so I removed some labels that were accurate but were obscuring what I felt was the "most important" part. For example, a ticket with an `enhancement` and a `documentation` label ended up under the Documentation heading, even when it also had code changes. I removed the `documentation` label for that one
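   
   As a rough illustration of the multi-label point above (this is not the changelog generator's actual logic), the sketch below assumes the tool walks a fixed list of headings and files a ticket under the first one whose label it carries. The `HEADING_ORDER` list, the `heading_for` function, and every heading name other than "Breaking Changes" and "Documentation" are hypothetical.
   
   ```shell
   #!/usr/bin/env bash
   # Hypothetical heading priority ("label:Heading"); the real tool's order and
   # heading names may differ. Earlier entries win when a ticket has several labels.
   HEADING_ORDER=(
     "api-change:Breaking Changes"
     "documentation:Documentation"
     "enhancement:Enhancements"
     "bug:Bugs"
   )
   
   # heading_for LABEL...: print the single heading a ticket would land under.
   heading_for() {
     local entry label
     for entry in "${HEADING_ORDER[@]}"; do
       for label in "$@"; do
         if [[ "$label" == "${entry%%:*}" ]]; then
           echo "${entry#*:}"
           return 0
         fi
       done
     done
     echo "Uncategorized"   # hypothetical fallback for tickets with no known label
   }
   
   heading_for enhancement documentation   # prints "Documentation"
   heading_for enhancement                 # prints "Enhancements"
   ```
   
   Under that assumption, dropping the `documentation` label is what moves the ticket to the heading that reflects its code changes.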
   
   It would be great to have some more help here. Some thoughts:
   1. We could probably automate adding `datafusion` labels for issues that are closed via a PR that also has the `datafusion` label (the `datafusion` label is already applied to PRs automatically, based on the paths they change)
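   
   A minimal sketch of what that automation might look like, under several assumptions: the PR description uses GitHub's closing keywords (`Closes #123`, `Fixes #123`, ...), the labelling is done with the `gh` CLI, and the function names `closed_issues` and `label_commands` are made up for this example. The script only prints the `gh issue edit` commands rather than running them, so the result can be reviewed first.
   
   ```shell
   #!/usr/bin/env bash
   # closed_issues: read a PR description on stdin and print the issue numbers
   # referenced with a closing keyword, one per line, sorted and de-duplicated.
   closed_issues() {
     grep -oiwE '(close[sd]?|fix(e[sd])?|resolve[sd]?)[[:space:]]+#[0-9]+' \
       | grep -oE '[0-9]+$' \
       | sort -un
   }
   
   # label_commands PR_LABELS: PR_LABELS is a comma-separated list of the PR's
   # labels; the PR description is read from stdin. Prints (does not run) one
   # command per closed issue, and prints nothing if the PR lacks `datafusion`.
   label_commands() {
     local pr_labels="$1" issue
     case ",$pr_labels," in
       *,datafusion,*) ;;
       *) return 0 ;;
     esac
     closed_issues | while read -r issue; do
       echo "gh issue edit $issue --add-label datafusion"
     done
   }
   
   # Example with a made-up PR description:
   printf 'Closes #12\nAlso fixes #7.\n' | label_commands "datafusion,enhancement"
   ```
   
   Printing the commands instead of executing them keeps the sketch free of network access; piping the output to `sh` would apply the labels.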
   
   Something else I haven't even looked into is releasing the python bindings and releasing ballista lol -- so any help you want to lend there would be awesome

