Posted to dev@arrow.apache.org by Ian Cook <ia...@ursacomputing.com> on 2022/11/22 22:56:27 UTC

Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Hi all,

Our biweekly sync call is tomorrow at 12:00 noon Eastern time.

The Zoom meeting URL for this and other biweekly Arrow sync calls is:
https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09

Alternatively, enter this information into the Zoom website or app to
join the call:
Meeting ID: 876 4903 3008
Passcode: 958092

Thanks,
Ian

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by David Li <li...@apache.org>.
This is now configurable, but you have to ask Infra to do it. (You can have GitHub always use the PR title + description.)

See https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/configuring-commit-squashing-for-pull-requests

I assume GitHub also handles multiple Co-authored-by: declarations across multiple commits correctly. IIRC the merge script didn't always handle this, so it might be worth checking.

-David

On Fri, Nov 25, 2022, at 20:39, Neal Richardson wrote:
>> - This creates an immediate need to modify the PR merge script; Raúl
>> opened an issue for this after the call [6]; this also raises the
>> question of whether we still need the PR merge script or whether
>> committers can use the "Squash and merge" button in the GitHub web UI
>> instead
>>
>
> I think the only thing the merge script will do better than the UI button
> is that the script always gets the merge commit title correct. IIRC in the
> GitHub UI, if you merge a PR that has only one commit, it uses the commit
> message from that commit and not the PR title. Maybe GitHub has fixed this
> (or let it be configurable).
>
> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>
> Neal

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Jacob Wujciak <ja...@voltrondata.com.INVALID>.
> The merge script would still be useful to flag issues such as a missing
> component label, or to ensure the fix version (milestone) is set.

It would be possible to turn the current checks for these into a PR check
that is obligatory and will block merging until green. [1][2]

[1]:
https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Branchprotection
[2]:
https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/managing-a-branch-protection-rule
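
As a rough illustration of the kind of check described above, here is a minimal Python sketch that could run as a required PR check. It assumes the PR is available as a dict shaped like the GitHub REST API pull request object (a "labels" list of {"name": ...} entries and a "milestone" that is either a dict or None). The "Component: " label prefix and the function name find_merge_blockers are assumptions made for illustration; this is not the merge script's actual logic.

```python
from typing import Any

# Assumption: component labels share this prefix; adjust for the real repo.
COMPONENT_PREFIX = "Component: "


def find_merge_blockers(pr: dict[str, Any]) -> list[str]:
    """Return human-readable reasons why the PR should not be merged yet."""
    problems = []
    label_names = [label.get("name", "") for label in pr.get("labels") or []]
    # Flag PRs that carry no component label at all.
    if not any(name.startswith(COMPONENT_PREFIX) for name in label_names):
        problems.append("missing component label")
    # The milestone field is null/absent when no fix version is set.
    if not pr.get("milestone"):
        problems.append("fix version (milestone) is not set")
    return problems


if __name__ == "__main__":
    # Hypothetical PR payload: labeled, but no milestone assigned.
    example = {"labels": [{"name": "Component: C++"}], "milestone": None}
    for problem in find_merge_blockers(example):
        print(problem)
```

A CI job could fail whenever the returned list is non-empty, which would block merging once the check is marked as required in branch protection.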

On Mon, Nov 28, 2022 at 2:40 PM Antoine Pitrou <an...@python.org> wrote:

>
> Also a note that discussing this in a thread entitled "Arrow sync call
> November 23" might not raise the attention of all interested parties :-)
>
>
> Le 28/11/2022 à 14:38, Antoine Pitrou a écrit :
> >
> > The merge script would still be useful to flag issues such as a missing
> > component label, or to ensure the fix version (milestone) is set.
> >
> >
> > Le 28/11/2022 à 12:09, Joris Van den Bossche a écrit :
> >> FYI: Raúl also already opened a PR to update the merge script to work
> >> with github issues: https://github.com/apache/arrow/pull/14731
> >>
> >> Personally I also think that we should consider using the merge button
> >> instead of our script (or at least re-evaluate what the script still
> >> does better, or might now be redundant). But given that the merge
> >> script is almost ready to handle github issues, that is a discussion
> >> we can have separate from the direct action to start using github
> >> issues (it probably needs to wait until the actual JIRA->github
> >> migration of the issues has happened, anyway, since until then the
> >> script is needed to handle the mix of JIRA and github issues)
> >>
> >> Joris
> >>
> >> On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
> >>>
> >>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> >>>
> >>> arrow-rs and arrow-datafusion use the squash-and-merge button in the github
> >>> UI.
> >>>
> >>> In general we don't have the same level of curation in commit titles as the
> >>> main arrow repo. However, I have not heard anyone ask for better commit
> >>> titles and the github UI / API does a good job tying all commits back to
> >>> the PR they came from.
> >>>
> >>> Andrew
> >>>
> >>> On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <neal.p.richardson@gmail.com>
> >>> wrote:
> >>>
> >>>>> - This creates an immediate need to modify the PR merge script; Raúl
> >>>>> opened an issue for this after the call [6]; this also raises the
> >>>>> question of whether we still need the PR merge script or whether
> >>>>> committers can use the "Squash and merge" button in the GitHub web UI
> >>>>> instead
> >>>>>
> >>>>
> >>>> I think the only thing the merge script will do better than the UI button
> >>>> is that the script always gets the merge commit title correct. IIRC in the
> >>>> GitHub UI, if you merge a PR that has only one commit, it uses the commit
> >>>> message from that commit and not the PR title. Maybe GitHub has fixed this
> >>>> (or let it be configurable).
> >>>>
> >>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> >>>>
> >>>> Neal
> >>>>
>

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Antoine Pitrou <an...@python.org>.
Also a note that discussing this in a thread entitled "Arrow sync call 
November 23" might not raise the attention of all interested parties :-)


Le 28/11/2022 à 14:38, Antoine Pitrou a écrit :
> 
> The merge script would still be useful to flag issues such as a missing
> component label, or to ensure the fix version (milestone) is set.
> 
> 
> Le 28/11/2022 à 12:09, Joris Van den Bossche a écrit :
>> FYI: Raúl also already opened a PR to update the merge script to work
>> with github issues: https://github.com/apache/arrow/pull/14731
>>
>> Personally I also think that we should consider using the merge button
>> instead of our script (or at least re-evaluate what the script still
>> does better, or might now be redundant). But given that the merge
>> script is almost ready to handle github issues, that is a discussion
>> we can have separate from the direct action to start using github
>> issues (it probably needs to wait until the actual JIRA->github
>> migration of the issues has happened, anyway, since until then the
>> script is needed to handle the mix of JIRA and github issues)
>>
>> Joris
>>
>> On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
>>>
>>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>>>
>>> arrow-rs and arrow-datafusion use the squash-and-merge button in the github
>>> UI.
>>>
>>> In general we don't have the same level of curation in commit titles as the
>>> main arrow repo. However, I have not heard anyone ask for better commit
>>> titles and the github UI / API does a good job tying all commits back to
>>> the PR they came from.
>>>
>>> Andrew
>>>
>>> On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <ne...@gmail.com>
>>> wrote:
>>>
>>>>> - This creates an immediate need to modify the PR merge script; Raúl
>>>>> opened an issue for this after the call [6]; this also raises the
>>>>> question of whether we still need the PR merge script or whether
>>>>> committers can use the "Squash and merge" button in the GitHub web UI
>>>>> instead
>>>>>
>>>>
>>>> I think the only thing the merge script will do better than the UI button
>>>> is that the script always gets the merge commit title correct. IIRC in the
>>>> GitHub UI, if you merge a PR that has only one commit, it uses the commit
>>>> message from that commit and not the PR title. Maybe GitHub has fixed this
>>>> (or let it be configurable).
>>>>
>>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>>>>
>>>> Neal
>>>>

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Antoine Pitrou <an...@python.org>.
The merge script would still be useful to flag issues such as a missing 
component label, or to ensure the fix version (milestone) is set.


Le 28/11/2022 à 12:09, Joris Van den Bossche a écrit :
> FYI: Raúl also already opened a PR to update the merge script to work
> with github issues: https://github.com/apache/arrow/pull/14731
> 
> Personally I also think that we should consider using the merge button
> instead of our script (or at least re-evaluate what the script still
> does better, or might now be redundant). But given that the merge
> script is almost ready to handle github issues, that is a discussion
> we can have separate from the direct action to start using github
> issues (it probably needs to wait until the actual JIRA->github
> migration of the issues has happened, anyway, since until then the
> script is needed to handle the mix of JIRA and github issues)
> 
> Joris
> 
> On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
>>
>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>>
>> arrow-rs and arrow-datafusion use the squash-and-merge button in the github
>> UI.
>>
>> In general we don't have the same level of curation in commit titles as the
>> main arrow repo. However, I have not heard anyone ask for better commit
>> titles and the github UI / API does a good job tying all commits back to
>> the PR they came from.
>>
>> Andrew
>>
>> On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <ne...@gmail.com>
>> wrote:
>>
>>>> - This creates an immediate need to modify the PR merge script; Raúl
>>>> opened an issue for this after the call [6]; this also raises the
>>>> question of whether we still need the PR merge script or whether
>>>> committers can use the "Squash and merge" button in the GitHub web UI
>>>> instead
>>>>
>>>
>>> I think the only thing the merge script will do better than the UI button
>>> is that the script always gets the merge commit title correct. IIRC in the
>>> GitHub UI, if you merge a PR that has only one commit, it uses the commit
>>> message from that commit and not the PR title. Maybe GitHub has fixed this
>>> (or let it be configurable).
>>>
>>> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>>>
>>> Neal
>>>

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Weston Pace <we...@gmail.com>.
One thing to note is that you need to have something like "closes #123" in
the PR description or a comment in order for GitHub to close the relevant
issue when the PR is merged.  This isn't too much of a burden to check I
think but took a bit of getting used to for me in Substrait where we use
the merge button.
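
To make the "closes #123" requirement concrete, here is a minimal Python sketch of a check that looks for GitHub's documented closing keywords (close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved) in a PR description. The function name issues_closed_by and the exact regular expression are assumptions for illustration; GitHub's own matching rules may differ in edge cases.

```python
import re

# Keywords GitHub documents for linking a PR to an issue it should close.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# Matches e.g. "closes #123", "Fixes: #123", "resolves owner/repo#123".
CLOSING_RE = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\b:?\s+(?:[\w.-]+/[\w.-]+)?#(\d+)",
    re.IGNORECASE,
)


def issues_closed_by(description: str) -> list[int]:
    """Return the issue numbers that the description asks GitHub to close."""
    return [int(number) for number in CLOSING_RE.findall(description)]


if __name__ == "__main__":
    # A plain mention such as "see #5" is not a closing reference.
    print(issues_closed_by("Adds the new reader. Closes #123."))
    print(issues_closed_by("Related work, see #5."))
```

A reviewer (or a bot) could warn when the returned list is empty, since in that case merging the PR would leave the issue open.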

On Mon, Nov 28, 2022, 3:20 AM Joris Van den Bossche <
jorisvandenbossche@gmail.com> wrote:

> On Mon, 28 Nov 2022 at 12:09, Joris Van den Bossche
> <jo...@gmail.com> wrote:
> >
> > FYI: Raúl also already opened a PR to update the merge script to work
> > with github issues: https://github.com/apache/arrow/pull/14731
>
> (sorry, that PR is to update the github actions workflow (the bot that
> comments on PRs), not the merge script)
>
> >
> > Personally I also think that we should consider using the merge button
> > instead of our script (or at least re-evaluate what the script still
> > does better, or might now be redundant). But given that the merge
> > script is almost ready to handle github issues, that is a discussion
> > we can have separate from the direct action to start using github
> > issues (it probably needs to wait until the actual JIRA->github
> > migration of the issues has happened, anyway, since until then the
> > script is needed to handle the mix of JIRA and github issues)
> >
> > Joris
> >
> > On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
> > >
> > > > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> > >
> > > arrow-rs and arrow-datafusion use the squash-and-merge button in the github
> > > UI.
> > >
> > > In general we don't have the same level of curation in commit titles as the
> > > main arrow repo. However, I have not heard anyone ask for better commit
> > > titles and the github UI / API does a good job tying all commits back to
> > > the PR they came from.
> > >
> > > Andrew
> > >
> > > On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <neal.p.richardson@gmail.com>
> > > wrote:
> > >
> > > > > - This creates an immediate need to modify the PR merge script; Raúl
> > > > > opened an issue for this after the call [6]; this also raises the
> > > > > question of whether we still need the PR merge script or whether
> > > > > committers can use the "Squash and merge" button in the GitHub web UI
> > > > > instead
> > > > >
> > > >
> > > > I think the only thing the merge script will do better than the UI button
> > > > is that the script always gets the merge commit title correct. IIRC in the
> > > > GitHub UI, if you merge a PR that has only one commit, it uses the commit
> > > > message from that commit and not the PR title. Maybe GitHub has fixed this
> > > > (or let it be configurable).
> > > >
> > > > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> > > >
> > > > Neal
> > > >
>

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Joris Van den Bossche <jo...@gmail.com>.
On Mon, 28 Nov 2022 at 12:09, Joris Van den Bossche
<jo...@gmail.com> wrote:
>
> FYI: Raúl also already opened a PR to update the merge script to work
> with github issues: https://github.com/apache/arrow/pull/14731

(sorry, that PR is to update the github actions workflow (the bot that
comments on PRs), not the merge script)

>
> Personally I also think that we should consider using the merge button
> instead of our script (or at least re-evaluate what the script still
> does better, or might now be redundant). But given that the merge
> script is almost ready to handle github issues, that is a discussion
> we can have separate from the direct action to start using github
> issues (it probably needs to wait until the actual JIRA->github
> migration of the issues has happened, anyway, since until then the
> script is needed to handle the mix of JIRA and github issues)
>
> Joris
>
> On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
> >
> > > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> >
> > arrow-rs and arrow-datafusion use the squash-and-merge button in the github
> > UI.
> >
> > In general we don't have the same level of curation in commit titles as the
> > main arrow repo. However, I have not heard anyone ask for better commit
> > titles and the github UI / API does a good job tying all commits back to
> > the PR they came from.
> >
> > Andrew
> >
> > On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <ne...@gmail.com>
> > wrote:
> >
> > > > - This creates an immediate need to modify the PR merge script; Raúl
> > > > opened an issue for this after the call [6]; this also raises the
> > > > question of whether we still need the PR merge script or whether
> > > > committers can use the "Squash and merge" button in the GitHub web UI
> > > > instead
> > > >
> > >
> > > I think the only thing the merge script will do better than the UI button
> > > is that the script always gets the merge commit title correct. IIRC in the
> > > GitHub UI, if you merge a PR that has only one commit, it uses the commit
> > > message from that commit and not the PR title. Maybe GitHub has fixed this
> > > (or let it be configurable).
> > >
> > > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> > >
> > > Neal
> > >

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Joris Van den Bossche <jo...@gmail.com>.
FYI: Raúl also already opened a PR to update the merge script to work
with github issues: https://github.com/apache/arrow/pull/14731

Personally I also think that we should consider using the merge button
instead of our script (or at least re-evaluate what the script still
does better, or might now be redundant). But given that the merge
script is almost ready to handle github issues, that is a discussion
we can have separate from the direct action to start using github
issues (it probably needs to wait until the actual JIRA->github
migration of the issues has happened, anyway, since until then the
script is needed to handle the mix of JIRA and github issues)

Joris

On Mon, 28 Nov 2022 at 11:44, Andrew Lamb <al...@influxdata.com> wrote:
>
> > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>
> arrow-rs and arrow-datafusion use the squash-and-merge button in the github
> UI.
>
> In general we don't have the same level of curation in commit titles as the
> main arrow repo. However, I have not heard anyone ask for better commit
> titles and the github UI / API does a good job tying all commits back to
> the PR they came from.
>
> Andrew
>
> On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <ne...@gmail.com>
> wrote:
>
> > > - This creates an immediate need to modify the PR merge script; Raúl
> > > opened an issue for this after the call [6]; this also raises the
> > > question of whether we still need the PR merge script or whether
> > > committers can use the "Squash and merge" button in the GitHub web UI
> > > instead
> > >
> >
> > I think the only thing the merge script will do better than the UI button
> > is that the script always gets the merge commit title correct. IIRC in the
> > GitHub UI, if you merge a PR that has only one commit, it uses the commit
> > message from that commit and not the PR title. Maybe GitHub has fixed this
> > (or let it be configurable).
> >
> > How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
> >
> > Neal
> >

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Andrew Lamb <al...@influxdata.com>.
> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?

arrow-rs and arrow-datafusion use the squash-and-merge button in the github
UI.

In general we don't have the same level of curation in commit titles as the
main arrow repo. However, I have not heard anyone ask for better commit
titles and the github UI / API does a good job tying all commits back to
the PR they came from.

Andrew

On Fri, Nov 25, 2022 at 8:39 PM Neal Richardson <ne...@gmail.com>
wrote:

> > - This creates an immediate need to modify the PR merge script; Raúl
> > opened an issue for this after the call [6]; this also raises the
> > question of whether we still need the PR merge script or whether
> > committers can use the "Squash and merge" button in the GitHub web UI
> > instead
> >
>
> I think the only thing the merge script will do better than the UI button
> is that the script always gets the merge commit title correct. IIRC in the
> GitHub UI, if you merge a PR that has only one commit, it uses the commit
> message from that commit and not the PR title. Maybe GitHub has fixed this
> (or let it be configurable).
>
> How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?
>
> Neal
>

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Neal Richardson <ne...@gmail.com>.
> - This creates an immediate need to modify the PR merge script; Raúl
> opened an issue for this after the call [6]; this also raises the
> question of whether we still need the PR merge script or whether
> committers can use the "Squash and merge" button in the GitHub web UI
> instead
>

I think the only thing the merge script will do better than the UI button
is that the script always gets the merge commit title correct. IIRC in the
GitHub UI, if you merge a PR that has only one commit, it uses the commit
message from that commit and not the PR title. Maybe GitHub has fixed this
(or let it be configurable).

How do apache/arrow-rs, arrow-datafusion, arrow-julia, et al. handle this?

Neal

Re: Arrow sync call November 23 at 12:00 US/Eastern, 17:00 UTC

Posted by Ian Cook <ia...@ursacomputing.com>.
Attendees:

- Percy T. Aucahuasi
- Ian Cook
- Raúl Cumplido
- James Duong
- Todd Farmer
- Alenka Frim
- Stephanie Hazlitt
- Ian Joiner
- David Li
- Rok Mihevc
- Matthew Topol
- Joris Van den Bossche
- Jacob Wujciak

Discussion:

Migration from Jira to GitHub issues

- ASF Infra has disabled creation of new Jira accounts [1]
- The Arrow PMC has voted to move issue tracking to GitHub Issues [2]
- There is ongoing work to migrate existing Jira issues to GitHub
Issues and to improve the user and developer experience with GitHub
Issues [3][4]
- There was some discussion about whether we should stop new Jira
issue creation now and begin to use GitHub Issues for all new issues
(even before the existing Jira issues are migrated)
- An alternative approach would be to have Arrow maintainers open a
Jira issue to represent each GitHub Issue (if there is a fix PR) until
the next release (11.0.0) at which point we can fully migrate off of
Jira. This might make the release process more straightforward but
would create extra work for maintainers in the interim.
- The general consensus on the call was that we should stop new Jira
issue creation now (pending a vote); Todd started a vote on the ML
after the call [5]
- This creates an immediate need to modify the PR merge script; Raúl
opened an issue for this after the call [6]; this also raises the
question of whether we still need the PR merge script or whether
committers can use the "Squash and merge" button in the GitHub web UI
instead
- There was a discussion about whether we should still require
contributors to open Issues before they open PRs; the general
consensus was that this will be unnecessary in many cases (because
GitHub PRs have all the same fields as GitHub Issues); however, changes
that require community discussion before implementation should still
be opened as Issues first
- The consensus was that we should put into practice a policy asking
people to create meaningful descriptions in their PRs; at the next
release when we are finalizing the migration we could consider
adopting a standard convention for PR messages, such as Conventional
Commits [7]; Jacob will send out an ML post about this
- Todd expects that the mechanism for importing Jira issues to GitHub
Issues should be mostly ready in about a week
- Communications and docs updates will be required to inform
contributors and maintainers of the changes; these are listed in [3]
- We will need to do a lot of communications and docs updates
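
For reference, a PR title following Conventional Commits [7] has the shape "type(scope)!: description", where the scope and the "!" breaking-change marker are optional. The following is a minimal Python sketch of how such titles could be validated; the function name parse_title and the regular expression are assumptions for illustration only, and no convention has been adopted yet.

```python
import re
from typing import Optional

# Shape: type, optional "(scope)", optional "!", then ": " and a description.
TITLE_RE = re.compile(
    r"^(?P<type>[a-zA-Z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_title(title: str) -> Optional[dict]:
    """Split a Conventional Commits style title, or return None if it does not match."""
    match = TITLE_RE.match(title)
    if match is None:
        return None
    return {
        "type": match["type"],
        "scope": match["scope"],  # None when no scope is given
        "breaking": match["breaking"] is not None,
        "description": match["description"],
    }


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    print(parse_title("feat(python): add catalog support"))
    print(parse_title("Update docs"))
```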


Proposal for catalog support in Flight SQL

- James started a discussion on the ML about improving support for
catalogs in Flight SQL [8]
- There are open questions in the ML discussion about how this should
be implemented; additional comments are welcome


Flight SQL name

- Some people outside the core Arrow developer community have reported
confusion about the differences between Flight SQL and ADBC, despite
our explanations of the differences (for example: [9])
- Some people have also reported confusion about the Flight SQL name
because it supports Substrait, not only SQL
- There was some discussion about whether we might consider using
"ADBC" as an umbrella name encompassing the client API, client
drivers, and wire protocol; there were some concerns about whether this
makes logical sense and about investments in the current "Flight SQL"
name; more ideas and discussion welcome


[1] https://infra.apache.org/blog/jira-public-signup-disabled.html
[2] https://lists.apache.org/thread/8pmlx3186b32hm36fkqxfj6vp2ltwkf7
[3] https://docs.google.com/document/d/1UaSJs-oyuq8QvlUPoQ9GeiwP19LK5ZzF_5-HLfHDCIg/
[4] https://github.com/apache/arrow/issues?q=is%3Aissue+is%3Aopen+MIGRATION
[5] https://lists.apache.org/thread/v9sjwx8mdg0bfssbrlqz7c0wxwc8dx49
[6] https://github.com/apache/arrow/issues/14720
[7] https://www.conventionalcommits.org/
[8] https://lists.apache.org/thread/fd6r1n7vt91sg2c7fr35wcrsqz6x4645
[9] https://voltrondata.com/resources/update/2022/08/25/simplifying-database-connectivity-with-arrow-flight-sql-and-adbc

On Tue, Nov 22, 2022 at 5:56 PM Ian Cook <ia...@ursacomputing.com> wrote:
>
> Hi all,
>
> Our biweekly sync call is tomorrow at 12:00 noon Eastern time.
>
> The Zoom meeting URL for this and other biweekly Arrow sync calls is:
> https://zoom.us/j/87649033008?pwd=SitsRHluQStlREM0TjJVYkRibVZsUT09
>
> Alternatively, enter this information into the Zoom website or app to
> join the call:
> Meeting ID: 876 4903 3008
> Passcode: 958092
>
> Thanks,
> Ian