Posted to dev@arrow.apache.org by Wes McKinney <we...@gmail.com> on 2020/04/04 21:08:03 UTC

[DRAFT] Arrow Board Report April 2020

## Description:

The mission of Apache Arrow is the creation and maintenance of software related
to columnar in-memory processing and data interchange. The project has some
level of support for 11 different programming languages.

## Issues:

- We are continuing to work with INFRA on issues related to self-hosted CI
  machines integrated with our GitHub-based pull request workflows. There are
  two avenues we are exploring (and we may well use both of them): GitHub
  Actions Self-hosted and Buildkite. Per INFRA-19217, Buildkite has just been
  approved for the @apache GitHub organization, and we will soon validate that
  we can successfully use this with the free Arrow organization that Buildkite
  has provided us. CI/CD is likely to require an ongoing significant investment
  of time, and we are doing the best we can to avoid overburdening ASF Infra
  with requests.

## Membership Data:

Apache Arrow was founded 2016-01-19 (4 years ago)
There are currently 50 committers and 30 PMC members in this project.
The Committer-to-PMC ratio is 5:3.

Community changes, past quarter:
- Francois Saint-Jacques was added to the PMC on 2020-03-04
- Neal Richardson was added to the PMC on 2020-03-04
- No new committers. Last addition was Joris Van den Bossche on 2019-12-06.

## Project Activity:

- 0.16.0 was released at the end of January. We are close to
  releasing 0.17.0, with a 1.0.0 release hopefully sometime in
  2020.
- We just adopted a "C Data Interface" for the project, which will open many new
  opportunities for integrations with third-party projects.

## Community Health:

The project and contributor base continue to grow in size and
scope. We now have over 400 unique contributors since the
creation of the project.

Re: [DRAFT] Arrow Board Report April 2020

Posted by Wes McKinney <we...@gmail.com>.
Here's the updated board report. I updated it to confirm that
Buildkite is indeed finally working on the @apache organization.

## Description:

The mission of Apache Arrow is the creation and maintenance of software related
to columnar in-memory processing and data interchange. The project has some
level of support for 11 different programming languages.

## Issues:

- We are continuing to work with INFRA on issues related to self-hosted CI
  machines integrated with our GitHub-based pull request workflows. There are
  two avenues we are exploring (and we may well use both of them): GitHub
  Actions Self-hosted and Buildkite. Per INFRA-19217, Buildkite has just been
  approved for the @apache GitHub organization, and we have validated that
  we can successfully use this with the free Arrow organization that Buildkite
  has provided us. CI/CD is likely to require an ongoing significant investment
  of time, and we are doing the best we can to avoid overburdening ASF Infra
  with requests.

## Membership Data:

Apache Arrow was founded 2016-01-19 (4 years ago)
There are currently 50 committers and 30 PMC members in this project.
The Committer-to-PMC ratio is 5:3.

Community changes, past quarter:
- Francois Saint-Jacques was added to the PMC on 2020-03-04
- Neal Richardson was added to the PMC on 2020-03-04
- No new committers. Last addition was Joris Van den Bossche on 2019-12-06.

## Project Activity:

- 0.16.0 was released at the end of January. We are close to
  releasing 0.17.0, with a 1.0.0 release hopefully sometime in
  2020.
- Three months ago, Apache Arrow was accepted for continuous fuzzing in the
  OSS-Fuzz infrastructure.  We have now finally stabilized the situation by
  fixing all detected issues in the Arrow C++ IPC implementation, and are
  actively fixing issues in the Arrow C++ Parquet reader.
- We just adopted a "C Data Interface" for the project, which will open many new
  opportunities for integrations with third-party projects.

## Community Health:

The project and contributor base continue to grow in size and
scope. We now have over 400 unique contributors since the
creation of the project.

On Wed, Apr 8, 2020 at 8:43 AM Wes McKinney <we...@gmail.com> wrote:
>
> Sounds good.
>
> I think it's fine to mention Parquet since presumably some issues will
> be fixed that are relevant to Arrow users that don't affect other
> kinds of Parquet users.
>
> On Wed, Apr 8, 2020 at 8:29 AM Antoine Pitrou <an...@python.org> wrote:
> >
> >
> > - Three months ago, Apache Arrow was accepted for continuous fuzzing in
> > the OSS-Fuzz infrastructure.  We have now finally stabilized the
> > situation by fixing all detected issues in the Arrow C++ IPC
> > implementation, and are actively fixing issues in the Arrow C++ Parquet
> > reader.
> >
> > (XXX not sure Parquet should be mentioned here? is the Parquet C++
> > implementation formally part of the Apache Arrow project)
> >
> >
> > On 08/04/2020 at 15:22, Wes McKinney wrote:
> > > Yes, definitely, can you propose a paragraph for the Project Activity section?
> > >
> > > On Wed, Apr 8, 2020 at 8:10 AM Antoine Pitrou <an...@python.org> wrote:
> > >>
> > >>
> > >> Is it worth mentioning the OSS-Fuzz integration (and "success story")?
> > >>
> > >> On 08/04/2020 at 15:05, Wes McKinney wrote:
> > >>> The report is due today. Are there any more comments?

Re: [DRAFT] Arrow Board Report April 2020

Posted by Wes McKinney <we...@gmail.com>.
Sounds good.

I think it's fine to mention Parquet since presumably some issues will
be fixed that are relevant to Arrow users that don't affect other
kinds of Parquet users.

On Wed, Apr 8, 2020 at 8:29 AM Antoine Pitrou <an...@python.org> wrote:
>
>
> - Three months ago, Apache Arrow was accepted for continuous fuzzing in
> the OSS-Fuzz infrastructure.  We have now finally stabilized the
> situation by fixing all detected issues in the Arrow C++ IPC
> implementation, and are actively fixing issues in the Arrow C++ Parquet
> reader.
>
> (XXX not sure Parquet should be mentioned here? is the Parquet C++
> implementation formally part of the Apache Arrow project)
>
>
> On 08/04/2020 at 15:22, Wes McKinney wrote:
> > Yes, definitely, can you propose a paragraph for the Project Activity section?
> >
> > On Wed, Apr 8, 2020 at 8:10 AM Antoine Pitrou <an...@python.org> wrote:
> >>
> >>
> >> Is it worth mentioning the OSS-Fuzz integration (and "success story")?
> >>
> >> On 08/04/2020 at 15:05, Wes McKinney wrote:
> >>> The report is due today. Are there any more comments?

Re: [DRAFT] Arrow Board Report April 2020

Posted by Antoine Pitrou <an...@python.org>.
- Three months ago, Apache Arrow was accepted for continuous fuzzing in
the OSS-Fuzz infrastructure.  We have now finally stabilized the
situation by fixing all detected issues in the Arrow C++ IPC
implementation, and are actively fixing issues in the Arrow C++ Parquet
reader.

(XXX not sure Parquet should be mentioned here? is the Parquet C++
implementation formally part of the Apache Arrow project)


On 08/04/2020 at 15:22, Wes McKinney wrote:
> Yes, definitely, can you propose a paragraph for the Project Activity section?
> 
> On Wed, Apr 8, 2020 at 8:10 AM Antoine Pitrou <an...@python.org> wrote:
>>
>>
>> Is it worth mentioning the OSS-Fuzz integration (and "success story")?
>>
>> On 08/04/2020 at 15:05, Wes McKinney wrote:
>>> The report is due today. Are there any more comments?

Re: [DRAFT] Arrow Board Report April 2020

Posted by Wes McKinney <we...@gmail.com>.
Yes, definitely, can you propose a paragraph for the Project Activity section?

On Wed, Apr 8, 2020 at 8:10 AM Antoine Pitrou <an...@python.org> wrote:
>
>
> Is it worth mentioning the OSS-Fuzz integration (and "success story")?
>
> On 08/04/2020 at 15:05, Wes McKinney wrote:
> > The report is due today. Are there any more comments?

Re: [DRAFT] Arrow Board Report April 2020

Posted by Antoine Pitrou <an...@python.org>.
Is it worth mentioning the OSS-Fuzz integration (and "success story")?

On 08/04/2020 at 15:05, Wes McKinney wrote:
> The report is due today. Are there any more comments?

Re: [DRAFT] Arrow Board Report April 2020

Posted by Wes McKinney <we...@gmail.com>.
The report is due today. Are there any more comments?
