Posted to dev@arrow.apache.org by Jacques Nadeau <ja...@apache.org> on 2016/10/13 01:09:29 UTC

[DRAFT REPORT] Apache Arrow October 2016

Hey Guys, any feedback on the Arrow report?

## Description:
Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:

- There are no issues requiring board attention at this time.

## Activity:
- Arrow made its first release.
- In preparation of the release, multiple discussions were focused on
  formalizing various Arrow specification details.
- Discussion was good and we reworked some integration to invert the
  dependency model between the Parquet project and the Arrow project.
- A new Arrow file format was defined and implemented in both Java and C++.
- Community members covered Arrow at multiple conferences including Strata
  NYC.
- Arrow <> Parquet interchange has been made available in both Java and
  C++.
- The new Arrow file format is planned to be used to move forward on both
  cross-language IPC implementations and enabling cross-language
  compatibility tests.

## Health report:
- The first release is a good step in engaging a broader range of
  contributors and users. Having bits for use, albeit alpha, allows us to
  engage a wider range of engineers.
- We need to continue to add new examples and more documentation to better
  describe how to use and extend Arrow.

## PMC changes:

- Currently 17 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Abdel Hakim Deneche on Tue Jan 19 2016

## Committer base changes:

- Currently 20 committers.
- No new committers added in the last 3 months
- Last committer addition was Ippokratis Pandis at Thu Feb 18 2016

## Releases:

- 0.1.0 was released on Wed Oct 12 2016

## JIRA activity:

- 95 JIRA tickets created in the last 3 months
- 73 JIRA tickets closed/resolved in the last 3 months

Re: [DRAFT REPORT] Apache Arrow October 2016

Posted by Julien Le Dem <ju...@dremio.com>.
+1




-- 
Julien

Re: [DRAFT REPORT] Apache Arrow October 2016

Posted by Jacques Nadeau <ja...@apache.org>.
Thanks for the feedback all. Final report I submitted below.

Uwe, good catch on Java <> Parquet. Mental hiccup on my part...we don't
have that yet :P


## Description:
Arrow is a columnar in-memory analytics layer designed to accelerate big data.
It houses a set of canonical in-memory representations of flat and
hierarchical data along with multiple language-bindings for structure
manipulation. It also provides IPC and common algorithm implementations.

## Issues:

- There are no issues requiring board attention at this time.

## Activity:
- Arrow made its first release.
- In preparation of the release, multiple discussions were focused on
  formalizing various Arrow specification details.
- Discussion was good and we reworked some integration to invert the
  dependency model between the Parquet project and the Arrow project.
- A new Arrow file format was defined and implemented in both Java and C++.
  It is also available from Python.
- Community members covered Arrow at multiple conferences including Strata
  NYC.
- Arrow <> Parquet interchange has been made available in C++.
- The new Arrow file format is planned to be used to move forward on both
  cross-language IPC implementations and enabling cross-language
  compatibility tests.
- We've seen good growth in the Arrow developer mailing list, having
  increased to 467 subscribers (up 43 in the last 3 months).

## Health report:
- The first release is a good step in engaging a broader range of
  contributors and users. Having bits for use, albeit alpha, allows us to
  engage a wider range of engineers.
- We need to continue to add new examples and more documentation to better
  describe how to use and extend Arrow.

## PMC changes:

- Currently 17 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Abdel Hakim Deneche on Tue Jan 19 2016

## Committer base changes:

- Currently 20 committers.
- No new committers added in the last 3 months
- Last committer addition was Ippokratis Pandis at Thu Feb 18 2016

## Releases:

- 0.1.0 was released on Wed Oct 12 2016

## JIRA activity:

- 95 JIRA tickets created in the last 3 months
- 73 JIRA tickets closed/resolved in the last 3 months


Re: [DRAFT REPORT] Apache Arrow October 2016

Posted by Julian Hyde <jh...@apache.org>.
I know the board discourages including the mailing list stats, but they’re not too shabby. Especially the subscriber count. Worth mentioning email activity even if you skip the numbers.


- dev@arrow.apache.org:
    - 467 subscribers (up 43 in the last 3 months)
    - 615 emails sent to list (504 in previous quarter)

- issues@arrow.apache.org:
    - 6 subscribers (up 0 in the last 3 months)
    - 399 emails sent to list (338 in previous quarter)



Re: [DRAFT REPORT] Apache Arrow October 2016

Posted by Uwe Korn <uw...@xhochy.com>.
See inline comments.


On 13.10.16 03:09, Jacques Nadeau wrote:
> Hey Guys, any feedback on the Arrow report?
>
> ## Description:
> Arrow is a columnar in-memory analytics layer designed to accelerate big
> data. It houses a set of canonical in-memory representations of flat and
> hierarchical data along with multiple language-bindings for structure
> manipulation. It also provides IPC and common algorithm implementations.
>
> ## Issues:
>
> - There are no issues requiring board attention at this time.
>
> ## Activity:
> - Arrow made its first release.
> - In preparation of the release, multiple discussions were focused on
>    formalizing various Arrow specification details.
> - Discussion was good and we reworked some integration to invert the
>    dependency model between the Parquet project and the Arrow project.
> - A new Arrow file format was defined and implemented in both Java and C++.
Also available in Python. Although just a small layer on top of C++, 
still worth mentioning.
> - Community members covered Arrow at multiple conferences including Strata
>    NYC.
> - Arrow <> Parquet interchange has been made available in both Java and
>    C++.
Where does the Java implementation of the Parquet interchange live?
> - The new Arrow file format is planned to be used to move forward on both
>    cross-language IPC implementations and enabling cross-language
>    compatibility tests.
>
> ## Health report:
> - The first release is a good step in engaging a broader range of
>    contributors and users. Having bits for use, albeit alpha, allows us to
>    engage a wider range of engineers.
> - We need to continue to add new examples and more documentation to better
>    describe how to use and extend Arrow.
>
> ## PMC changes:
>
> - Currently 17 PMC members.
> - No new PMC members added in the last 3 months
> - Last PMC addition was Abdel Hakim Deneche on Tue Jan 19 2016
>
> ## Committer base changes:
>
> - Currently 20 committers.
> - No new committers added in the last 3 months
> - Last committer addition was Ippokratis Pandis at Thu Feb 18 2016
>
> ## Releases:
>
> - 0.1.0 was released on Wed Oct 12 2016
>
> ## JIRA activity:
>
> - 95 JIRA tickets created in the last 3 months
> - 73 JIRA tickets closed/resolved in the last 3 months
Uwe