Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2020/02/11 19:06:33 UTC

[VOTE] Adopt Arrow in-process C Data Interface specification

Hello,

We have been discussing the creation of a minimalist C-based data
interface for applications to exchange Arrow columnar data structures
with each other. Some notable features of this interface include:

* A small amount of header-only C code can be copied independently into
third-party libraries and downstream applications; no dependencies are
needed, even on Arrow C++ itself (notably, it is not required to use
Flatbuffers, though there are trade-offs resulting from this).

* Low development investment (in other words: limited-scope use cases
can be accomplished with little code), so as to enable C or C++
libraries to export Arrow columnar data with minimal code.

* Data lifetime management hooks so as to properly handle non-trivial
data sharing (for example passing Arrow columnar data to an async
processing consumer).
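
The lifetime hook idea in the last bullet can be sketched roughly as
follows. This is only an illustration under stated assumptions:
DemoArray, demo_export, demo_release and demo_consume are hypothetical
names, and the struct is a simplified stand-in rather than the layout
defined in the spec document. The point it shows is that the producer
attaches a release callback, and the consumer calls it once it is done
with the data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for an exported array.
 * This is NOT the struct layout defined by the specification. */
struct DemoArray {
  int64_t length;
  const int32_t* values;                   /* buffer owned by the producer */
  void (*release)(struct DemoArray* self); /* lifetime hook; NULL once released */
  void* private_data;                      /* opaque producer-side state */
};

/* Producer-side hook: frees the buffer and marks the struct as released. */
static void demo_release(struct DemoArray* self) {
  free(self->private_data);
  self->values = NULL;
  self->private_data = NULL;
  self->release = NULL;
}

/* Producer: allocate `length` values (0, 1, 2, ...) and fill in `out`.
 * Assumes length > 0. Returns 0 on success, -1 on allocation failure. */
static int demo_export(int64_t length, struct DemoArray* out) {
  int32_t* data = malloc((size_t)length * sizeof *data);
  if (data == NULL) return -1;
  for (int64_t i = 0; i < length; ++i) data[i] = (int32_t)i;
  out->length = length;
  out->values = data;
  out->release = demo_release;
  out->private_data = data;
  return 0;
}

/* Consumer: sum the values, then hand the data back through the hook.
 * The consumer never calls free() itself; it only knows the callback. */
static int64_t demo_consume(struct DemoArray* array) {
  assert(array->release != NULL); /* must not already be released */
  int64_t sum = 0;
  for (int64_t i = 0; i < array->length; ++i) sum += array->values[i];
  array->release(array);
  return sum;
}
```

A caller would declare a struct DemoArray, fill it with
demo_export(4, &array), and pass it to demo_consume(&array); afterwards
array.release is NULL, which signals that the data must no longer be
touched. Because the consumer only goes through the callback, it can
hold on to the struct for as long as it needs (for example on another
thread) without knowing how the producer allocated the memory.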

This "C Data Interface" serves different use cases from the
language-independent IPC protocol and trades away a number of features
in the interest of minimalism / simplicity. It is not a replacement for
the IPC protocol and will only be used to interchange in-process data at
C or C++ call sites.

The PR providing the specification is here:
https://github.com/apache/arrow/pull/5442

In particular, you can read the spec document here:
https://github.com/pitrou/arrow/blob/doc-c-data-interface2/docs/source/format/CDataInterface.rst

A fairly comprehensive C++ implementation of this demonstrating its
use is found here:
https://github.com/apache/arrow/pull/5608

(note that other applications implementing the interface may choose to
only support a few features and thus have far less code to write)

Please vote to adopt the SPECIFICATION (GitHub PR #5442).

This vote will be open for at least 72 hours.

[ ] +1 Adopt C Data Interface specification
[ ] +0
[ ] -1 Do not adopt because...

Thank you

Regards

Antoine.


(PS: yes, this is in large part a copy/paste of Wes's previous vote
email :-))

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Wes McKinney <we...@gmail.com>.
A week has passed, I would say we should move forward with merging
patches related to this. Any last words (in the next 12 hours or so)?

On Tue, Feb 18, 2020 at 7:48 AM Krisztián Szűcs
<sz...@gmail.com> wrote:
>
> +1 (binding)
>
> On Tue, Feb 18, 2020 at 10:47 AM Antoine Pitrou <so...@pitrou.net> wrote:
> >
> >
> > There has also been interest from DuckDB:
> > https://github.com/cwida/duckdb/issues/151
> >
> > Regards
> >
> > Antoine.
> >
> >
> > On Tue, 18 Feb 2020 02:37:43 -0600
> > Wes McKinney <we...@gmail.com> wrote:
> > > As I recall TFX developers weighed in that this would be helpful for
> > > > TensorFlow-related use cases where there are concerns about C++ ABI
> > > compatibility. Since this project has been ongoing for about 5 months
> > > (see also related discussion around implementation guidelines for
> > > third parties [1]) there has been a lot of time for people to have a
> > > look
> > >
> > > [1]: https://lists.apache.org/thread.html/b7c2094ac4e11ffce46914b603e16b6bba8f235bc6465f3ab6d320d5%40%3Cdev.arrow.apache.org%3E
> > >
> > > On Mon, Feb 17, 2020 at 11:19 PM Micah Kornfield <em...@gmail.com> wrote:
> > > >
> > > > I reviewed the spec again (not the implementation).  I'm +1 on this.
> > > >
> > > > I was wondering if we shared/received feedback on this with any other
> > > > communities?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > >
> > > >
> > > > On Sun, Feb 16, 2020 at 8:13 PM Micah Kornfield <em...@gmail.com>
> > > > wrote:
> > > >
> > > > > I will try to review tomorrow and cast a vote.
> > > > >
> > > > > On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:
> > > > >
> > > > >> There is only 1 binding +1 vote so far, we should probably wait for
> > > > >> three before closing the vote (it's possible that lazy consensus could
> > > > >> be employed here but not much harm in waiting a few more days)
> > > > >>
> > > > >> On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
> > > > >> <fs...@gmail.com> wrote:
> > > > >> >
> > > > >> > +1
> > > > >> >
> > > > >> > On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:
> > > > >> > >
> > > > >> > > +1 (binding)
> > > > >> > >
> > > > > >> > > On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com>
> > > > >> wrote:
> > > > >> > >
> > > > >> > > > +1 (binding)
> > > > >> > > >
> > > > >> > > > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org>
> > > > >> wrote:
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > Ah, you're right, it's PR 6040:
> > > > >> > > > > https://github.com/apache/arrow/pull/6040
> > > > >> > > > >
> > > > >> > > > > Similarly, the C++ implementation is at PR 6026:
> > > > >> > > > > https://github.com/apache/arrow/pull/6026
> > > > >> > > > >
> > > > > >> > > > > Regards
> > > > >> > > > >
> > > > >> > > > > Antoine.
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > Le 11/02/2020 à 23:17, Wes McKinney a écrit :
> > > > >> > > > > > hi Antoine, PR 5442 seems to no longer be the right one. Which
> > > > >> open PR
> > > > >> > > > > > contains the specification now?
> > > > >> > > > > >

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Krisztián Szűcs <sz...@gmail.com>.
+1 (binding)

On Tue, Feb 18, 2020 at 10:47 AM Antoine Pitrou <so...@pitrou.net> wrote:
>
>
> There has also been interest from DuckDB:
> https://github.com/cwida/duckdb/issues/151
>
> Regards
>
> Antoine.
>
>
> On Tue, 18 Feb 2020 02:37:43 -0600
> Wes McKinney <we...@gmail.com> wrote:
> > As I recall TFX developers weighed in that this would be helpful for
> > TensorFlow-related use cases where there are concerns about C++ ABI
> > compatibility. Since this project has been ongoing for about 5 months
> > (see also related discussion around implementation guidelines for
> > third parties [1]) there has been a lot of time for people to have a
> > look
> >
> > [1]: https://lists.apache.org/thread.html/b7c2094ac4e11ffce46914b603e16b6bba8f235bc6465f3ab6d320d5%40%3Cdev.arrow.apache.org%3E
> >
> > On Mon, Feb 17, 2020 at 11:19 PM Micah Kornfield <em...@gmail.com> wrote:
> > >
> > > I reviewed the spec again (not the implementation).  I'm +1 on this.
> > >
> > > I was wondering if we shared/received feedback on this with any other
> > > communities?
> > >
> > > Thanks,
> > > Micah
> > >
> > >
> > >
> > > On Sun, Feb 16, 2020 at 8:13 PM Micah Kornfield <em...@gmail.com>
> > > wrote:
> > >
> > > > I will try to review tomorrow and cast a vote.
> > > >
> > > > On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:
> > > >
> > > >> There is only 1 binding +1 vote so far, we should probably wait for
> > > >> three before closing the vote (it's possible that lazy consensus could
> > > >> be employed here but not much harm in waiting a few more days)
> > > >>
> > > >> On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
> > > >> <fs...@gmail.com> wrote:
> > > >> >
> > > >> > +1
> > > >> >
> > > >> > On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:
> > > >> > >
> > > >> > > +1 (binding)
> > > >> > >
> > > > >> > > On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com>
> > > >> wrote:
> > > >> > >
> > > >> > > > +1 (binding)
> > > >> > > >
> > > >> > > > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org>
> > > >> wrote:
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > Ah, you're right, it's PR 6040:
> > > >> > > > > https://github.com/apache/arrow/pull/6040
> > > >> > > > >
> > > >> > > > > Similarly, the C++ implementation is at PR 6026:
> > > >> > > > > https://github.com/apache/arrow/pull/6026
> > > >> > > > >
> > > > >> > > > > Regards
> > > >> > > > >
> > > >> > > > > Antoine.
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > Le 11/02/2020 à 23:17, Wes McKinney a écrit :
> > > >> > > > > > hi Antoine, PR 5442 seems to no longer be the right one. Which
> > > >> open PR
> > > >> > > > > > contains the specification now?
> > > >> > > > > >

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Antoine Pitrou <so...@pitrou.net>.
There has also been interest from DuckDB:
https://github.com/cwida/duckdb/issues/151

Regards

Antoine.


On Tue, 18 Feb 2020 02:37:43 -0600
Wes McKinney <we...@gmail.com> wrote:
> As I recall TFX developers weighed in that this would be helpful for
> TensorFlow-related use cases where there are concerns about C++ ABI
> compatibility. Since this project has been ongoing for about 5 months
> (see also related discussion around implementation guidelines for
> third parties [1]) there has been a lot of time for people to have a
> look
> 
> [1]: https://lists.apache.org/thread.html/b7c2094ac4e11ffce46914b603e16b6bba8f235bc6465f3ab6d320d5%40%3Cdev.arrow.apache.org%3E
> 
> On Mon, Feb 17, 2020 at 11:19 PM Micah Kornfield <em...@gmail.com> wrote:
> >
> > I reviewed the spec again (not the implementation).  I'm +1 on this.
> >
> > I was wondering if we shared/received feedback on this with any other
> > communities?
> >
> > Thanks,
> > Micah
> >
> >
> >
> > On Sun, Feb 16, 2020 at 8:13 PM Micah Kornfield <em...@gmail.com>
> > wrote:
> >  
> > > I will try to review tomorrow and cast a vote.
> > >
> > > On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:
> > >  
> > >> There is only 1 binding +1 vote so far, we should probably wait for
> > >> three before closing the vote (it's possible that lazy consensus could
> > >> be employed here but not much harm in waiting a few more days)
> > >>
> > >> On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
> > >> <fs...@gmail.com> wrote:  
> > >> >
> > >> > +1
> > >> >
> > >> > On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:  
> > >> > >
> > >> > > +1 (binding)
> > >> > >
> > >> > > On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com>  
> > >> wrote:  
> > >> > >  
> > >> > > > +1 (binding)
> > >> > > >
> > >> > > > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org>  
> > >> wrote:  
> > >> > > > >
> > >> > > > >
> > >> > > > > Ah, you're right, it's PR 6040:
> > >> > > > > https://github.com/apache/arrow/pull/6040
> > >> > > > >
> > >> > > > > Similarly, the C++ implementation is at PR 6026:
> > >> > > > > https://github.com/apache/arrow/pull/6026
> > >> > > > >
> > >> > > > > Regards
> > >> > > > >
> > >> > > > > Antoine.
> > >> > > > >
> > >> > > > >
> > >> > > > > Le 11/02/2020 à 23:17, Wes McKinney a écrit :  
> > >> > > > > > hi Antoine, PR 5442 seems to no longer be the right one. Which  
> > >> open PR  
> > >> > > > > > contains the specification now?
> > >> > > > > >

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Wes McKinney <we...@gmail.com>.
As I recall TFX developers weighed in that this would be helpful for
TensorFlow-related use cases where there are concerns about C++ ABI
compatibility. Since this project has been ongoing for about 5 months
(see also related discussion around implementation guidelines for
third parties [1]) there has been a lot of time for people to have a
look.

[1]: https://lists.apache.org/thread.html/b7c2094ac4e11ffce46914b603e16b6bba8f235bc6465f3ab6d320d5%40%3Cdev.arrow.apache.org%3E

On Mon, Feb 17, 2020 at 11:19 PM Micah Kornfield <em...@gmail.com> wrote:
>
> I reviewed the spec again (not the implementation).  I'm +1 on this.
>
> I was wondering if we shared/received feedback on this with any other
> communities?
>
> Thanks,
> Micah
>
>
>
> On Sun, Feb 16, 2020 at 8:13 PM Micah Kornfield <em...@gmail.com>
> wrote:
>
> > I will try to review tomorrow and cast a vote.
> >
> > On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:
> >
> >> There is only 1 binding +1 vote so far, we should probably wait for
> >> three before closing the vote (it's possible that lazy consensus could
> >> be employed here but not much harm in waiting a few more days)
> >>
> >> On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
> >> <fs...@gmail.com> wrote:
> >> >
> >> > +1
> >> >
> >> > On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:
> >> > >
> >> > > +1 (binding)
> >> > >
> >> > > On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com>
> >> wrote:
> >> > >
> >> > > > +1 (binding)
> >> > > >
> >> > > > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org>
> >> wrote:
> >> > > > >
> >> > > > >
> >> > > > > Ah, you're right, it's PR 6040:
> >> > > > > https://github.com/apache/arrow/pull/6040
> >> > > > >
> >> > > > > Similarly, the C++ implementation is at PR 6026:
> >> > > > > https://github.com/apache/arrow/pull/6026
> >> > > > >
> >> > > > > Regards
> >> > > > >
> >> > > > > Antoine.
> >> > > > >
> >> > > > >
> >> > > > > Le 11/02/2020 à 23:17, Wes McKinney a écrit :
> >> > > > > > hi Antoine, PR 5442 seems to no longer be the right one. Which
> >> open PR
> >> > > > > > contains the specification now?
> >> > > > > >

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Micah Kornfield <em...@gmail.com>.
I reviewed the spec again (not the implementation).  I'm +1 on this.

I was wondering if we shared/received feedback on this with any other
communities?

Thanks,
Micah



On Sun, Feb 16, 2020 at 8:13 PM Micah Kornfield <em...@gmail.com>
wrote:

> I will try to review tomorrow and cast a vote.
>
> On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:
>
>> There is only 1 binding +1 vote so far, we should probably wait for
>> three before closing the vote (it's possible that lazy consensus could
>> be employed here but not much harm in waiting a few more days)
>>
>> On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
>> <fs...@gmail.com> wrote:
>> >
>> > +1
>> >
>> > On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:
>> > >
>> > > +1 (binding)
>> > >
>> > > On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com>
>> wrote:
>> > >
>> > > > +1 (binding)
>> > > >
>> > > > On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org>
>> wrote:
>> > > > >
>> > > > >
>> > > > > Ah, you're right, it's PR 6040:
>> > > > > https://github.com/apache/arrow/pull/6040
>> > > > >
>> > > > > Similarly, the C++ implementation is at PR 6026:
>> > > > > https://github.com/apache/arrow/pull/6026
>> > > > >
>> > > > > Regards
>> > > > >
>> > > > > Antoine.
>> > > > >
>> > > > >
>> > > > > Le 11/02/2020 à 23:17, Wes McKinney a écrit :
>> > > > > > hi Antoine, PR 5442 seems to no longer be the right one. Which
>> open PR
>> > > > > > contains the specification now?
>> > > > > >

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Micah Kornfield <em...@gmail.com>.
I will try to review tomorrow and cast a vote.

On Fri, Feb 14, 2020 at 5:41 AM Wes McKinney <we...@gmail.com> wrote:

> There is only 1 binding +1 vote so far, we should probably wait for
> three before closing the vote (it's possible that lazy consensus could
> be employed here but not much harm in waiting a few more days)

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Wes McKinney <we...@gmail.com>.
There is only 1 binding +1 vote so far; we should probably wait for
three before closing the vote (it's possible that lazy consensus could
be employed here, but there is not much harm in waiting a few more days).

On Thu, Feb 13, 2020 at 8:15 PM Francois Saint-Jacques
<fs...@gmail.com> wrote:
>
> +1

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Francois Saint-Jacques <fs...@gmail.com>.
+1

On Thu, Feb 13, 2020 at 9:08 PM Fan Liya <li...@gmail.com> wrote:
>
> +1 (binding)

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Fan Liya <li...@gmail.com>.
+1 (binding)

On Thu, Feb 13, 2020 at 11:52 AM Wes McKinney <we...@gmail.com> wrote:

> +1 (binding)

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Wes McKinney <we...@gmail.com>.
+1 (binding)

On Tue, Feb 11, 2020 at 4:29 PM Antoine Pitrou <an...@python.org> wrote:
>
>
> Ah, you're right, it's PR 6040:
> https://github.com/apache/arrow/pull/6040
>
> Similarly, the C++ implementation is at PR 6026:
> https://github.com/apache/arrow/pull/6026
>
> Regards
>
> Antoine.

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Antoine Pitrou <an...@python.org>.
Ah, you're right, it's PR 6040:
https://github.com/apache/arrow/pull/6040

Similarly, the C++ implementation is at PR 6026:
https://github.com/apache/arrow/pull/6026

Regards

Antoine.


On 11/02/2020 at 23:17, Wes McKinney wrote:
> Hi Antoine, PR 5442 seems to no longer be the right one. Which open PR
> contains the specification now?

Re: [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Wes McKinney <we...@gmail.com>.
Hi Antoine, PR 5442 seems to no longer be the right one. Which open PR
contains the specification now?


[RESULT] [VOTE] Adopt Arrow in-process C Data Interface specification

Posted by Antoine Pitrou <so...@pitrou.net>.
Hello,

The vote succeeds with 3 +1 (binding) and 2 +1 (non-binding).

I'll soon open a JIRA for the specification and the C++ implementation,
so that we can merge them in a timely manner.

Regards

Antoine.


