Posted to dev@arrow.apache.org by David Li <li...@apache.org> on 2023/01/10 18:58:39 UTC

[ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

The Apache Arrow community is pleased to announce the 0.1.0 release of the Apache Arrow ADBC libraries. It includes 63 resolved GitHub issues ([1]). 

The release is available now from [2] and [3].

Release notes are available at: 
https://github.com/apache/arrow-adbc/blob/apache-arrow-adbc-0.1.0/CHANGELOG.md

What is Apache Arrow?
---------------------
Apache Arrow is a columnar in-memory analytics layer designed to accelerate big data. It houses a set of canonical in-memory representations of flat and hierarchical data along with multiple language-bindings for structure manipulation. It also provides low-overhead streaming and batch messaging, zero-copy interprocess communication (IPC), and vectorized in-memory analytics libraries. Languages currently supported include C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, and Rust.

What is Apache Arrow ADBC?
--------------------------
ADBC is a database access abstraction for Arrow-based applications. It provides a cross-language API for working with databases while using Arrow data, providing an alternative to APIs like JDBC and ODBC for analytical applications. For more, see [4].

Please report any feedback to the mailing lists ([5], [6]).

Regards,
The Apache Arrow Community

[1]: https://github.com/apache/arrow-adbc/issues?q=is%3Aissue+milestone%3A0.1.0+is%3Aclosed
[2]: https://www.apache.org/dyn/closer.cgi/arrow/apache-arrow-adbc-0.1.0
[3]: https://apache.jfrog.io/ui/native/arrow
[4]: https://arrow.apache.org/blog/2023/01/05/introducing-arrow-adbc/
[5]: https://lists.apache.org/list.html?user@arrow.apache.org
[6]: https://lists.apache.org/list.html?dev@arrow.apache.org

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by Will Jones <wi...@gmail.com>.

>
> Thanks for the reference. I feel like this must've been shared earlier but
> I missed it.


If it seems familiar, it was mentioned in this earlier user ML thread [1].

> The other thing I'd be curious about is if we can generalize this subset of
> SQL/Substrait to drivers for other 'storage layers' like Apache Iceberg and
> Apache Hudi.
>

Yeah it's basically Ian Cook's idea of plan delegation [2] to storage
systems. Not sure if it's storage layers in general or table formats
specifically that we might want to define, but it's an interesting idea.

On Substrait, I'm holding off implementing anything for now, in the hopes
that later on we might get the Spark implementation of Delta Lake to align
on Substrait definitions for operations on Delta Lake tables. But if we can
get shared messages with Iceberg and Hudi, that would be even better. Seems
feasible at first glance.

[1] https://lists.apache.org/thread/fywntyryy7pr1ttzv25s3ghf6sqy7zjl
[2] https://youtu.be/5JjaB7p3Sjk?t=550

On Mon, Jan 16, 2023 at 4:01 PM David Li <li...@apache.org> wrote:

> Thanks for the reference. I feel like this must've been shared earlier but
> I missed it.
>
> Another direction I mean to explore: implementing an Arrow Dataset backend
> using ADBC, so that we can feed SQL databases (and now Delta Lake) into
> (Py)Arrow Dataset, and then further into Acero (and the R package's dplyr
> bindings, ...).
>
> The other thing I'd be curious about is if we can generalize this subset
> of SQL/Substrait to drivers for other 'storage layers' like Apache Iceberg
> and Apache Hudi.
>

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by David Li <li...@apache.org>.

Thanks for the reference. I feel like this must've been shared earlier but I missed it.

Another direction I mean to explore: implementing an Arrow Dataset backend using ADBC, so that we can feed SQL databases (and now Delta Lake) into (Py)Arrow Dataset, and then further into Acero (and the R package's dplyr bindings, ...). 

The other thing I'd be curious about is if we can generalize this subset of SQL/Substrait to drivers for other 'storage layers' like Apache Iceberg and Apache Hudi.

On Mon, Jan 16, 2023, at 17:53, Will Jones wrote:
>>
>> You could do something like what Matt Topol's done for Go
>>
>
> Thank you for the link! That's very similar to what I am thinking for Rust.
> I will look at that as a reference. :)
>
>> What do you plan for a "query" to mean to the ADBC Delta Lake driver? Would
>> that be a subset of Substrait that gets mapped to a table scan (with
>> optional filter/selection)?
>>
>
> Reads are basically a Substrait read relation. Other queries like CREATE
> TABLE, DELETE, UPDATE are passed as simple SQL or Substrait queries. And
> then engines can use the driver as a sink (binding output data as a record
> batch stream) for INSERT, OVERWRITE, and MERGE operations. Further details
> are in the design doc [1].
>
> The audience is query engines that want to add Delta Lake support (read,
> write, modify) without getting deep into the details of the format and
> writer protocol. The latter is especially complex. Whereas a database like
> Postgres will validate new data and handle transaction logic, in Delta Lake
> that responsibility falls on each write.
>
> [1]
> https://docs.google.com/document/d/1ud-iBPg8VVz2N3HxySz9qbrffw6a9I7TiGZJ2MBs7ZE/edit?usp=sharing
>
>

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by Will Jones <wi...@gmail.com>.

>
> You could do something like what Matt Topol's done for Go
>

Thank you for the link! That's very similar to what I am thinking for Rust.
I will look at that as a reference. :)

> What do you plan for a "query" to mean to the ADBC Delta Lake driver? Would
> that be a subset of Substrait that gets mapped to a table scan (with
> optional filter/selection)?
>

Reads are basically a Substrait read relation. Other queries like CREATE
TABLE, DELETE, UPDATE are passed as simple SQL or Substrait queries. And
then engines can use the driver as a sink (binding output data as a record
batch stream) for INSERT, OVERWRITE, and MERGE operations. Further details
are in the design doc [1].
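
As a rough illustration of the split described above, here is a minimal
sketch. It is not the driver's actual API: the `Route` enum, the `route`
function, and the "READ" label are hypothetical names, and only the grouping
of operations comes from the paragraph above.

```python
from enum import Enum


class Route(Enum):
    """How a request reaches the driver (hypothetical names)."""

    SUBSTRAIT_READ = "substrait read relation"
    STATEMENT = "SQL or Substrait statement"
    SINK = "bound record batch stream"


# Grouping taken from the message above; the identifiers are assumptions.
_STATEMENTS = {"CREATE TABLE", "DELETE", "UPDATE"}
_SINKS = {"INSERT", "OVERWRITE", "MERGE"}


def route(operation: str) -> Route:
    """Map an operation name to the mechanism that would carry it."""
    op = operation.strip().upper()
    if op == "READ":
        return Route.SUBSTRAIT_READ
    if op in _STATEMENTS:
        return Route.STATEMENT
    if op in _SINKS:
        return Route.SINK
    raise ValueError(f"unsupported operation: {operation}")
```

An engine would presumably pick the mechanism per operation in a similar
way, for example `route("merge")` selects the sink path.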

The audience is query engines that want to add Delta Lake support (read,
write, modify) without getting deep into the details of the format and
writer protocol. The latter is especially complex. Whereas a database like
Postgres will validate new data and handle transaction logic, in Delta Lake
that responsibility falls on each write.

[1]
https://docs.google.com/document/d/1ud-iBPg8VVz2N3HxySz9qbrffw6a9I7TiGZJ2MBs7ZE/edit?usp=sharing


On Mon, Jan 16, 2023 at 2:26 PM David Li <li...@apache.org> wrote:

> Exciting!
>
> You could do something like what Matt Topol's done for Go: define a native
> Go API for ADBC, then a generic adapter to wrap any Go ADBC driver as a C
> one. See [1]. As a bonus,  you can then have a more natural (and safe) API
> for implementing the actual driver, and relegate the fiddly FFI bits to the
> adapter.
>
> What do you plan for a "query" to mean to the ADBC Delta Lake driver?
> Would that be a subset of Substrait that gets mapped to a table scan (with
> optional filter/selection)?
>
> [1]: https://github.com/apache/arrow-adbc/pull/347
>

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by David Li <li...@apache.org>.

Exciting!

You could do something like what Matt Topol's done for Go: define a native Go API for ADBC, then a generic adapter to wrap any Go ADBC driver as a C one. See [1]. As a bonus, you can then have a more natural (and safe) API for implementing the actual driver, and relegate the fiddly FFI bits to the adapter.
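
To make the shape of that pattern concrete, here is a minimal sketch in Python rather than Go or Rust. Everything in it is hypothetical (`NativeDriver`, `CStyleAdapter`, `ToyDriver`, and the status codes are illustrative names, not ADBC's real definitions): the driver author writes against an idiomatic interface, and one generic adapter translates exceptions and return values into the status-code and out-parameter style a C ABI would expect.

```python
from typing import Iterator, Protocol

# Hypothetical status codes standing in for a C-style ABI; not ADBC's values.
STATUS_OK = 0
STATUS_ERROR = 1


class NativeDriver(Protocol):
    """Idiomatic, language-native driver API (hypothetical)."""

    def execute(self, query: str) -> Iterator[dict]:
        ...


class CStyleAdapter:
    """Generic adapter: wraps any NativeDriver behind a status-code interface."""

    def __init__(self, driver: NativeDriver) -> None:
        self._driver = driver
        self.last_error = ""

    def execute(self, query: str, out: list) -> int:
        # C-style convention: results go to an out-parameter and failures
        # become status codes instead of exceptions.
        try:
            rows = list(self._driver.execute(query))
        except Exception as exc:
            self.last_error = str(exc)
            return STATUS_ERROR
        out.extend(rows)
        return STATUS_OK


class ToyDriver:
    """Toy driver that only understands one query."""

    def execute(self, query: str) -> Iterator[dict]:
        if query != "SELECT 1":
            raise ValueError(f"unsupported query: {query}")
        yield {"1": 1}
```

The driver itself never touches the status-code convention; only the adapter does, which is the part that would hold the FFI details in a real implementation.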

What do you plan for a "query" to mean to the ADBC Delta Lake driver? Would that be a subset of Substrait that gets mapped to a table scan (with optional filter/selection)?

[1]: https://github.com/apache/arrow-adbc/pull/347 

On Mon, Jan 16, 2023, at 16:09, Will Jones wrote:
> Andrew and David,
>
> I'm starting to work on the ADBC connector for Delta Lake (in the delta-rs
> repo) [1], written in Rust.
>
> I'm thinking there's some general code I can factor out to make it easier
> for Rust developers to create ADBC drivers. I've created an issue to track
> that in the arrow-rs repo [2]. If there's anyone else planning on working
> with ADBC in Rust, I would be happy to collaborate.
>
> Best,
>
> Will Jones
>
> [1] https://github.com/delta-io/delta-rs/pull/945
> [2] https://github.com/apache/arrow-rs/issues/3540
>

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by Will Jones <wi...@gmail.com>.

Andrew and David,

I'm starting to work on the ADBC connector for Delta Lake (in the delta-rs
repo) [1], written in Rust.

I'm thinking there's some general code I can factor out to make it easier
for Rust developers to create ADBC drivers. I've created an issue to track
that in the arrow-rs repo [2]. If there's anyone else planning on working
with ADBC in Rust, I would be happy to collaborate.

Best,

Will Jones

[1] https://github.com/delta-io/delta-rs/pull/945
[2] https://github.com/apache/arrow-rs/issues/3540

On Sun, Jan 15, 2023 at 5:33 AM Andrew Lamb <al...@influxdata.com> wrote:

> Thanks David -- I think currently the Rust implementation of arrow-flight
> and arrow-sql are being hammered out
>
> There are several projects that are working to implement FlightSQL in
> various stages of completeness (I know of Ballista and IOx) and so I expect
> FlightSQL support to be better in arrow-rs over the next few months. As
> part of that I expect we'll be using the integration tests and contribute
> back to other implementations as needed.
>
>
>

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by Andrew Lamb <al...@influxdata.com>.

Thanks David -- I think the Rust implementations of arrow-flight
and arrow-sql are currently being hammered out.

There are several projects working to implement FlightSQL, in
various stages of completeness (I know of Ballista and IOx), so I expect
FlightSQL support in arrow-rs to improve over the next few months. As
part of that, I expect we'll be using the integration tests and contributing
back to other implementations as needed.



On Sat, Jan 14, 2023 at 9:11 AM David Li <li...@apache.org> wrote:

> Thanks Andrew! Several people helped, particularly Kou, Matt, and Jacob,
> and this release also builds heavily on the nanoarrow project that Dewey is
> spearheading.
>
> I know Rust was neglected for this initial push, but I would like to get
> around to that someday. (If you're interested, feel free to propose
> something or start a discussion. My Rust is too, well, rusty to put forward
> a coherent proposal at the moment.)
>
> -David
>
> On Fri, Jan 13, 2023, at 16:00, Andrew Lamb wrote:
> > Thank you David and everyone else who helped make this happen -- really
> > nice work filling in the Arrow / Database integration story.
> >
> > Andrew

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by David Li <li...@apache.org>.

Thanks Andrew! Several people helped, particularly Kou, Matt, and Jacob, and this release also builds heavily on the nanoarrow project that Dewey is spearheading.

I know Rust was neglected for this initial push, but I would like to get around to that someday. (If you're interested, feel free to propose something or start a discussion. My Rust is too, well, rusty to put forward a coherent proposal at the moment.)

-David

On Fri, Jan 13, 2023, at 16:00, Andrew Lamb wrote:
> Thank you David and everyone else who helped make this happen -- really
> nice work filling in the Arrow / Database integration story.
>
> Andrew

Re: [ANNOUNCE] Apache Arrow ADBC 0.1.0 Released

Posted by Andrew Lamb <al...@influxdata.com>.

Thank you David and everyone else who helped make this happen -- really
nice work filling in the Arrow / Database integration story.

Andrew
