Posted to dev@arrow.apache.org by Andrew Lamb <al...@influxdata.com> on 2021/08/19 18:34:40 UTC

[Rust] [DataFusion] Discuss new API for remote data sources

I wanted to draw attention to a PR [1] in the DataFusion repository that
proposes an API designed to abstract away access to remote data sources, to
allow plugging in sources such as S3.

It has had some great discussion and community engagement so far, but given
that it is a fairly significant change to how I/O is performed in DataFusion
and potentially sets the stage for even greater changes in follow-on PRs,
I wanted to raise it in this forum too.

Andrew

[1] https://github.com/apache/arrow-datafusion/pull/811

Re: [Rust] [DataFusion] Discuss new API for remote data sources

Posted by Andrew Lamb <al...@influxdata.com>.
There is now a design doc [1] available for comment.

[1]
https://docs.google.com/document/d/1ZEZqvdohrot0ewtTNeaBtqczOIJ1Q0OnX9PqMMxpOF8/edit#
