Posted to dev@beam.apache.org by Andrew Pilloud <ap...@google.com> on 2018/06/06 16:24:36 UTC

Beam SQL Pipeline Options

We are just about to the point of having a working pure SQL workflow for
Beam! One of the last things that remains is how to configure Pipeline
Options via a SQL shell. I have written up a proposal to use the SET
statement, for example "SET runner=DataflowRunner". I'm looking for
feedback, particularly on what will make for the best user experience.
Please take a look and comment:

https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing

Andrew
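
To make the example above concrete, here is a minimal sketch of how a
statement such as "SET runner=DataflowRunner" could be translated into the
"--name=value" argument form that Beam's PipelineOptionsFactory.fromArgs
accepts. This is an illustration only: the class SetStatementSketch, its
methods, and the accepted grammar (case-insensitive SET, optional trailing
semicolon) are assumptions and are not taken from the proposal document or
from Beam itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam: converts "SET name=value"
// statements into "--name=value" argument strings.
final class SetStatementSketch {
    // Assumed grammar: SET <name> = <value>, with an optional trailing semicolon.
    private static final Pattern SET = Pattern.compile(
        "^\\s*SET\\s+(\\w+)\\s*=\\s*(.*?)\\s*;?\\s*$", Pattern.CASE_INSENSITIVE);

    // Converts one SET statement; rejects anything that does not match the grammar.
    static String toArg(String statement) {
        Matcher m = SET.matcher(statement);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a SET statement: " + statement);
        }
        return "--" + m.group(1) + "=" + m.group(2);
    }

    // Converts a list of SET statements, preserving their order.
    static String[] toArgs(List<String> statements) {
        List<String> out = new ArrayList<>();
        for (String s : statements) {
            out.add(toArg(s));
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // prints --runner=DataflowRunner
        System.out.println(toArg("SET runner=DataflowRunner"));
    }
}
```

With the Beam SDK on the classpath, the resulting array could then be handed
to PipelineOptionsFactory.fromArgs(...). Whether the actual proposal works
this way is not stated in this thread.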

Re: Beam SQL Pipeline Options

Posted by Kenneth Knowles <kl...@google.com>.
Hi Arun,

If you are looking to connect to a SQL database from Beam Java code, then
you might want JdbcIO [1] with a PostgreSQL JDBC driver [2]. This is not
currently wired up to Beam SQL specially; you have to connect in Java code.
It would be a nice contribution.

Kenn

[1]
https://beam.apache.org/documentation/sdks/javadoc/2.4.0/index.html?org/apache/beam/sdk/io/jdbc/JdbcIO.html
[2] https://jdbc.postgresql.org/
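
For the PostgreSQL case described above, the two strings that a JDBC-based
connection needs are the driver class name and the connection URL. The sketch
below only assembles them with the standard library. The class
PostgresJdbcSketch and the host, port, and database values are placeholders
chosen for illustration, not values from this thread.

```java
// Hypothetical helper, not part of Beam or the PostgreSQL driver.
final class PostgresJdbcSketch {
    // Driver class provided by the PostgreSQL JDBC driver jar [2].
    static final String DRIVER_CLASS = "org.postgresql.Driver";

    // Builds a URL of the form jdbc:postgresql://host:port/database.
    static String jdbcUrl(String host, int port, String database) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Invalid port: " + port);
        }
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder values; 5432 is the default PostgreSQL port.
        String url = jdbcUrl("localhost", 5432, "mydb");
        // prints jdbc:postgresql://localhost:5432/mydb
        System.out.println(url);
        // With Beam's JdbcIO module and the driver on the classpath, these
        // strings would be passed to
        // JdbcIO.DataSourceConfiguration.create(DRIVER_CLASS, url).
    }
}
```

The JdbcIO Javadoc [1] covers the remaining steps, such as supplying
credentials, a query, and a row mapper.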

On Wed, Jun 6, 2018 at 9:49 PM arun kumar <su...@gmail.com> wrote:

> Hi
>
> Thanks for the update.
> Can you please share me if you have any documentation for connecting to
> postgres using beam code.
>
> Thanks
> Arun
>
> On Wed, Jun 6, 2018, 9:54 PM Andrew Pilloud <ap...@google.com> wrote:
>
>> We are just about to the point of having a working pure SQL workflow for
>> Beam! One of the last things that remains is how to configure Pipeline
>> Options via a SQL shell. I have written up a proposal to use the set
>> statement, for example "SET runner=DataflowRunner". I'm looking for
>> feedback, particularly on what will make for the best user experience.
>> Please take a look and comment:
>>
>>
>> https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing
>>
>> Andrew
>>
>

Re: Beam SQL Pipeline Options

Posted by arun kumar <su...@gmail.com>.
Hi

Thanks for the update.
Could you please share any documentation you have for connecting to
Postgres from Beam code?

Thanks
Arun

On Wed, Jun 6, 2018, 9:54 PM Andrew Pilloud <ap...@google.com> wrote:

> We are just about to the point of having a working pure SQL workflow for
> Beam! One of the last things that remains is how to configure Pipeline
> Options via a SQL shell. I have written up a proposal to use the set
> statement, for example "SET runner=DataflowRunner". I'm looking for
> feedback, particularly on what will make for the best user experience.
> Please take a look and comment:
>
>
> https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing
>
> Andrew
>

Re: Beam SQL Pipeline Options

Posted by Andrew Pilloud <ap...@google.com>.
I've turned this into a PR, more discussion going on over there:
https://github.com/apache/beam/pull/5592

Andrew

On Wed, Jun 6, 2018 at 9:46 PM Kenneth Knowles <kl...@google.com> wrote:

> This is a nice short design discussion doc, and perhaps a cooler piece of
> news hidden in the paragraph :-)
>
> Kenn
>
> On Wed, Jun 6, 2018 at 9:24 AM Andrew Pilloud <ap...@google.com> wrote:
>
>> We are just about to the point of having a working pure SQL workflow for
>> Beam! One of the last things that remains is how to configure Pipeline
>> Options via a SQL shell. I have written up a proposal to use the set
>> statement, for example "SET runner=DataflowRunner". I'm looking for
>> feedback, particularly on what will make for the best user experience.
>> Please take a look and comment:
>>
>>
>> https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing
>>
>> Andrew
>>
>

Re: Beam SQL Pipeline Options

Posted by Kenneth Knowles <kl...@google.com>.
This is a nice short design discussion doc, and perhaps a cooler piece of
news hidden in the paragraph :-)

Kenn

On Wed, Jun 6, 2018 at 9:24 AM Andrew Pilloud <ap...@google.com> wrote:

> We are just about to the point of having a working pure SQL workflow for
> Beam! One of the last things that remains is how to configure Pipeline
> Options via a SQL shell. I have written up a proposal to use the set
> statement, for example "SET runner=DataflowRunner". I'm looking for
> feedback, particularly on what will make for the best user experience.
> Please take a look and comment:
>
>
> https://docs.google.com/document/d/1UTsSBuruJRfGnVOS9eXbQI6NauCD4WnSAPgA_Y0zjdk/edit?usp=sharing
>
> Andrew
>