Posted to user@flink.apache.org by Mohit Anchlia <mo...@gmail.com> on 2017/09/26 19:04:37 UTC

JDBC table source

We are looking to stream data from a database. Is there already a JDBC
table source available for streaming?

Re: JDBC table source

Posted by Fabian Hueske <fh...@gmail.com>.
Right, there is no built-in TableSource for that.
However, it is certainly possible to implement a custom TableSource for
your use case. The code of the JdbcInputFormat should be a good starting
point. For example, you could run a query every n seconds (assuming you can
consume the data of the last n seconds in n seconds). If you want to run the
TableSource in parallel, you would need to partition the query (as for the
JdbcInputFormat).
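
To make the polling and partitioning idea a bit more concrete, here is a
minimal sketch in plain Java. It is not Flink code and not an existing
connector. The class PollingPlan, its methods, and the table and column
names (events, id, payload, updated_at) are all made up for illustration.
It only shows how one polling round could be cut into half-open ranges over
the indexed column, so that each parallel instance binds its own range to
the two placeholders of the same parameterized query.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that plans the queries of one polling round. */
final class PollingPlan {

    // Assumed table and column names; updated_at is the indexed column.
    static final String QUERY = "SELECT id, payload, updated_at FROM events "
            + "WHERE updated_at >= ? AND updated_at < ?";

    /** Half-open range [fromInclusive, toExclusive) over the indexed column. */
    static final class Range {
        final long fromInclusive;
        final long toExclusive;

        Range(long fromInclusive, long toExclusive) {
            this.fromInclusive = fromInclusive;
            this.toExclusive = toExclusive;
        }
    }

    /**
     * The next window starts where the previous one ended, so no row is
     * read twice or skipped. If the clock went backwards the window is empty.
     */
    static Range nextWindow(long previousUpperBound, long nowMillis) {
        return new Range(previousUpperBound, Math.max(previousUpperBound, nowMillis));
    }

    /**
     * Splits a window into `parallelism` contiguous ranges of near-equal size.
     * Ranges may be empty when the window is narrower than the parallelism.
     */
    static List<Range> split(Range window, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1");
        }
        long span = window.toExclusive - window.fromInclusive;
        List<Range> ranges = new ArrayList<>();
        long start = window.fromInclusive;
        for (int i = 0; i < parallelism; i++) {
            // The first (span % parallelism) ranges get one extra unit.
            long size = span / parallelism + (i < span % parallelism ? 1 : 0);
            ranges.add(new Range(start, start + size));
            start += size;
        }
        return ranges;
    }
}
```

Every n seconds a source would call nextWindow with the upper bound of the
previous round, split the result by its parallelism, and run QUERY once per
range. Half-open ranges are used so that a row sitting exactly on a boundary
belongs to exactly one range.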

Re: JDBC table source

Posted by Mohit Anchlia <mo...@gmail.com>.
Thanks. Idea was to query for 'x' records in last 'n' seconds using an
indexed column. Looks like that is not possible?

Re: JDBC table source

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Mohit,

no, a JdbcTableSource does not exist yet. However, since there is a
JdbcInputFormat, it should not be hard to wrap that in a TableSource.
This would rather be a batch TableSource, though, in the sense that it
would just return the data that the query returns. Once all data is read, it
would terminate. You can of course wrap the JdbcInputFormat in a
StreamingTableSource, but as I said, it would terminate when all data was
read.

If you are thinking of streaming a changelog stream from a database to the
Table API / SQL, this would not be possible at the moment due to limitations
in the Table API / SQL (these will be removed in the future).
Moreover, only a few DBMSs (PostgreSQL is one of them) expose their changelog,
and there is no common interface for it comparable to JDBC. Instead, they use
custom formats. There is a tool called Bottled Water that ingests PostgreSQL
changelog streams into Kafka.

So, to make a long story short: implementing a JDBC TableSource for batch
queries should be fairly easy. A true streaming solution that hooks into the
changelog stream of a table is not possible at the moment.

Cheers, Fabian
