Posted to dev@flink.apache.org by Lijie Wang <wa...@gmail.com> on 2022/07/01 02:24:56 UTC

Re: Re: [DISCUSS] FLIP-239: Port JDBC Connector Source to FLIP-27

Hi Roc,

Thanks for driving the discussion.

Could you describe in detail what a JdbcSourceSplit represents? The Javadoc
comment on JdbcSourceSplit in the FLIP looks wrong: it describes the class
as "A {@link SourceSplit} that represents a file, or a region of a
file....".
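
To make the question concrete, here is a rough sketch of what I would guess
a JDBC split could carry: one parameterized query plus the values bound to
it, one split per parameter row. Everything in it is an assumption for
illustration only and is not taken from the FLIP: the names
JdbcSourceSplitSketch and JdbcSplitSketchDemo, the fields, and the local
SourceSplit interface, which only stands in for Flink's interface so the
snippet compiles on its own.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for Flink's SourceSplit, so the sketch compiles without Flink.
interface SourceSplit {
    String splitId();
}

// Hypothetical shape (not from the FLIP): a parameterized query plus its bound values.
final class JdbcSourceSplitSketch implements SourceSplit, Serializable {
    private final String id;
    private final String sqlTemplate;
    private final Serializable[] parameters;

    JdbcSourceSplitSketch(String id, String sqlTemplate, Serializable[] parameters) {
        this.id = id;
        this.sqlTemplate = sqlTemplate;
        // Defensive copy so later changes to the caller's array do not leak in.
        this.parameters = parameters.clone();
    }

    @Override
    public String splitId() {
        return id;
    }

    String sqlTemplate() {
        return sqlTemplate;
    }

    Serializable[] parameters() {
        return parameters.clone();
    }
}

// Hypothetical helper: builds one split per parameter row (for example, one per id range).
final class JdbcSplitSketchDemo {
    static List<JdbcSourceSplitSketch> splitsFor(String sqlTemplate, Serializable[][] rows) {
        List<JdbcSourceSplitSketch> splits = new ArrayList<>();
        for (int i = 0; i < rows.length; i++) {
            splits.add(new JdbcSourceSplitSketch("split-" + i, sqlTemplate, rows[i]));
        }
        return splits;
    }
}
```

If the intended meaning is different (for example, a split per table or per
connection), it would help to state that in the Javadoc instead.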

Best,
Lijie


Roc Marshal <fl...@126.com> wrote on Thu, Jun 30, 2022 at 21:41:

> Hi, Boto.
>     Thanks for your reply.
>
>    +1 from me on defining a watermark strategy for the 'streaming' and table
> source. I'm not sure whether FLIP-202[1] warrants a separate discussion,
> but I think your proposal is very helpful for the new source. It would be
> great if the new source could be compatible with this abstraction.
>
>    In addition, do we need to support the following special bounded
> scenario?
>    The number of JdbcSourceSplits is fixed, but in a user-defined
> implementation the time needed to generate all of them is not. Once the
> condition that ends split generation is met, no further JdbcSourceSplits
> are generated. After all JdbcSourceSplits have been processed, the
> JdbcSourceSplitEnumerator notifies the reader that there are no more
> splits.
>
> - [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-202%3A+Introduce+ClickHouse+Connector
>
> Best regards,
> Roc Marshal
>
> On 2022/06/30 09:02:23 João Boto wrote:
> > Hi,
> >
> > On the source side, we could improve JdbcParameterValuesProvider so that
> > it can be defined by one or more queries, or by something more dynamic.
> > Most of the time, if your job is dynamic or depends on some condition
> > being met (based on data in a table), you have to create a connection and
> > fetch that information from the database.
> >
> > If we are going to create/allow a "streaming" JDBC source, we should be
> > able to define a watermark and fetch new data from the table using that
> > watermark.
> >
> >
> > For the sink (though it could apply to the source as well), it would be
> > great to be able to set your own implementation of the connection type.
> > For example, if you are connecting to ClickHouse, you could set an
> > implementation based on "BalancedClickhouseDataSource" (this[1]
> > implementation has an example), or set an extended version of an
> > implementation for debugging purposes.
> >
> > Regards
> >
> >
> > [1] https://github.com/apache/flink/pull/20097/files#diff-8b36e3403381dc14c748aeb5de0b4ceb7d7daec39594b1eacff1694b5266419d
> >
> > On 2022/06/27 13:09:51 Roc Marshal wrote:
> > > Hi, all,
> > >
> > >
> > >
> > >
> > > > I would like to open a discussion on porting the JDBC Source to the
> > > > new Source API (FLIP-27[1]).
> > >
> > > > Martijn Visser, Jing Ge and I had a preliminary discussion in JIRA
> > > > FLINK-25420[2] and planned to start with the discussion of the source
> > > > part.
> > >
> > >
> > >
> > > Please let me know:
> > >
> > > > - Any issues you have encountered with the old JDBC source;
> > > > - Any new features or design changes you would like;
> > > > - Any other suggestions...
> > >
> > >
> > >
> > > > You can find more details in FLIP-239[3].
> > >
> > > Looking forward to your feedback.
> > >
> > >
> > >
> > >
> > > [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > >
> > > [2] https://issues.apache.org/jira/browse/FLINK-25420
> > >
> > > [3]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=217386271
> > >
> > >
> > >
> > >
> > > Best regards,
> > >
> > > Roc Marshal
> >