Posted to dev@flink.apache.org by Ran Tao <ch...@gmail.com> on 2022/12/08 09:26:17 UTC

[DISCUSS] Hybrid Source Connector

Hi guys. HybridSource is a good feature, but the released versions still
do not support it in the Table & SQL API, and this has been the case for a long time.

I notice that there is a related ticket here:
https://issues.apache.org/jira/browse/FLINK-22793
but progress is slow. I wonder whether we can push this feature forward.

I have written a FLIP for discussion and look forward to your comments.
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=235836225

-- 
Best Regards,
Ran Tao
https://github.com/chucheng92

Re: [DISCUSS] Hybrid Source Connector

Posted by Timo Walther <tw...@apache.org>.
Hi Ran,

Thanks for proposing a FLIP. Btw according to the process, the subject 
of this email should be `[DISCUSS] FLIP-278: Hybrid Source Connector` so 
that people can identify this discussion as a FLIP discussion.

Supporting the hybrid source for SQL was a long-standing issue on our 
roadmap. Happy to give feedback here:

1) Options

Coming up with stable long-term options should be a shared effort. 
Having an index as a key could cause unintended side effects if the 
index is not correctly chosen. I would suggest we use IDs instead.

What do you think about the following structure?

CREATE TABLE ... WITH (
   'sources'='historical;realtime',   -- Config option of type string list
   'historical.connector' = 'filesystem',
   'historical.path' = '/tmp/a.csv',
   'historical.format' = 'csv',
   'realtime.path' = '/tmp/b.csv',
   'realtime.format' = 'csv'
)

I would limit the IDs to simple [a-z0-9_] identifiers. Once we support 
metadata columns, we can also propagate these IDs easily.
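
To make the ID-based structure concrete, here is a minimal sketch of how a 
factory could split such options per source ID. It assumes the `sources` 
value is separated by `;` and that child options use an `<id>.` prefix, as 
in the example above. The class `HybridOptionsSketch` and its methods are 
hypothetical names for illustration only; they are not part of Flink or of 
the FLIP.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not an existing Flink class. */
class HybridOptionsSketch {

    // Source IDs are limited to simple [a-z0-9_] identifiers.
    private static final Pattern SOURCE_ID = Pattern.compile("[a-z0-9_]+");

    /** Splits the 'sources' option on ';' and validates every ID. */
    static List<String> parseSourceIds(Map<String, String> options) {
        String raw = options.get("sources");
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required option 'sources'");
        }
        List<String> ids = new ArrayList<>();
        for (String part : raw.split(";")) {
            String id = part.trim();
            if (!SOURCE_ID.matcher(id).matches()) {
                throw new IllegalArgumentException("Invalid source ID: '" + id + "'");
            }
            if (ids.contains(id)) {
                throw new IllegalArgumentException("Duplicate source ID: '" + id + "'");
            }
            ids.add(id);
        }
        return ids;
    }

    /** Groups '<id>.<key>' options by source ID, keeping the declared order. */
    static Map<String, Map<String, String>> groupBySource(Map<String, String> options) {
        Map<String, Map<String, String>> grouped = new LinkedHashMap<>();
        for (String id : parseSourceIds(options)) {
            String prefix = id + ".";
            Map<String, String> child = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : options.entrySet()) {
                if (e.getKey().startsWith(prefix)) {
                    // Strip the '<id>.' prefix so the child sees plain option keys.
                    child.put(e.getKey().substring(prefix.length()), e.getValue());
                }
            }
            grouped.put(id, child);
        }
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("sources", "historical;realtime");
        options.put("historical.connector", "filesystem");
        options.put("historical.path", "/tmp/a.csv");
        options.put("historical.format", "csv");
        options.put("realtime.path", "/tmp/b.csv");
        options.put("realtime.format", "csv");
        System.out.println(groupBySource(options));
    }
}
```

Note that this sketch silently ignores any key whose prefix matches no 
declared ID (for example a misspelled one), so a real implementation would 
probably want to reject such keys during validation.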

2) Schema field mappings

The FLIP mentions `schema-field-mappings`. Could you elaborate on this in 
the document?

3) Start position strategies

Have you thought about how we can represent start position strategies? 
The FLIP is very minimal but it would be nice to at least hear some 
opinions on this topic. Maybe we can come up with some general strategy 
that makes the most common use case possible in the near future.

Thanks,
Timo


