Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2019/12/02 18:53:00 UTC

[jira] [Commented] (HUDI-288) Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment

    [ https://issues.apache.org/jira/browse/HUDI-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986264#comment-16986264 ] 

Vinoth Chandar commented on HUDI-288:
-------------------------------------

> So keeping <topic name> in the target path looks a bit skeptical to me because a topic name might not necessarily include table name

I agree. I was only suggesting a sane default; we should let the user override it as needed, using a TableConfig-like mechanism. Otherwise, defaulting to table_name = topic_name seems acceptable to me. At Uber at least, it was very useful for auto-creating Hudi datasets from newly added Kafka topics, for example.
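
A rough sketch of the default-with-override idea, purely for illustration. TableNameResolver and its override map are hypothetical names, not existing Hudi classes or configs; the only behavior shown is "use the override if one is configured, else fall back to the topic name".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hudi: picks the target table name for a Kafka topic.
final class TableNameResolver {

    // topic name -> table name, supplied by the user (assumed config shape)
    private final Map<String, String> overrides;

    TableNameResolver(Map<String, String> overrides) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.overrides = new HashMap<>(overrides);
    }

    /** Returns the user-supplied table name if present, otherwise the topic name itself. */
    String tableNameFor(String topic) {
        return overrides.getOrDefault(topic, topic);
    }
}
```

With an override of "trips_v2_raw" -> "trips", that topic would land in table "trips", while a newly added topic with no override would get a table named after the topic.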

> then it will keep on running for the first table itself and will never pick up the next table

Yes, you need a thread per DeltaSync instance. Supporting continuous mode would be good for k8s deployments, where cluster setup and teardown are costly affairs. Continuous mode solves the problem of managing compactions for MOR. For COW, running without continuous mode could be sufficient. We can phase this in slowly as well.
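
To make the thread-per-instance point concrete, here is a minimal sketch under stated assumptions. MultiTableRunner and TableSync are hypothetical stand-ins (TableSync plays the role of one table's DeltaSync round); this is not how DeltaStreamer is implemented, only an illustration of why a continuous loop for one table does not block the next table when each loop has its own thread.

```java
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not a Hudi class: one worker thread per table.
final class MultiTableRunner {

    /** Stand-in for one round of ingestion for a single table. */
    interface TableSync {
        void syncOnce() throws Exception;
    }

    private final Map<String, TableSync> syncs;
    private final ExecutorService pool;
    private volatile boolean running = true;

    MultiTableRunner(Map<String, TableSync> syncs) {
        this.syncs = syncs;
        // One thread per table, so no continuous loop can starve another.
        this.pool = Executors.newFixedThreadPool(Math.max(1, syncs.size()));
    }

    /** Starts a continuous sync loop for every table. */
    void start() {
        for (Map.Entry<String, TableSync> e : syncs.entrySet()) {
            pool.submit(() -> {
                while (running) {
                    try {
                        e.getValue().syncOnce();
                    } catch (Exception ex) {
                        // A failure ends only this table's loop; the others keep running.
                        System.err.println("sync failed for " + e.getKey() + ": " + ex);
                        return;
                    }
                }
            });
        }
    }

    /** Signals all loops to finish their current round and waits for them to exit. */
    boolean stop(long timeoutMillis) {
        running = false;
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A non-continuous (run-once) mode would simply call syncOnce() a single time per table instead of looping, which may be enough for COW tables.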

 

So, who's going to drive this? :) We should also give this tool a cool name :D

> Add support for ingesting multiple kafka streams in a single DeltaStreamer deployment
> -------------------------------------------------------------------------------------
>
>                 Key: HUDI-288
>                 URL: https://issues.apache.org/jira/browse/HUDI-288
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: deltastreamer
>            Reporter: Vinoth Chandar
>            Assignee: leesf
>            Priority: Major
>
> https://lists.apache.org/thread.html/3a69934657c48b1c0d85cba223d69cb18e18cd8aaa4817c9fd72cef6@<dev.hudi.apache.org> has all the context



--
This message was sent by Atlassian Jira
(v8.3.4#803005)