Posted to commits@hudi.apache.org by "Vinoth Chandar (Jira)" <ji...@apache.org> on 2019/11/18 18:59:00 UTC

[jira] [Commented] (HUDI-184) Integrate Hudi with Apache Flink

    [ https://issues.apache.org/jira/browse/HUDI-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976791#comment-16976791 ] 

Vinoth Chandar commented on HUDI-184:
-------------------------------------

[~yanghua] I have been thinking a bit about what we could do to unlock progress with Flink... There are two open items right now:

1) Index

2) Size estimations

I wonder if we can start with a simpler model first, i.e. use joins for the index, and see if we can make the functionality work correctly without `WorkloadProfile`.
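
For illustration only, a rough sketch of the join idea (plain Flink DataSet API with made-up tuple shapes; none of this is existing Hudi or Hudi-on-Flink code): incoming records are left-outer-joined against the existing key-to-fileId mapping, and records that find no match are treated as inserts.

import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;

public class JoinBasedIndexSketch {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Incoming upserts: (recordKey, payload). Shapes are illustrative only.
        DataSet<Tuple2<String, String>> incoming = env.fromElements(
                Tuple2.of("key1", "payload-a"),
                Tuple2.of("key2", "payload-b"),
                Tuple2.of("key3", "payload-c"));

        // Existing index extracted from the table: (recordKey, fileId).
        DataSet<Tuple2<String, String>> keyToFileId = env.fromElements(
                Tuple2.of("key1", "file-001"),
                Tuple2.of("key3", "file-007"));

        // Left outer join tags each incoming record with its current file location;
        // a null right side means the key was never written, i.e. an insert.
        DataSet<Tuple3<String, String, String>> tagged = incoming
                .leftOuterJoin(keyToFileId)
                .where(0)
                .equalTo(0)
                .with(new JoinFunction<Tuple2<String, String>, Tuple2<String, String>,
                        Tuple3<String, String, String>>() {
                    @Override
                    public Tuple3<String, String, String> join(Tuple2<String, String> record,
                                                               Tuple2<String, String> location) {
                        String fileId = (location == null) ? null : location.f1;
                        return Tuple3.of(record.f0, record.f1, fileId);
                    }
                });

        tagged.print();
    }
}

A null fileId in the output would mark an insert, a non-null one an update into that file group; this sidesteps the Spark-specific index lookup entirely, and how well it scales would depend on how the key-to-fileId extract is produced.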

Also, there are probably some bigger questions to answer first, e.g. if we are targeting the streaming APIs, then what is the execution model? In Spark Streaming, we commit after each micro batch. When do we commit for Flink writing?
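
One possible answer, purely as a hedged sketch (the sink below is a placeholder and makes no real Hudi calls), is to treat each completed Flink checkpoint as the commit boundary, the way a micro batch is in Spark Streaming: flush inflight data when the checkpoint barrier arrives, and publish the commit once the checkpoint is acknowledged.

import org.apache.flink.runtime.state.CheckpointListener;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only: buffer incoming records and use the completed Flink
 * checkpoint as the commit boundary. No real Hudi write/commit calls are made here.
 */
public class CheckpointCommitSink extends RichSinkFunction<String>
        implements CheckpointedFunction, CheckpointListener {

    private final List<String> buffer = new ArrayList<>();

    @Override
    public void invoke(String record, Context context) {
        // Accumulate records between checkpoints (in practice this would write
        // data files and keep them "inflight" rather than hold records in memory).
        buffer.add(record);
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) {
        // Checkpoint barrier reached: flush buffered records to inflight files here,
        // so they are durable before the checkpoint is acknowledged.
    }

    @Override
    public void initializeState(FunctionInitializationContext context) {
        // On restore, roll back or finish any inflight commit left by a previous attempt.
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        // All operators acknowledged the checkpoint: publish the commit here,
        // analogous to committing after a Spark Streaming micro batch.
        buffer.clear();
    }
}

Flink's TwoPhaseCommitSinkFunction encapsulates the same pre-commit/commit pattern and could be another starting point for this discussion.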

 

> Integrate Hudi with Apache Flink
> --------------------------------
>
>                 Key: HUDI-184
>                 URL: https://issues.apache.org/jira/browse/HUDI-184
>             Project: Apache Hudi (incubating)
>          Issue Type: New Feature
>          Components: Write Client
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Major
>
> Apache Flink is a popular stream processing engine.
> Integrating Hudi with Flink would be valuable work.
> The discussion mailing thread is here: [https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)