Posted to commits@hudi.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/09/03 07:08:00 UTC

[jira] [Commented] (HUDI-2394) [Kafka Connect Milestone 1] Implement Kafka Connect for immutable data

    [ https://issues.apache.org/jira/browse/HUDI-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17409315#comment-17409315 ] 

ASF GitHub Bot commented on HUDI-2394:
--------------------------------------

rmahindra123 opened a new pull request #3592:
URL: https://github.com/apache/hudi/pull/3592


   ## What is the purpose of the pull request
   
   Implement the Kafka Sink protocol for Hudi to ingest immutable data. This PR lets Kafka Connect users ingest Kafka Avro/JSON string records into Hudi tables within the Kafka Connect framework, without a Spark engine.
   
   Currently, we use the HoodieJavaWriteClient's bulk insert support to insert append-only data (CoW). We use file ID indexing so that multiple writers per Kafka partition can write to the same Hudi partition path concurrently without locks.
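
The lock-free claim above can be illustrated with a minimal sketch. It is not the code in this PR: the class `FileIdSketch`, the method `fileIdFor`, and the `kp%04d-` prefix format are hypothetical names chosen for illustration, and the sketch assumes one writer per Kafka partition. The idea is that if each writer derives its file ID deterministically from its Kafka partition number, two writers targeting the same Hudi partition path never touch the same file, so no lock is needed.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration only; not the hudi-kafka-connect implementation.
final class FileIdSketch {

    // Builds a file ID that is stable for a (Hudi partition path, Kafka partition) pair.
    // The Kafka partition number is embedded as a prefix, so IDs produced for
    // different Kafka partitions differ by construction.
    static String fileIdFor(String hudiPartitionPath, int kafkaPartition) {
        UUID pathHash = UUID.nameUUIDFromBytes(
                hudiPartitionPath.getBytes(StandardCharsets.UTF_8));
        return String.format("kp%04d-%s", kafkaPartition, pathHash);
    }

    public static void main(String[] args) {
        // Four writers (assumed: one per Kafka partition) target the same Hudi partition path.
        Set<String> ids = new HashSet<>();
        for (int kafkaPartition = 0; kafkaPartition < 4; kafkaPartition++) {
            ids.add(fileIdFor("2021/09/03", kafkaPartition));
        }
        // Every writer received a distinct file ID.
        System.out.println(ids.size() == 4);
    }
}
```

Because the ID depends only on its inputs, a writer that restarts regenerates the same file ID for the same Kafka partition and partition path without coordinating with other writers.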
   
   ## Brief change log
   
   1. The Kafka Connect protocol is implemented in a new package, hudi-kafka-connect.
   2. A few code changes to integrate support for bulk insert with HoodieJavaWriteClient. 
   
   ## Verify this pull request
   
   1. Wrote unit tests for the key Coordinator <-> Participants interaction.
   2. Tested with the Kafka console connect in distributed mode, following the instructions in README.md.
   
   



> [Kafka Connect Milestone 1] Implement Kafka Connect for immutable data
> ----------------------------------------------------------------------
>
>                 Key: HUDI-2394
>                 URL: https://issues.apache.org/jira/browse/HUDI-2394
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Rajesh Mahindra
>            Priority: Major
>
> Implement Kafka Connect for immutable data using bulk inserts.
