Posted to commits@hudi.apache.org by "wangxianghu (Jira)" <ji...@apache.org> on 2020/06/19 08:24:00 UTC

[jira] [Commented] (HUDI-340) Increase Default max events to read from kafka source

    [ https://issues.apache.org/jira/browse/HUDI-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140333#comment-17140333 ] 

wangxianghu commented on HUDI-340:
----------------------------------

Hi [~Pratyaksh], I'm confused here: what is the purpose of this check? Is there anything I missed? :)
{code:java}
// org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java:205
maxEventsToReadFromKafka = (maxEventsToReadFromKafka == Long.MAX_VALUE || maxEventsToReadFromKafka == Integer.MAX_VALUE)
    ? Config.maxEventsFromKafkaSource : maxEventsToReadFromKafka;
{code}
If we set Config.MAX_EVENTS_FROM_KAFKA_SOURCE_PROP = Long.MAX_VALUE - 1, then we get maxEventsToReadFromKafka = Long.MAX_VALUE - 1, which is huge but still configurable.
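
To make the point concrete, here is a minimal stand-alone sketch of the same check. The class name, method name, and default value below are hypothetical and chosen only for illustration; they are not taken from KafkaOffsetGen. It shows that only the two exact sentinel values fall back to the default, while a value one below Long.MAX_VALUE passes through unchanged.

```java
// Hypothetical stand-alone sketch; not the actual KafkaOffsetGen code.
class MaxEventsCheckSketch {

    // Assumed default, for illustration only.
    static final long ASSUMED_DEFAULT_MAX_EVENTS = 5_000_000L;

    // Mirrors the quoted check: only Long.MAX_VALUE and Integer.MAX_VALUE
    // are replaced by the default; every other value is returned as-is.
    static long resolveMaxEvents(long requested, long defaultMaxEvents) {
        return (requested == Long.MAX_VALUE || requested == Integer.MAX_VALUE)
            ? defaultMaxEvents
            : requested;
    }

    public static void main(String[] args) {
        // Both sentinel values fall back to the assumed default.
        System.out.println(resolveMaxEvents(Long.MAX_VALUE, ASSUMED_DEFAULT_MAX_EVENTS));
        System.out.println(resolveMaxEvents(Integer.MAX_VALUE, ASSUMED_DEFAULT_MAX_EVENTS));
        // One less than Long.MAX_VALUE is not caught by the check.
        System.out.println(resolveMaxEvents(Long.MAX_VALUE - 1, ASSUMED_DEFAULT_MAX_EVENTS));
    }
}
```

So if the intent is to guard against unreasonably large limits, an equality test on two specific values does not seem to achieve it; a range comparison might be closer to that intent, but that is only my guess at the purpose.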

> Increase Default max events to read from kafka source
> -----------------------------------------------------
>
>                 Key: HUDI-340
>                 URL: https://issues.apache.org/jira/browse/HUDI-340
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: DeltaStreamer
>            Reporter: Pratyaksh Sharma
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, DEFAULT_MAX_EVENTS_TO_READ is set to 1M for the Kafka source in the KafkaOffsetGen.java class. DeltaStreamer can handle many more incoming records than this.
>  
>  


