Posted to commits@hudi.apache.org by "Pratyaksh Sharma (Jira)" <ji...@apache.org> on 2019/11/17 13:16:00 UTC

[jira] [Commented] (HUDI-340) Increase Default max events to read from kafka source

    [ https://issues.apache.org/jira/browse/HUDI-340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976027#comment-16976027 ] 

Pratyaksh Sharma commented on HUDI-340:
---------------------------------------

[~vinothchandar] As per the discussion, I will make the upper cap configurable. That will give users the option to tune memory and other parameters according to the size of the RDD they want. Enforcing a hard upper cap from our end looks questionable to me.

Code-wise, this just means changing Math.min to Math.max on the following line of the getNextOffsetRanges method in KafkaOffsetGen.java:

long numEvents = Math.min(DEFAULT_MAX_EVENTS_TO_READ, sourceLimit);
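
To make the effect of that one-word change concrete, here is a minimal standalone sketch. It is not the actual KafkaOffsetGen code: the class name, method names, and the sample sourceLimit values are made up for illustration, and the 1M constant is the value quoted in the issue description.

```java
public class EventCapSketch {
    // 1M, as stated in the issue description; the real constant lives in KafkaOffsetGen.
    static final long DEFAULT_MAX_EVENTS_TO_READ = 1_000_000L;

    // Current behaviour: the default acts as a hard upper cap on the batch size.
    public static long numEventsWithMin(long sourceLimit) {
        return Math.min(DEFAULT_MAX_EVENTS_TO_READ, sourceLimit);
    }

    // Proposed behaviour: a sourceLimit above the default is honoured as-is.
    public static long numEventsWithMax(long sourceLimit) {
        return Math.max(DEFAULT_MAX_EVENTS_TO_READ, sourceLimit);
    }

    public static void main(String[] args) {
        long sourceLimit = 5_000_000L; // hypothetical user-supplied limit
        System.out.println(numEventsWithMin(sourceLimit)); // prints 1000000
        System.out.println(numEventsWithMax(sourceLimit)); // prints 5000000
    }
}
```

One consequence worth noting: with Math.max, a sourceLimit below the default (say 500,000) is raised to 1M, so the default becomes a lower bound rather than an upper one.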

WDYT?

> Increase Default max events to read from kafka source
> -----------------------------------------------------
>
>                 Key: HUDI-340
>                 URL: https://issues.apache.org/jira/browse/HUDI-340
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: deltastreamer
>            Reporter: Pratyaksh Sharma
>            Assignee: Pratyaksh Sharma
>            Priority: Major
>
> Right now, DEFAULT_MAX_EVENTS_TO_READ is set to 1M for the Kafka source in KafkaOffsetGen.java. DeltaStreamer can handle many more incoming records than this.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)