Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/11/21 21:16:31 UTC

[GitHub] [incubator-hudi] leesf commented on a change in pull request #1039: [HUDI-340]: made max events to read from kafka source configurable

URL: https://github.com/apache/incubator-hudi/pull/1039#discussion_r349321525
 
 

 ##########
 File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
 ##########
 @@ -229,7 +229,9 @@ public KafkaOffsetGen(TypedProperties props) {
         new HashMap(ScalaHelpers.toJavaMap(cluster.getLatestLeaderOffsets(topicPartitions).right().get()));
 
     // Come up with final set of OffsetRanges to read (account for new partitions, limit number of events)
-    long numEvents = Math.min(DEFAULT_MAX_EVENTS_TO_READ, sourceLimit);
+    long maxEventsToReadFromKafka = props.getLong(Config.MAX_EVENTS_FROM_KAFKA_SOURCE_PROP,
+        Config.DEFAULT_MAX_EVENTS_FROM_KAFKA_SOURCE);
+    long numEvents = sourceLimit == Long.MAX_VALUE ? maxEventsToReadFromKafka : sourceLimit;
 
 Review comment:
   Should we also handle the case where `Config.MAX_EVENTS_FROM_KAFKA_SOURCE_PROP` is itself set to `Long.MAX_VALUE` in props? That would also scan the entire Kafka topic. If `sourceLimit` and `Config.MAX_EVENTS_FROM_KAFKA_SOURCE_PROP` are both set to `Long.MAX_VALUE`, we could simply fall back to `Config.DEFAULT_MAX_EVENTS_FROM_KAFKA_SOURCE`. WDYT? @pratyakshsharma @vinothchandar
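
   A minimal sketch of the fallback suggested above, not the actual change in the PR. The class `KafkaEventLimit`, the method `resolveNumEvents`, and the constant value are hypothetical stand-ins; in `KafkaOffsetGen` the default would come from `Config.DEFAULT_MAX_EVENTS_FROM_KAFKA_SOURCE` and the configured value from `props`.

```java
// Hypothetical helper for illustration only; not part of KafkaOffsetGen.
final class KafkaEventLimit {

    // Assumed stand-in for Config.DEFAULT_MAX_EVENTS_FROM_KAFKA_SOURCE; the value is illustrative.
    static final long DEFAULT_MAX_EVENTS = 5_000_000L;

    /**
     * Picks the number of events to read.
     * Long.MAX_VALUE is treated as "no limit given" for both inputs.
     */
    static long resolveNumEvents(long sourceLimit, long maxEventsFromProps) {
        // An explicit source limit always wins.
        if (sourceLimit != Long.MAX_VALUE) {
            return sourceLimit;
        }
        // Otherwise use the configured maximum, unless it is also unbounded.
        if (maxEventsFromProps != Long.MAX_VALUE) {
            return maxEventsFromProps;
        }
        // Both are unbounded: fall back to the default so the whole topic is not scanned.
        return DEFAULT_MAX_EVENTS;
    }
}
```

   With this shape, `resolveNumEvents(Long.MAX_VALUE, Long.MAX_VALUE)` yields the default rather than an unbounded read, while an explicit `sourceLimit` still takes precedence as in the current diff.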
