Posted to commits@hudi.apache.org by "leesf (Jira)" <ji...@apache.org> on 2019/10/11 15:33:00 UTC

[jira] [Closed] (HUDI-292) Consume more entries from kafka than specified sourceLimit.

     [ https://issues.apache.org/jira/browse/HUDI-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

leesf closed HUDI-292.
----------------------
    Resolution: Fixed

Fixed via master: e10e06918e4758917513c55f9bc02c35dad99128

> Consume more entries from kafka than specified sourceLimit.
> -----------------------------------------------------------
>
>                 Key: HUDI-292
>                 URL: https://issues.apache.org/jira/browse/HUDI-292
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Utilities
>            Reporter: leesf
>            Assignee: leesf
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.5.1
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> _CheckpointUtils#computeOffsetRanges_ can return more entries than requested when computing the offset ranges for consuming Kafka messages.
> Given 
> topic = "test",
> fromOffsets(partition -> offset pair) = (0 -> 0), (1 -> 0), (2 -> 0), (3 -> 0), (4 -> 0),
> toOffsets(partition -> offset pair) = (0 -> 100), (1 -> 1000), (2 -> 1000), (3 -> 1000), (4 -> 1000),
> numEvents = 1001.
> The output of _CheckpointUtils#computeOffsetRanges_ is:
> OffsetRange(topic: 'test', partition: 0, range: [0 -> 100])
> OffsetRange(topic: 'test', partition: 1, range: [0 -> 226])
> OffsetRange(topic: 'test', partition: 2, range: [0 -> 226])
> OffsetRange(topic: 'test', partition: 3, range: [0 -> 226])
> OffsetRange(topic: 'test', partition: 4, range: [0 -> 226])
> The total count is 1004 (100 + 226 * 4), which exceeds 1001, so more entries are consumed from Kafka than the specified sourceLimit of 1001.
> CC [~vinoth]
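
The arithmetic above can be reproduced with a small sketch. This is a hypothetical reconstruction in Python, not the Hudi source: it assumes events are spread across partitions round by round, using a rounded-up per-partition share of the remaining budget. The function name `allocate` and the `cap_to_budget` flag are illustrative only. Under that assumption, the rounding up in the second round (ceil(97 / 4) = 25 per partition) is what pushes the total past the limit, and clamping each step to the remaining budget is one possible way to avoid it.

```python
import math


def allocate(from_offsets, to_offsets, num_events, cap_to_budget):
    """Return partition -> end offset after spreading num_events over partitions.

    Hypothetical reconstruction for illustration; not Hudi's implementation.
    """
    until = dict(from_offsets)
    allocated = 0
    exhausted = set()
    while allocated < num_events and len(exhausted) < len(to_offsets):
        remaining = num_events - allocated
        active = len(to_offsets) - len(exhausted)
        # Rounded-up share of the remaining budget for each active partition.
        per_partition = math.ceil(remaining / active)
        for p in sorted(to_offsets):
            if p in exhausted:
                continue
            # Never read past the partition's latest available offset.
            step = min(per_partition, to_offsets[p] - until[p])
            if cap_to_budget:
                # Assumed fix: never hand out more than the budget still left.
                step = min(step, num_events - allocated)
            until[p] += step
            allocated += step
            if until[p] == to_offsets[p]:
                exhausted.add(p)
    return until


from_offsets = {0: 0, 1: 0, 2: 0, 3: 0, 4: 0}
to_offsets = {0: 100, 1: 1000, 2: 1000, 3: 1000, 4: 1000}

uncapped = allocate(from_offsets, to_offsets, 1001, cap_to_budget=False)
capped = allocate(from_offsets, to_offsets, 1001, cap_to_budget=True)
print(uncapped, sum(uncapped.values()))
print(capped, sum(capped.values()))
```

With the inputs from the report, the uncapped variant ends at offsets 100, 226, 226, 226, 226 (1004 events in total), while the capped variant stops at exactly 1001 by giving the last partition a smaller share.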



--
This message was sent by Atlassian Jira
(v8.3.4#803005)