You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/05/16 07:40:04 UTC

[jira] [Commented] (FLINK-6352) FlinkKafkaConsumer should support to use timestamp to set up start offset

    [ https://issues.apache.org/jira/browse/FLINK-6352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011918#comment-16011918 ] 

ASF GitHub Bot commented on FLINK-6352:
---------------------------------------

GitHub user zjureel opened a pull request:

    https://github.com/apache/flink/pull/3915

    [FLINK-6352] Support to use timestamp to set the initial offset of kafka

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration.
    If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html).
    In addition to going through the list, please provide a meaningful description of your changes.
    
    - [ ] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    
    - [ ] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/zjureel/flink FLINK-6352

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3915
    
----
commit e1f5aee8a471ef1f1e8cec3104807b22954b6a42
Author: zjureel <zj...@gmail.com>
Date:   2017-05-15T10:27:24Z

    [FLINK-6352] Support to use timestamp to set the initial offset of kafka

commit 5d482c57ad19f0f9739fe5b40fe6e8713900e8a4
Author: zjureel <zj...@gmail.com>
Date:   2017-05-16T07:37:09Z

    fix StreamExecutionEnvironment test

----


> FlinkKafkaConsumer should support to use timestamp to set up start offset
> -------------------------------------------------------------------------
>
>                 Key: FLINK-6352
>                 URL: https://issues.apache.org/jira/browse/FLINK-6352
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kafka Connector
>            Reporter: Fang Yong
>            Assignee: Fang Yong
>             Fix For: 1.3.0
>
>
>     Currently "auto.offset.reset" is used to initialize the start offset of FlinkKafkaConsumer, and the value should be earliest/latest/none. This method can only let the job comsume the beginning or the most recent data, but can not specify the specific offset of Kafka began to consume. 
>     So, there should be a configuration item (such as "flink.source.start.time" and the format is "yyyy-MM-dd HH:mm:ss") that allows user to configure the initial offset of Kafka. The action of "flink.source.start.time" is as follows:
> 1) job start from checkpoint / savepoint
>   a> offset of partition can be restored from checkpoint/savepoint,  "flink.source.start.time" will be ignored.
>   b> there's no checkpoint/savepoint for the partition (For example, this partition is newly increased), the "flink.kafka.start.time" will be used to initialize the offset of the partition    
> 2) job has no checkpoint / savepoint, the "flink.source.start.time" is used to initialize the offset of the kafka
>   a> the "flink.source.start.time" is valid, use it to set the offset of kafka
>   b> the "flink.source.start.time" is out-of-range, the same as it does currently with no initial offset, get kafka's current offset and start reading



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)