Posted to issues@flink.apache.org by pnowojski <gi...@git.apache.org> on 2017/10/04 16:22:50 UTC

[GitHub] flink pull request #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

GitHub user pnowojski opened a pull request:

    https://github.com/apache/flink/pull/4775

    [FLINK-7739] Fix KafkaXXITCase tests stability

    ## What is the purpose of the change
    
    This change fixes the stability of the Kafka*ITCase tests. The main fix is excluding the `netty` dependency from ZooKeeper. The other two changes are probably just cosmetic.
    
    For more information, see the individual commit messages.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pnowojski/flink kafka-test2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4775.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4775
    
----
commit c7cc24d062aa233d86b68b7438c9a4e717003393
Author: Piotr Nowojski <pi...@gmail.com>
Date:   2017-09-29T16:23:29Z

    [FLINK-7739][kafka-tests] Set shorter heartbeats intervals
    
    The default pause value of 60 seconds is too large (tests would time out before Akka reacts).

commit 1677791f10153b9f7ecd552eac148d6ae3d056f1
Author: Piotr Nowojski <pi...@gmail.com>
Date:   2017-10-04T11:48:11Z

    [FLINK-7739][kafka-tests] Set restart delay to non zero
    
    Give TaskManagers some time to clean up before restarting a job.

commit 937c3fb388d9d7104b6336f59c3674bb70bfbf50
Author: Piotr Nowojski <pi...@gmail.com>
Date:   2017-10-04T14:50:57Z

    [FLINK-7739] Exclude netty dependency from zookeeper
    
    ZooKeeper was pulling in a conflicting Netty version. The conflict was
    extremely subtle: the TaskManager in the Kafka tests was deadlocking in
    some rare corner cases.

----
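
A conflict like the one described in the last commit can be hard to spot, because the class that wins on the classpath is chosen silently. As a rough diagnostic sketch (not part of this pull request), the following standalone program prints which JAR or directory a given class was loaded from. The class name `ClassOriginProbe` and the default probe target `org.jboss.netty.channel.Channel` are assumptions made for illustration; pass whichever class name is relevant to the suspected conflict.

```java
import java.security.CodeSource;

public class ClassOriginProbe {

    /**
     * Returns the location (JAR or directory URL) the named class was loaded
     * from, or a marker string if the class is missing or has no code source.
     */
    public static String locate(String className) {
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> clazz = Class.forName(
                    className, false, ClassOriginProbe.class.getClassLoader());
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself typically report no code source.
            if (source == null || source.getLocation() == null) {
                return "<no code source>";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // Hypothetical default; replace with the class under suspicion.
        String name = args.length > 0 ? args[0] : "org.jboss.netty.channel.Channel";
        System.out.println(name + " -> " + locate(name));
    }
}
```

Running it with the test classpath before and after adding an exclusion should show whether the class now resolves to a different JAR. It only reports the copy that was actually loaded, so it complements rather than replaces a dependency-tree inspection.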


---

[GitHub] flink pull request #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4775#discussion_r143254179
  
    --- Diff: flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaTestBase.java ---
    @@ -121,10 +122,12 @@ public static void shutDownServices() throws Exception {
     
     	protected static Configuration getFlinkConfiguration() {
     		Configuration flinkConfig = new Configuration();
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_PAUSE, "5 s");
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_INTERVAL, "1 s");
     		flinkConfig.setInteger(ConfigConstants.LOCAL_NUMBER_TASK_MANAGER, NUM_TMS);
     		flinkConfig.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, TM_SLOTS);
     		flinkConfig.setLong(TaskManagerOptions.MANAGED_MEMORY_SIZE, 16L);
    -		flinkConfig.setString(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY, "0 s");
    +		flinkConfig.setString(ConfigConstants.RESTART_STRATEGY_FIXED_DELAY_DELAY, "5 s");
    --- End diff --
    
    If we can avoid this, we will save time during testing.


---

[GitHub] flink pull request #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4775#discussion_r143162433
  
    --- Diff: flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaTestBase.java ---
    @@ -121,10 +122,13 @@ public static void shutDownServices() throws Exception {
     
     	protected static Configuration getFlinkConfiguration() {
     		Configuration flinkConfig = new Configuration();
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_PAUSE, "5 s");
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_INTERVAL, "1 s");
    +		flinkConfig.setBoolean(AkkaOptions.LOG_LIFECYCLE_EVENTS, true);
    --- End diff --
    
    Yes, sure. I forgot to drop it; it was only for debugging purposes.


---

[GitHub] flink pull request #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4775#discussion_r143149399
  
    --- Diff: flink-connectors/flink-connector-kafka-base/src/test/java/org/apache/flink/streaming/connectors/kafka/KafkaTestBase.java ---
    @@ -121,10 +122,13 @@ public static void shutDownServices() throws Exception {
     
     	protected static Configuration getFlinkConfiguration() {
     		Configuration flinkConfig = new Configuration();
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_PAUSE, "5 s");
    +		flinkConfig.setString(AkkaOptions.WATCH_HEARTBEAT_INTERVAL, "1 s");
    +		flinkConfig.setBoolean(AkkaOptions.LOG_LIFECYCLE_EVENTS, true);
    --- End diff --
    
    Can we omit this log?


---

[GitHub] flink issue #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/4775
  
    Thanks, will merge this without the added restart delay. If it is still unstable, we can add that back.


---

[GitHub] flink issue #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4775
  
    @StephanEwen thanks for merging. We can do that.


---

[GitHub] flink issue #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by pnowojski <gi...@git.apache.org>.
Github user pnowojski commented on the issue:

    https://github.com/apache/flink/pull/4775
  
    I have run ~500 Kafka09 tests on Travis, and the `TaskManager` was lost/no more resources problem is gone. However, in those 500 runs I have twice seen an `at-least-once` test failure (@tzulitai is looking into it).


---

[GitHub] flink pull request #4775: [FLINK-7739] Fix KafkaXXITCase tests stability

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/4775


---