Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/01/13 18:14:39 UTC

[jira] [Commented] (FLINK-2586) Unstable Storm Compatibility Tests

    [ https://issues.apache.org/jira/browse/FLINK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096579#comment-15096579 ] 

ASF GitHub Bot commented on FLINK-2586:
---------------------------------------

GitHub user mjsax opened a pull request:

    https://github.com/apache/flink/pull/1502

    [FLINK-2586] Unstable Storm Compatibility Tests

     - added BLOCKING flag to FlinkLocalCluster
     - added NullTerminatingSpout and SpoutOutputCollectorObserver plus tests
     - reworked test accordingly
       - set BLOCKING flag for ITCases
       - make infinite spouts finite using NullTerminatingSpout
       - removed sleep time to make the tests stable
     - fixed bug in BoltSplitITCase and SpoutSplitITCase
       - an exception in VerifyAndEnrichBolt was swallowed, so the test would not fail (replaced by errorFlag)
     - reduced sleep-time in BoltSplitITCase and SpoutSplitITCase to reduce testing time
     - fixed WrapperSetupHelperTest
       - reworked to original version with more than two inputs
         (was limited to two inputs because more than two inputs per bolt were not supported in the interim, which is now fixed)
     - minor code cleanup (removed unused imports, indenting, etc.)
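
For readers unfamiliar with the idea behind NullTerminatingSpout and SpoutOutputCollectorObserver, the following is a minimal, dependency-free sketch of how an infinite source can be made finite: a collector wrapper records whether anything was emitted, and the first call that emits nothing is treated as end of input. All names here (SimpleSource, ObservingCollector, TerminatingSource) are hypothetical stand-ins and are not the Storm or Flink API; the actual classes in the pull request may differ.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a spout: emits zero or more records per call. */
interface SimpleSource {
    void next(Consumer<String> collector);
}

/** Collector wrapper that remembers whether anything was emitted. */
final class ObservingCollector implements Consumer<String> {
    private final Consumer<String> delegate;
    boolean emitted;

    ObservingCollector(Consumer<String> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void accept(String record) {
        emitted = true;
        delegate.accept(record);
    }
}

/** Treats the first call that emits nothing as end of input. */
final class TerminatingSource {
    private final SimpleSource source;
    private boolean reachedEnd;

    TerminatingSource(SimpleSource source) {
        this.source = source;
    }

    void next(ObservingCollector collector) {
        collector.emitted = false;      // reset before delegating
        source.next(collector);
        if (!collector.emitted) {
            reachedEnd = true;          // nothing emitted: assume the source is exhausted
        }
    }

    boolean reachedEnd() {
        return reachedEnd;
    }
}

final class TerminatingSourceDemo {
    /** Drains a list-backed source until the wrapper reports end of input. */
    static List<String> drain(List<String> input) {
        Iterator<String> it = input.iterator();
        SimpleSource source = c -> {
            if (it.hasNext()) {
                c.accept(it.next());
            }
        };
        List<String> out = new ArrayList<>();
        ObservingCollector collector = new ObservingCollector(out::add);
        TerminatingSource wrapped = new TerminatingSource(source);
        while (!wrapped.reachedEnd()) {
            wrapped.next(collector);
        }
        return out;
    }
}
```

With such a wrapper the test no longer needs a sleep interval: the driver loop ends as soon as the source has nothing more to emit.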

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mjsax/flink flink-2586-unstable-storm-tests

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1502.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1502
    
----
commit ee830a62ae77e808b442e28acdd2a71b04758362
Author: mjsax <mj...@apache.org>
Date:   2016-01-13T16:47:32Z

    [FLINK-2586] Unstable Storm Compatibility Tests
     - added BLOCKING flag to FlinkLocalCluster
     - added NullTerminatingSpout and SpoutOutputCollectorObserver plus tests
     - reworked test accordingly
       - set BLOCKING flag for ITCases
       - make infinite spouts finite using NullTerminatingSpout
       - removed sleep time to make the tests stable
     - fixed bug in BoltSplitITCase and SpoutSplitITCase
       - an exception in VerifyAndEnrichBolt was swallowed, so the test would not fail (replaced by errorFlag)
     - reduced sleep-time in BoltSplitITCase and SpoutSplitITCase to reduce testing time
     - fixed WrapperSetupHelperTest
       - reworked to original version with more than two inputs
         (was limited to two inputs because more than two inputs per bolt were not supported in the interim, which is now fixed)
     - minor code cleanup (removed unused imports, indenting, etc.)
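
The errorFlag item in the commit message addresses a common pitfall: a failure raised on a worker thread never reaches the test thread, so the test passes anyway. A minimal sketch of the flag-based alternative follows, assuming a hypothetical verify step and class name (ErrorFlagDemo); it is not the code from VerifyAndEnrichBolt.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class ErrorFlagDemo {
    /** Set by the worker when verification fails; read by the test thread. */
    static final AtomicBoolean errorOccurred = new AtomicBoolean(false);

    /** Hypothetical verification step that runs inside the topology. */
    static void verify(int value) {
        if (value < 0) {
            // Record the failure instead of throwing, because an exception
            // thrown on the worker thread would not reach the test thread.
            errorOccurred.set(true);
        }
    }

    /** Runs verification on a separate thread and reports whether it failed. */
    static boolean runAndCheck(int... values) {
        errorOccurred.set(false);
        Thread worker = new Thread(() -> {
            for (int v : values) {
                verify(v);
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return errorOccurred.get();
    }
}
```

A test would then assert that the flag is still false after the job has finished.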

----


> Unstable Storm Compatibility Tests
> ----------------------------------
>
>                 Key: FLINK-2586
>                 URL: https://issues.apache.org/jira/browse/FLINK-2586
>             Project: Flink
>          Issue Type: Bug
>          Components: Storm Compatibility
>    Affects Versions: 0.10.0
>            Reporter: Stephan Ewen
>            Assignee: Matthias J. Sax
>            Priority: Critical
>              Labels: test-stability
>             Fix For: 1.0.0
>
>
> The Storm Compatibility tests frequently fail.
> The reason is that they kill the topologies after a certain time interval. That may fail on CI infrastructure when certain steps are delayed longer than usual. Trying to guarantee progress by time is inherently problematic:
>   - Waiting too short makes tests unstable
>   - Waiting too long makes tests slow
> The right way to go is letting the program decide when to terminate, for example by throwing a special {{SuccessException}}.
> Have a look at the Kafka connector tests; they do this a lot and hence run exactly as short or as long as they need to.
> Here is an example of a failed run: https://s3.amazonaws.com/archive.travis-ci.org/jobs/77499577/log.txt
> From FLINK-2801
> bq. The tests for the storm compatibility layer all work with timeouts (running the program for 10 seconds) and then check whether the expected result has been written.
> bq. That is inherently unstable and slow (long delays). They should be rewritten in a manner similar to, for example, the KafkaITCase tests, where the streaming jobs terminate themselves with a "SuccessException", which can be recognized as successful completion when thrown by the job client.
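
The self-termination approach described in the issue can be sketched without any Flink dependency. In the sketch below, SuccessException, runUnbounded, and runUntilSuccess are hypothetical names chosen for illustration; the real KafkaITCase tests and the Flink job client are not reproduced here. The job throws the marker exception once it has seen the expected number of records, and the caller walks the cause chain to recognize that as success.

```java
import java.util.function.IntConsumer;

/** Hypothetical marker exception: the job has seen everything it expected. */
final class SuccessException extends RuntimeException {
    SuccessException() {
        super("expected result reached");
    }
}

final class SelfTerminatingJobDemo {
    /** Stand-in for an unbounded streaming job: feeds 0, 1, 2, ... until the sink throws. */
    static void runUnbounded(IntConsumer sink) {
        for (int i = 0; ; i++) {
            sink.accept(i);
        }
    }

    /** Returns true if the job ended via SuccessException (possibly wrapped in another exception). */
    static boolean runUntilSuccess(int expectedCount) {
        if (expectedCount <= 0) {
            throw new IllegalArgumentException("expectedCount must be positive");
        }
        int[] seen = {0};
        try {
            runUnbounded(v -> {
                if (++seen[0] == expectedCount) {
                    throw new SuccessException(); // the job decides when it is done
                }
            });
        } catch (RuntimeException e) {
            // Walk the cause chain, since a runtime may wrap the original exception.
            for (Throwable t = e; t != null; t = t.getCause()) {
                if (t instanceof SuccessException) {
                    return true;
                }
            }
            throw e; // any other failure is a real error
        }
        return false; // not reached in practice: runUnbounded only exits by throwing
    }
}
```

The test then runs exactly as long as the job needs, with no fixed timeout to tune.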



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)