You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by srdo <gi...@git.apache.org> on 2018/10/21 12:49:44 UTC

[GitHub] storm pull request #2890: STORM-3268: Improve integration test stability

GitHub user srdo opened a pull request:

    https://github.com/apache/storm/pull/2890

    STORM-3268: Improve integration test stability

    https://issues.apache.org/jira/browse/STORM-3268
    
    Basically the integration test can't handle logs rolling during the test. To fix this we added a bit of time between spout emits, but I don't think it was enough. I've put a limit on how many tuples the spouts will emit during the tests, which should prevent the logs from growing too much.
    
    Some tests use multiple bolt/spout tasks, and don't wait for all executors to be started. In some cases tests could fail because e.g. two spout instances were supposed to run, but one executor was slow to start so the log for that executor wasn't picked up when the test asked Nimbus for logs for the spout component. The tests now check that Nimbus has all expected executors listed.
    
    The tests extract spout/bolt logs by looking for a magic string. If multiple components were running in the same worker, the log reading logic could pick up e.g. logs from the bolt while looking for logs from the spout. The magic string now includes the component id the log reader is looking for.
    
    Some tests were slow due to building lots of useless error messages, the error messages are constructed lazily now.
    
    The tests aren't perfectly stable, but I think this is due to an issue in Storm where tuples can be lost if the spout starts emitting before the bolt executor is ready to receive. I've seen the spout emit a bunch of tuples before the bolt worker started, and those tuples end up timing out.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srdo/storm STORM-3268

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2890.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2890
    
----
commit ef9d907d457d9d5ce9af42b88c8a11d16b7facfe
Author: Stig Rohde Døssing <sr...@...>
Date:   2018-10-21T10:55:49Z

    STORM-3267: Disable Java 10 build

commit 5b066609b0dbeec2bb842ddd5a81ae8c32ac9719
Author: Stig Rohde Døssing <sr...@...>
Date:   2018-10-20T20:17:26Z

    STORM-3268: Try to improve integration test stability

----


---

[GitHub] storm pull request #2890: STORM-3268: Improve integration test stability

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2890#discussion_r226989550
  
    --- Diff: integration-test/src/main/java/org/apache/storm/st/topology/TestableTopology.java ---
    @@ -17,14 +17,18 @@
     
     package org.apache.storm.st.topology;
     
    +import java.util.concurrent.TimeUnit;
     import org.apache.storm.generated.StormTopology;
     
     public interface TestableTopology {
         String DUMMY_FIELD = "dummy";
    -    //Some tests rely on reading the worker log. If emits are too close together and too much is logged, the log might roll, breaking the test.
    -    int MIN_SLEEP_BETWEEN_EMITS_MS = 10;
    -    int MAX_SLEEP_BETWEEN_EMITS_MS = 100;
    +    int TIMEDATA_SLEEP_BETWEEN_EMITS_MS = 20;
    +    //Some tests rely on reading the worker log. If there are too many emits and too much is logged, the log might roll, breaking the test.
    --- End diff --
    
    I wonder if we could instead configure the logs to roll less often, since that is the real issue?


---

[GitHub] storm pull request #2890: STORM-3268: Improve integration test stability

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/2890


---

[GitHub] storm pull request #2890: STORM-3268: Improve integration test stability

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2890#discussion_r227033367
  
    --- Diff: integration-test/src/main/java/org/apache/storm/st/topology/TestableTopology.java ---
    @@ -17,14 +17,18 @@
     
     package org.apache.storm.st.topology;
     
    +import java.util.concurrent.TimeUnit;
     import org.apache.storm.generated.StormTopology;
     
     public interface TestableTopology {
         String DUMMY_FIELD = "dummy";
    -    //Some tests rely on reading the worker log. If emits are too close together and too much is logged, the log might roll, breaking the test.
    -    int MIN_SLEEP_BETWEEN_EMITS_MS = 10;
    -    int MAX_SLEEP_BETWEEN_EMITS_MS = 100;
    +    int TIMEDATA_SLEEP_BETWEEN_EMITS_MS = 20;
    +    //Some tests rely on reading the worker log. If there are too many emits and too much is logged, the log might roll, breaking the test.
    --- End diff --
    
    Yes, we could disable log rolling by customizing the cluster's `log4j2.xml`. I wanted to avoid doing that because it means you can't (reliably) run the integration test against your own cluster without having to edit `log4j2.xml` as well.


---

[GitHub] storm pull request #2890: STORM-3268: Improve integration test stability

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on a diff in the pull request:

    https://github.com/apache/storm/pull/2890#discussion_r227036635
  
    --- Diff: integration-test/src/main/java/org/apache/storm/st/topology/TestableTopology.java ---
    @@ -17,14 +17,18 @@
     
     package org.apache.storm.st.topology;
     
    +import java.util.concurrent.TimeUnit;
     import org.apache.storm.generated.StormTopology;
     
     public interface TestableTopology {
         String DUMMY_FIELD = "dummy";
    -    //Some tests rely on reading the worker log. If emits are too close together and too much is logged, the log might roll, breaking the test.
    -    int MIN_SLEEP_BETWEEN_EMITS_MS = 10;
    -    int MAX_SLEEP_BETWEEN_EMITS_MS = 100;
    +    int TIMEDATA_SLEEP_BETWEEN_EMITS_MS = 20;
    +    //Some tests rely on reading the worker log. If there are too many emits and too much is logged, the log might roll, breaking the test.
    --- End diff --
    
    OK, I assumed there was a way for the test itself to manipulate this. If not, then no worries.


---

[GitHub] storm issue #2890: STORM-3268: Improve integration test stability

Posted by d2r <gi...@git.apache.org>.
Github user d2r commented on the issue:

    https://github.com/apache/storm/pull/2890
  
    Depends on #2889 



---