Posted to dev@storm.apache.org by "Michael Noll (JIRA)" <ji...@apache.org> on 2014/06/26 16:10:24 UTC

[jira] [Updated] (STORM-163) Simulated Cluster Time doesn't work well for me.

     [ https://issues.apache.org/jira/browse/STORM-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Noll updated STORM-163:
-------------------------------

    Description: 
https://github.com/nathanmarz/storm/issues/512

{code}
(deftest test-builtin-metrics-2
  (with-simulated-time-local-cluster
    [cluster :daemon-conf {TOPOLOGY-ENABLE-MESSAGE-TIMEOUTS true
                           TOPOLOGY-MESSAGE-TIMEOUT-SECS 5}]
    (let [feeder (feeder-spout ["field1"])
          tracker (AckFailMapTracker.)
          _ (.setAckFailDelegate feeder tracker)
          topology (thrift/mk-topology
                    {"myspout" (thrift/mk-spout-spec feeder)}
                    {"mybolt" (thrift/mk-bolt-spec {"myspout" :shuffle} ack-every-other)})]
      (submit-local-topology (:nimbus cluster)
                             "metrics-tester"
                             {TOPOLOGY-ENABLE-MESSAGE-TIMEOUTS true
                              TOPOLOGY-MESSAGE-TIMEOUT-SECS 5}
                             topology)
      ;; Feed two tuples with message ids 1 and 2.
      (.feed feeder ["a"] 1)
      (.feed feeder ["b"] 2)
      ;; Advance simulated time by 10 seconds, past the 5-second message timeout.
      (advance-cluster-time cluster 10)
      ;; Expect message id 2 to be reported as failed.
      (assert-failed tracker 2))))
{code}

The above unit test just hangs.
This isn't a one-off: a whole class of tests behave this way when the amount passed to advance-cluster-time is close to (but greater than) message-timeout-secs.

I noticed that when I added system executors in order to get heap size metrics, an existing metrics unit test started to fail at assert-failed where it had previously succeeded. So it seems that the amount by which advance-cluster-time has to exceed the message timeout is not constant. This might explain why the ZooKeeper 3.4.5 upgrade caused unit tests to hang (where the mk-in-process ZooKeeper has slower performance and start-up time).

The test also passes if I run it on its own with a lein2 test selector, but fails if I run all unit tests together.

I spent 6 hours trying to fix the unit tests, but haven't figured it out yet.
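
While the root cause is being tracked down, one way to make a test like this fail with a clear result instead of hanging is to put a wall-clock bound on the wait. The following is only a minimal sketch: `wait-until` is a hypothetical helper, not part of Storm's testing API, and it assumes the check in question can be expressed as a predicate that is polled repeatedly.

```clojure
(defn wait-until
  "Polls `pred` every `sleep-ms` milliseconds until it returns a truthy value
  or `timeout-ms` milliseconds of wall-clock time have elapsed.
  Returns true if `pred` succeeded and false if the deadline passed first."
  [pred timeout-ms sleep-ms]
  (let [deadline (+ (System/currentTimeMillis) timeout-ms)]
    (loop []
      (cond
        (pred) true
        (>= (System/currentTimeMillis) deadline) false
        :else (do (Thread/sleep sleep-ms)
                  (recur))))))

;; Stand-alone demonstration with a predicate that succeeds on its third call.
(def calls (atom 0))
(println (wait-until #(>= (swap! calls inc) 3) 1000 1)) ; prints true
;; A predicate that never succeeds makes the call give up after about 50 ms.
(println (wait-until (constantly false) 50 1))          ; prints false
```

In the test above, the final assertion could then be written along the lines of `(is (wait-until #(.isFailed tracker 2) 10000 10))`, assuming the tracker exposes an `isFailed` method; the 10-second bound and 10 ms polling interval are arbitrary values chosen for illustration. This does not fix the timing problem itself, it only turns a hang into an ordinary test failure.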



> Simulated Cluster Time doesn't work well for me.
> ------------------------------------------------
>
>                 Key: STORM-163
>                 URL: https://issues.apache.org/jira/browse/STORM-163
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>            Reporter: James Xu
>            Priority: Minor


