You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@storm.apache.org by revans2 <gi...@git.apache.org> on 2016/09/20 22:05:44 UTC

[GitHub] storm pull request #1697: STORM-2018: Supervisor V2

GitHub user revans2 opened a pull request:

    https://github.com/apache/storm/pull/1697

    STORM-2018: Supervisor V2

    Still need to do some more manual testing but the unit tests passed for me.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/revans2/incubator-storm STORM-2018-1.x

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1697.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1697
    
----
commit 318ab5f300e0820ab862cbeca04b7cb67699b938
Author: Robert (Bobby) Evans <ev...@yahoo-inc.com>
Date:   2016-09-20T19:59:07Z

    STORM-2018: Supervisor V2

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 @srdo 
    I'm even OK if we file an issue regarding intermittent race condition for local cluster and merge this now, since the race condition of Supervisor in 1.x is much critical. It even occurs on clustered environment.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 Shouldn't healthcheck.clj be deleted? At least for me HealthCheck.java clashes with healthcheck.clj. I can't clearly say why, might be specific issue with OSX, but anyway there's an issue. I left a comment regarding this.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 
    It would be better to address STORM-2131 here as well. Please pull #1724 here.
    And could you update the pull request according to the review comments? Supervisor V2 is the one I would want to include to 1.1.0.
    
    Thanks in advance.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    While running build in storm-core I found that null/storm-local directory is created in storm-core. Maybe there's a case base path is set to null.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    Just pulled in the latest set of bug fixes from master.  All known issues have been addressed and we have been running in staging with various versions of this patch for over a week now.  Expect to roll out to production fairly soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 Relevant PRs (#1699 #1700 #1705 #1712) are all merged to master. Please pull them here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 Do you have any updates on this? I'm occasionally seeing Supervisor failures so would like to get this merged to 1.x, and even 1.0.x.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    I seem to be getting a few new errors when running some of our own unit tests with this branch. The exceptions are intermittent.
    
    ``` 
    java.lang.NullPointerException
    	at org.apache.storm.utils.DisruptorQueue$FlusherPool.stop(DisruptorQueue.java:110)
    	at org.apache.storm.utils.DisruptorQueue$Flusher.close(DisruptorQueue.java:293)
    	at org.apache.storm.utils.DisruptorQueue.haltWithInterrupt(DisruptorQueue.java:410)
    	at org.apache.storm.disruptor$halt_with_interrupt_BANG_.invoke(disruptor.clj:77)
    	at org.apache.storm.daemon.executor$mk_executor$reify__4923.shutdown(executor.clj:412)
    	at sun.reflect.GeneratedMethodAccessor303.invoke(Unknown Source)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
    	at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313)
    	at org.apache.storm.daemon.worker$fn__5550$exec_fn__1372__auto__$reify__5552$shutdown_STAR___5572.invoke(worker.clj:668)
    	at org.apache.storm.daemon.worker$fn__5550$exec_fn__1372__auto__$reify$reify__5598.shutdown(worker.clj:706)
    	at org.apache.storm.ProcessSimulator.killProcess(ProcessSimulator.java:66)
    	at org.apache.storm.ProcessSimulator.killAllProcesses(ProcessSimulator.java:79)
    	at org.apache.storm.testing$kill_local_storm_cluster.invoke(testing.clj:207)
    	at org.apache.storm.testing4j$_withLocalCluster.invoke(testing4j.clj:93)
    	at org.apache.storm.Testing.withLocalCluster(Unknown Source)
    ```
    
    and this kind of error
    ```
    java.lang.IllegalStateException: Timer is not active
    	at org.apache.storm.timer$check_active_BANG_.invoke(timer.clj:87)
    	at org.apache.storm.timer$cancel_timer.invoke(timer.clj:120)
    	at org.apache.storm.daemon.worker$fn__5550$exec_fn__1372__auto__$reify__5552$shutdown_STAR___5572.invoke(worker.clj:682)
    	at org.apache.storm.daemon.worker$fn__5550$exec_fn__1372__auto__$reify$reify__5598.shutdown(worker.clj:706)
    	at org.apache.storm.ProcessSimulator.killProcess(ProcessSimulator.java:66)
    	at org.apache.storm.ProcessSimulator.killAllProcesses(ProcessSimulator.java:79)
    	at org.apache.storm.testing$kill_local_storm_cluster.invoke(testing.clj:207)
    	at org.apache.storm.testing4j$_withLocalCluster.invoke(testing4j.clj:93)
    	at org.apache.storm.Testing.withLocalCluster(Unknown Source)
    ```
    
    Our tests are running Storm in local mode with no time simulation. I've tried running the same tests on 1.x-branch, and these don't seem to occur there.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    I just cherry-picked commit which excludes logs from RAT. It's merged to master but was part of port work so didn't port back.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    Just pushed the upmerged code.  Will look into pulling in #1724 too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    OK. Unit and integration tests, and manual tests passed on recent commits.
    I'm +1 and will merge this now since there's no more reviewer and this was open for fairly long, more than 1 month despite it's kind: backport.
    
    Thanks for the amazing work.
    
    Btw, I'm in favor of just having 1.1.0, not having 1.0.3 unless there's specific request on it. If you  would be OK to port back on demand, we could skip it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    Merged in #1724 now too (it was a trivial cherry pick).  
    
    @HeartSaVioR if you want to take a look this should be good for merging in.
    
    Just as an FYI we have been running with a version of this in production for a little while now with no real issues.
    
    Once this goes in if you want me to I can take a look at pulling it back to 1.0.x too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Re: [GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by Patricia McDade <pa...@gmail.com>.
Can you please let me know how to unsubscribe I have followed instructions
without success

On Sep 21, 2016 11:08 AM, "revans2" <gi...@git.apache.org> wrote:

> Github user revans2 commented on the issue:
>
>     https://github.com/apache/storm/pull/1697
>
>     I ran the same set of manual tests as before, but I now want to wait
> on #1699 to go into master, and then I will pull it in here.  We are in the
> process of rolling essentially what is this same patch out to staging at
> Yahoo, and plan to roll it out to production shortly too.  If others are
> feeling uncomfortable about merging this into the 1.x line I am happy to
> wait until we have it in production.
>
>
> ---
> If your project is set up for it, you can reply to this email and have your
> reply appear on GitHub as well. If your project does not have this feature
> enabled and wishes so, or if the feature is enabled but not working, please
> contact infrastructure at infrastructure@apache.org or file a JIRA ticket
> with INFRA.
> ---
>

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    I ran the same set of manual tests as before, but I now want to wait on #1699 to go into master, and then I will pull it in here.  We are in the process of rolling essentially what is this same patch out to staging at Yahoo, and plan to roll it out to production shortly too.  If others are feeling uncomfortable about merging this into the 1.x line I am happy to wait until we have it in production.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by srdo <gi...@git.apache.org>.
Github user srdo commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @revans2 I can't share our actual test code since it depends on pretty large chunks of our codebase. I'll try reproducing with an example topology.
    
    I'd be fine with filing a separate issue to fix the race so this PR isn't blocked.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    We also found #1700 so once that goes in I'll pull it in here too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1697#discussion_r79723534
  
    --- Diff: pom.xml ---
    @@ -1067,8 +1067,8 @@
                     <groupId>org.apache.maven.plugins</groupId>
                     <artifactId>maven-compiler-plugin</artifactId>
                     <configuration>
    -                    <source>1.7</source>
    -                    <target>1.7</target>
    +                    <source>1.8</source>
    +                    <target>1.8</target>
    --- End diff --
    
    Oops forgot to revert this, will go back to 1.7


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1697: STORM-2018: Supervisor V2

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/storm/pull/1697


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    The test failures look unrelated.  Some are rat failures caused by test logs not being excluded.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @HeartSaVioR Sorry this has taken so long.  I am going to upmerge this and pull in #1724 next.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    I found two issues, but other than that manual tests passed. Code review is already done from PR for master branch. +1 once these are resolved.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1697: STORM-2018: Supervisor V2

Posted by HeartSaVioR <gi...@git.apache.org>.
Github user HeartSaVioR commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1697#discussion_r81442669
  
    --- Diff: storm-core/src/jvm/org/apache/storm/command/HealthCheck.java ---
    @@ -0,0 +1,126 @@
    +/**
    --- End diff --
    
    This file seems to be clashed with healthcheck.clj.
    
    ```
    2016-10-01 11:53:24.105 timer o.a.s.d.s.DefaultUncaughtExceptionHandler [ERROR] Error when processing event
    java.lang.NoClassDefFoundError: org/apache/storm/command/HealthCheck (wrong name: org/apache/storm/command/healthcheck)
    	at java.lang.ClassLoader.defineClass1(Native Method)
    	at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
    	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    	at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    	at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    	at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    	at org.apache.storm.daemon.supervisor.timer.SupervisorHealthCheck.run(SupervisorHealthCheck.java:36)
    	at org.apache.storm.StormTimer$1.run(StormTimer.java:190)
    	at org.apache.storm.StormTimer$StormTimerTask.run(StormTimer.java:83)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @HeartSaVioR good catch, I thought I had deleted it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    OK so going through the code in both cases it looks like the only way that can happen is if the workers is somehow being shut down multiple times.  My guess is that because the slots are on different threads there is a race now between shutting down a worker through the slot and shutting down the worker through the cluster shutting down.
    
    I'll look into reproducing it.  @srdo is there any way you can share your test case with us?  It would make my job a lot simpler of trying to reproduce and fix it.  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    Still have #1699 and #1712 to backport before this is ready


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm pull request #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1697#discussion_r79856121
  
    --- Diff: pom.xml ---
    @@ -1067,8 +1067,8 @@
                     <groupId>org.apache.maven.plugins</groupId>
                     <artifactId>maven-compiler-plugin</artifactId>
                     <configuration>
    -                    <source>1.7</source>
    -                    <target>1.7</target>
    +                    <source>1.8</source>
    +                    <target>1.8</target>
    --- End diff --
    
    DONE


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] storm issue #1697: STORM-2018: Supervisor V2

Posted by revans2 <gi...@git.apache.org>.
Github user revans2 commented on the issue:

    https://github.com/apache/storm/pull/1697
  
    @srdo sounds good.  I filed https://issues.apache.org/jira/browse/STORM-2175 to address the race condition.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---