Posted to commits@samza.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/01/27 01:20:00 UTC

[jira] [Commented] (SAMZA-1568) Handle ZkInterruptedException in zkclient.close.

    [ https://issues.apache.org/jira/browse/SAMZA-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341879#comment-16341879 ] 

ASF GitHub Bot commented on SAMZA-1568:
---------------------------------------

GitHub user shanthoosh opened a pull request:

    https://github.com/apache/samza/pull/416

    SAMZA-1568: Handle ZkInterruptedException in zkclient.close.

    When ZooKeeper session failures occur in a stream processor, the processor leaves the group (its zkClient is closed) and joins the group again.
    
    The last step in that shutdown sequence is zkClient.close(). In some scenarios, it throws the following exception:
    
        org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
        at org.I0Itec.zkclient.ZkClient.close(ZkClient.java:1278)
        at org.apache.samza.zk.ZkControllerImpl.stop(ZkControllerImpl.java:92)
        at org.apache.samza.zk.ZkJobCoordinator.stop(ZkJobCoordinator.java:141)
    
    In the existing implementation this exception is not handled, thereby killing the stream processor. The following code path triggers it:
    
    `StreamProcessor.stop() -> ZkJobCoordinator.stop() -> zkController.stop() -> zkUtils.close()`
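
One way to keep an interrupt during close from escaping up this call chain is to catch it at the close call and restore the thread's interrupt flag. The following is only a minimal sketch of that idea, not the change in this pull request: `SketchInterruptedException` and `SketchClient` are hypothetical stand-ins for `ZkInterruptedException` and the ZooKeeper client, so that the example runs without the zkclient library.

```java
public class ZkCloseSketch {

    /** Hypothetical stand-in for the unchecked ZkInterruptedException. */
    static class SketchInterruptedException extends RuntimeException {
        SketchInterruptedException(InterruptedException cause) {
            super(cause);
        }
    }

    /** Hypothetical stand-in for a client whose close() may be interrupted. */
    interface SketchClient {
        void close();
    }

    /**
     * Closes the client without letting an interrupt propagate as an exception.
     * Returns true on a clean close and false if the close was interrupted.
     */
    static boolean closeHandlingInterrupt(SketchClient client) {
        try {
            client.close();
            return true;
        } catch (SketchInterruptedException e) {
            // Log and carry on with the rest of the shutdown sequence.
            System.err.println("Interrupted while closing client; continuing shutdown: " + e);
            // Restore the interrupt flag so callers can still observe the interrupt.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        boolean clean = closeHandlingInterrupt(() -> {
            throw new SketchInterruptedException(new InterruptedException());
        });
        System.out.println("closed cleanly: " + clean);
    }
}
```

Whether to restore the interrupt flag, clear it and retry the close, or simply log is a design choice that depends on what the calling thread does next; the sketch picks the first option only for illustration.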
    
    This exception causes the integration test to fail occasionally and can cause a LocalApplicationRunner.waitForFinish call to block indefinitely (since the success of this callback event updates the latch state required for waitForFinish to end).
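
The latch interaction described above can be illustrated in isolation. The sketch below is a simplified model, not Samza's LocalApplicationRunner: the class `ShutdownLatchSketch` and its `stop` and `waitForFinish` methods are hypothetical, and the wait uses a timeout only so that the example always terminates. It shows that counting the latch down in a `finally` block lets the waiter return even when the shutdown step throws.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ShutdownLatchSketch {

    // Released once the shutdown step has run, whether or not it succeeded.
    private final CountDownLatch shutdownLatch = new CountDownLatch(1);

    /** Runs the shutdown step; the latch is released even if the step throws. */
    void stop(Runnable closeStep) {
        try {
            closeStep.run();
        } catch (RuntimeException e) {
            System.err.println("Shutdown step failed; still signalling completion: " + e);
        } finally {
            // Without this, a throwing closeStep would leave waiters blocked.
            shutdownLatch.countDown();
        }
    }

    /** Waits up to timeoutMillis for shutdown; returns true if it completed. */
    boolean waitForFinish(long timeoutMillis) {
        try {
            return shutdownLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ShutdownLatchSketch runner = new ShutdownLatchSketch();
        runner.stop(() -> {
            throw new IllegalStateException("simulated close failure");
        });
        System.out.println("finished: " + runner.waitForFinish(100));
    }
}
```

If the `countDown()` call sat after `closeStep.run()` inside the `try` block instead, the simulated failure would skip it and an untimed wait would never return, which is the blocking behaviour reported here.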
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shanthoosh/samza zk_utils_close

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/416.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #416
    
----
commit 2542a0ad1bee19927761eaf0171a6f637f21ac3a
Author: Shanthoosh Venkataraman <sv...@...>
Date:   2018-01-25T19:44:55Z

    SAMZA-1568: Handle ZkInterruptedException in zkclient.close.

----


> Handle ZkInterruptedException in zkclient.close.
> ------------------------------------------------
>
>                 Key: SAMZA-1568
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1568
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>
> When ZooKeeper session failures occur in a stream processor, the processor leaves the group (its zkClient is closed) and joins the group again.
> The last step in that shutdown sequence is zkClient.close(). In some scenarios, it throws the following exception:
> {code:java}
>     org.I0Itec.zkclient.exception.ZkInterruptedException: java.lang.InterruptedException
>     at org.I0Itec.zkclient.ZkClient.close(ZkClient.java:1278)
>     at org.apache.samza.zk.ZkControllerImpl.stop(ZkControllerImpl.java:92)
>     at org.apache.samza.zk.ZkJobCoordinator.stop(ZkJobCoordinator.java:141)
> {code}
> In the existing implementation this exception is not handled, thereby killing the stream processor. The following code path triggers it:
> {code:java}
> StreamProcessor.stop() -> ZkJobCoordinator.stop() -> zkController.stop() -> zkUtils.close()
> {code}
> This exception causes the integration test to fail occasionally and can cause a LocalApplicationRunner.waitForFinish call to block indefinitely (since the success of this callback event updates the latch state required for waitForFinish to end).


