Posted to notifications@accumulo.apache.org by "Bill Havanki (JIRA)" <ji...@apache.org> on 2014/04/05 01:59:17 UTC

[jira] [Commented] (ACCUMULO-2621) Masters not restarting during concurrent randomwalk

    [ https://issues.apache.org/jira/browse/ACCUMULO-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960819#comment-13960819 ] 

Bill Havanki commented on ACCUMULO-2621:
----------------------------------------

The problem may be centered on the {{Shutdown}} action. The action uses a {{MasterClient}} to keep connecting to the master until a connection attempt fails, and it treats that failure as proof that the master has exited. However, I see connection refused errors immediately after the shutdown begins, while the master process is probably still running. The ensuing restart of the master would then probably fail because the old one is still up. More investigation is needed.
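
To illustrate the distinction, here is a minimal sketch of a wait loop that does not treat a failed connection alone as proof of exit. It is not the actual {{Shutdown}} code and does not use {{MasterClient}}; the class {{ShutdownWait}}, the method {{waitForExit}}, and both probes are hypothetical names. How "process exited" would really be determined (a process check, a released lock, or something else) is left as an assumption for whoever supplies the second probe.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Accumulo.
final class ShutdownWait {

    /**
     * Polls until BOTH conditions hold in the same attempt:
     *   connectionFails - a connection attempt to the master was refused
     *   processExited   - an independent signal that the old master is gone
     * Returns true once both hold, or false after maxAttempts polls
     * (or if the waiting thread is interrupted).
     */
    static boolean waitForExit(BooleanSupplier connectionFails,
                               BooleanSupplier processExited,
                               int maxAttempts,
                               long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // A refused connection only shows the port is closed;
            // the second probe is what confirms the process is gone.
            if (connectionFails.getAsBoolean() && processExited.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return false;
    }
}
```

With a loop shaped like this, the restart would begin only after {{waitForExit}} returns true. A false return would indicate that the old master never went away within the allowed number of polls, which is a different failure from the restart itself failing.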

> Masters not restarting during concurrent randomwalk
> ---------------------------------------------------
>
>                 Key: ACCUMULO-2621
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2621
>             Project: Accumulo
>          Issue Type: Test
>          Components: test
>            Reporter: Bill Havanki
>            Priority: Critical
>              Labels: 16_qa_bug, randomwalk, test
>             Fix For: 1.6.1, 1.7.0
>
>
> The Concurrent randomwalk test can stop and restart the masters. Under 1.6.0-SNAPSHOT, the stopped masters are not restarting, and eventually the test becomes stuck reporting "No matchers..." forever.
> Tested on a 7-node CentOS 6.4 cluster with 2 masters. The active master seems to die first, followed by the standby that becomes the new master.


