Posted to commits@cassandra.apache.org by "Vladimir Avram (JIRA)" <ji...@apache.org> on 2014/07/17 01:25:04 UTC

[jira] [Updated] (CASSANDRA-7560) 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession

     [ https://issues.apache.org/jira/browse/CASSANDRA-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vladimir Avram updated CASSANDRA-7560:
--------------------------------------

    Description: 
Running {{nodetool repair -pr}} will sometimes hang on one of the resulting AntiEntropySessions.

The system logs will show the repair command starting:

{noformat}
 INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) Starting repair command #1, repairing 256 ranges for keyspace x
{noformat}

You can then see a few AntiEntropySessions completing with:

{noformat}
INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed successfully
{noformat}

Eventually one AntiEntropySession hangs just before requesting the merkle trees for the next column family in line for repair. The log shows the previous CF finishing, and then the whole repair session hangs with no visible progress or errors on this node or any of the related nodes.

{noformat}
INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully synced
{noformat}
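
As a rough aid for spotting such a session, the sketch below scans log lines for the {{[repair #<id>]}} tag and reports every session that never logged "session completed successfully", together with the last line seen for it. It assumes the log format matches the excerpts above. The function name {{find_unfinished_sessions}} is hypothetical and not part of Cassandra.

```python
import re

# Matches the "[repair #<session id>]" tag seen in the log excerpts above.
REPAIR_TAG = re.compile(r"\[repair #([0-9a-f-]+)\]")
# Completion message as it appears in the RepairSession log line above.
COMPLETED = "session completed successfully"


def find_unfinished_sessions(lines):
    """Return {session_id: last_line_seen} for sessions with no completion line."""
    last_seen = {}
    completed = set()
    for line in lines:
        match = REPAIR_TAG.search(line)
        if not match:
            continue  # not a repair-session line
        session_id = match.group(1)
        last_seen[session_id] = line.strip()
        if COMPLETED in line:
            completed.add(session_id)
    return {sid: text for sid, text in last_seen.items() if sid not in completed}


if __name__ == "__main__":
    # Sample input: the two session lines quoted in this report.
    sample = [
        "INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 282) "
        "[repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed successfully",
        "INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 221) "
        "[repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully synced",
    ]
    for sid, text in find_unfinished_sessions(sample).items():
        print(sid, "->", text)
```

A session reported here is not necessarily hung; it may simply still be running, so the timestamp of its last line needs to be compared against the current time.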

  was:
Running {{nodetool repair -pr}} will sometimes hang on one of the resulting AntiEntropySessions.

The system logs will show the repair command starting

{panel}
 INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) Starting repair command #1, repairing 256 ranges for keyspace x
{panel}

You can then see a few AntiEntropySessions completing with:

{panel}
INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed successfully
{panel}

Finally we reach an AntiEntropySession at some point that hangs just before requesting the merkle trees for the next column family in line for repair. So we first see the previous CF being finished and the whole repair sessions hangs here with no visible progress or errors on this or any of the related nodes.

{panel}
INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully synced
{panel}


> 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession
> ----------------------------------------------------------------------
>
>                 Key: CASSANDRA-7560
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7560
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Vladimir Avram
>
> Running {{nodetool repair -pr}} will sometimes hang on one of the resulting AntiEntropySessions.
> The system logs will show the repair command starting:
> {noformat}
>  INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) Starting repair command #1, repairing 256 ranges for keyspace x
> {noformat}
> You can then see a few AntiEntropySessions completing with:
> {noformat}
> INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed successfully
> {noformat}
> Eventually one AntiEntropySession hangs just before requesting the merkle trees for the next column family in line for repair. The log shows the previous CF finishing, and then the whole repair session hangs with no visible progress or errors on this node or any of the related nodes.
> {noformat}
> INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully synced
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)