You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Chip Salzenberg <re...@gmail.com> on 2010/11/14 22:49:12 UTC

repair takes two days, and ends up stuck: stream at 1096% (yes, really)

My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
replication MUTATEs during load, so I decided to fix replication copies with
a "nodetool repair" on one of the nodes (X.21).  The repair has been running
for two days, and has finally gotten itself wedged into a state where it
can't proceed.

The log on X.21 continually describe the need to stream a data file,
unsuccessfully.  From other clues below I gather this is a receive stream.
This message repeated many many times, multiple per second, but has now
stopped:

INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line 124)
Streaming of file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
         progress=90112/219682197079 - 0% from
org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed: requesting a
retry.

Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
given stream has been attempted a few times and is still being attempted,
but in a broken way, such that the progress percentage has gone way past
100%.  It's now at 1096% and still rising.

I'm not rebooting so I can poke around as devs suggest.  I'm also not
sending logs to the list, at least in part because they're, well, big.  If
any developers want them, though, I'm happy to send them.

-------------------------------------------------------------------------------------------------------------------
Mode: Normal
Streaming to: /X.21
   /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
         progress=2408587638800/219682197079 - 1096%
        <---- see this
   /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
         progress=0/182528797 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
         progress=0/908075169 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
         progress=0/784362565 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
         progress=0/896956312 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
         progress=0/894019840 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
         progress=0/901377643 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
         progress=0/22306924 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
         progress=0/888814566 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
         progress=0/889095219 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
         progress=0/893034298 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
         progress=0/371718620 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
         progress=0/14172830870 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
         progress=0/8939407316 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
         progress=0/897417147 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
         progress=0/357220526 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
         progress=0/899103394 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
         progress=0/898165901 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
         progress=0/13323957368 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
         progress=0/892116147 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
         progress=0/28216239303 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
         progress=0/307921317 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
         progress=0/185277927 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
         progress=0/893683568 - 0%
Streaming from: /X.21
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
         progress=0/176842211 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
         progress=0/447272883 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
         progress=0/444440243 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
         progress=0/14275850800 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
         progress=0/31878407176 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
         progress=0/446800028 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
         progress=0/367116560 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
         progress=0/445241132 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
         progress=0/4497953871 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
         progress=0/449662908 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
         progress=0/454101872 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
         progress=0/447381444 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
         progress=0/208633237 - 0%
Pool Name                    Active   Pending      Completed
Commands                        n/a         0      709910236
Responses                       n/a         0      363174385
-------------------------------------------------------------------------------------------------------------------


Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
that are at 0% and haven't moved, AFAICT, for at least an hour:

-------------------------------------------------------------------------------------------------------------------
Mode: Normal
Streaming to: /X.20
   /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
         progress=0/208633237 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
         progress=0/367116560 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
         progress=0/4497953871 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
         progress=0/444440243 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
         progress=0/176842211 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
         progress=0/14275850800 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
         progress=0/445241132 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
         progress=0/31878407176 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
         progress=0/447272883 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
         progress=0/446800028 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
         progress=0/449662908 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
         progress=0/447381444 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
         progress=0/454101872 - 0%
Streaming to: /X.22
   /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
         progress=0/887780227 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
         progress=0/885896138 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
         progress=0/892560053 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
         progress=0/888436251 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
         progress=0/893446845 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
         progress=0/889058842 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
         progress=0/61505031301 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
         progress=0/887620464 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
         progress=0/890820399 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
         progress=0/887535512 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
         progress=0/16876107370 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
         progress=0/893839028 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
         progress=0/891862436 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
         progress=0/886276363 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
         progress=0/388239771 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
         progress=0/907731463 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
         progress=0/893114355 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
         progress=0/888682755 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
         progress=0/885144435 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
         progress=0/890937418 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
         progress=0/33902556016 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
         progress=0/897426603 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
         progress=0/888711957 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
         progress=0/208633237 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
         progress=0/893954909 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
         progress=0/897265056 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
         progress=0/888998653 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
         progress=0/901053427 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
         progress=0/4497953871 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
         progress=0/891732427 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
         progress=0/888267065 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
         progress=0/889998928 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
         progress=0/888528931 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
         progress=0/14275850800 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
         progress=0/893535664 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
         progress=0/891375566 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
         progress=0/897994571 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
         progress=0/897589898 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
         progress=0/891079134 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
         progress=0/892852094 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
         progress=0/885983148 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
         progress=0/886424157 - 0%
   /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
         progress=0/889222127 - 0%
Streaming from: /X.20
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
         progress=0/357220526 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
         progress=0/371718620 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
         progress=0/888814566 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
         progress=0/889095219 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
         progress=0/307921317 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
         progress=0/13323957368 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
         progress=0/28216239303 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
         progress=0/14172830870 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
         progress=0/899103394 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
         progress=0/894019840 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
         progress=0/185277927 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
         progress=0/896956312 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
         progress=0/893683568 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
         progress=0/182528797 - 0%
   Attrs: /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
         progress=0/22306924 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
         progress=0/897417147 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
         progress=0/8939407316 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
         progress=0/908075169 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
         progress=0/892116147 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
         progress=0/898165901 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
         progress=0/219682197079 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
         progress=0/784362565 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
         progress=0/893034298 - 0%
   Attrs:
/var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
         progress=0/901377643 - 0%
 Nothing streaming from /10.5.5.22
Pool Name                    Active   Pending      Completed
Commands                        n/a         0      433633667
Responses                       n/a         0      402612386

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Reverend Chip <re...@gmail.com>.
Did I answer the question sufficiently?  I need repair to work, and the
cluster is sick.

On 11/14/2010 2:17 PM, Jonathan Ellis wrote:
> What exception is causing it to fail/retry?
>
> On Sun, Nov 14, 2010 at 3:49 PM, Chip Salzenberg <re...@gmail.com> wrote:
>> My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
>> replication MUTATEs during load, so I decided to fix replication copies with
>> a "nodetool repair" on one of the nodes (X.21).  The repair has been running
>> for two days, and has finally gotten itself wedged into a state where it
>> can't proceed.
>> The log on X.21 continually describe the need to stream a data file,
>> unsuccessfully.  From other clues below I gather this is a receive stream.
>> This message repeated many many times, multiple per second, but has now
>> stopped:
>> INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line 124)
>> Streaming of file
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>          progress=90112/219682197079 - 0% from
>> org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed: requesting a
>> retry.
>> Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
>> given stream has been attempted a few times and is still being attempted,
>> but in a broken way, such that the progress percentage has gone way past
>> 100%.  It's now at 1096% and still rising.
>> I'm not rebooting so I can poke around as devs suggest.  I'm also not
>> sending logs to the list, at least in part because they're, well, big.  If
>> any developers want them, though, I'm happy to send them.
>> -------------------------------------------------------------------------------------------------------------------
>> Mode: Normal
>> Streaming to: /X.21
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>          progress=2408587638800/219682197079 - 1096%
>>         <---- see this
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>>          progress=0/182528797 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>>          progress=0/908075169 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>>          progress=0/784362565 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>>          progress=0/896956312 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>>          progress=0/894019840 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>>          progress=0/901377643 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>>          progress=0/22306924 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>>          progress=0/888814566 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>>          progress=0/889095219 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>>          progress=0/893034298 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>>          progress=0/371718620 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>>          progress=0/14172830870 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>>          progress=0/8939407316 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>>          progress=0/897417147 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>>          progress=0/357220526 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>>          progress=0/899103394 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>>          progress=0/898165901 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>>          progress=0/13323957368 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>>          progress=0/892116147 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>>          progress=0/28216239303 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>>          progress=0/307921317 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>>          progress=0/185277927 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>>          progress=0/893683568 - 0%
>> Streaming from: /X.21
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>>          progress=0/176842211 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>>          progress=0/447272883 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>>          progress=0/444440243 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>          progress=0/14275850800 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>>          progress=0/31878407176 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>>          progress=0/446800028 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>>          progress=0/367116560 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>>          progress=0/445241132 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>          progress=0/4497953871 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>>          progress=0/449662908 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>>          progress=0/454101872 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>>          progress=0/447381444 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>          progress=0/208633237 - 0%
>> Pool Name                    Active   Pending      Completed
>> Commands                        n/a         0      709910236
>> Responses                       n/a         0      363174385
>> -------------------------------------------------------------------------------------------------------------------
>>
>> Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
>> that are at 0% and haven't moved, AFAICT, for at least an hour:
>> -------------------------------------------------------------------------------------------------------------------
>> Mode: Normal
>> Streaming to: /X.20
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>          progress=0/208633237 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>>          progress=0/367116560 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>          progress=0/4497953871 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>>          progress=0/444440243 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>>          progress=0/176842211 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>          progress=0/14275850800 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>>          progress=0/445241132 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>>          progress=0/31878407176 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>>          progress=0/447272883 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>>          progress=0/446800028 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>>          progress=0/449662908 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>>          progress=0/447381444 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>>          progress=0/454101872 - 0%
>> Streaming to: /X.22
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
>>          progress=0/887780227 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
>>          progress=0/885896138 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
>>          progress=0/892560053 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
>>          progress=0/888436251 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
>>          progress=0/893446845 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
>>          progress=0/889058842 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
>>          progress=0/61505031301 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
>>          progress=0/887620464 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
>>          progress=0/890820399 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
>>          progress=0/887535512 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
>>          progress=0/16876107370 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
>>          progress=0/893839028 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
>>          progress=0/891862436 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
>>          progress=0/886276363 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
>>          progress=0/388239771 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
>>          progress=0/907731463 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
>>          progress=0/893114355 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
>>          progress=0/888682755 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
>>          progress=0/885144435 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
>>          progress=0/890937418 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
>>          progress=0/33902556016 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
>>          progress=0/897426603 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
>>          progress=0/888711957 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>          progress=0/208633237 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
>>          progress=0/893954909 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
>>          progress=0/897265056 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
>>          progress=0/888998653 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
>>          progress=0/901053427 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>          progress=0/4497953871 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
>>          progress=0/891732427 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
>>          progress=0/888267065 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
>>          progress=0/889998928 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
>>          progress=0/888528931 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>          progress=0/14275850800 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
>>          progress=0/893535664 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
>>          progress=0/891375566 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
>>          progress=0/897994571 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
>>          progress=0/897589898 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
>>          progress=0/891079134 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
>>          progress=0/892852094 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
>>          progress=0/885983148 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
>>          progress=0/886424157 - 0%
>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
>>          progress=0/889222127 - 0%
>> Streaming from: /X.20
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>>          progress=0/357220526 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>>          progress=0/371718620 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>>          progress=0/888814566 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>>          progress=0/889095219 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>>          progress=0/307921317 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>>          progress=0/13323957368 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>>          progress=0/28216239303 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>>          progress=0/14172830870 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>>          progress=0/899103394 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>>          progress=0/894019840 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>>          progress=0/185277927 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>>          progress=0/896956312 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>>          progress=0/893683568 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>>          progress=0/182528797 - 0%
>>    Attrs: /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>>          progress=0/22306924 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>>          progress=0/897417147 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>>          progress=0/8939407316 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>>          progress=0/908075169 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>>          progress=0/892116147 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>>          progress=0/898165901 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>          progress=0/219682197079 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>>          progress=0/784362565 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>>          progress=0/893034298 - 0%
>>    Attrs:
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>>          progress=0/901377643 - 0%
>>  Nothing streaming from /10.5.5.22
>> Pool Name                    Active   Pending      Completed
>> Commands                        n/a         0      433633667
>> Responses                       n/a         0      402612386
>>
>
>


Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Nov 15, 2010 at 5:05 PM, Reverend Chip <re...@gmail.com> wrote:
>> You can move the data files do a node with more space, compact there,
>> and move them back, but if you don't care about the data rebuilding is
>> more straightforward.
>
> The nodes are configured identically.  What's the point in just
> compacting, anyway?  Do these files contain duplicate data, then?

The odds are good that they do -- that is the repair bug I mentioned.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Reverend Chip <re...@gmail.com>.
On 11/15/2010 2:01 PM, Jonathan Ellis wrote:
> On Mon, Nov 15, 2010 at 3:05 PM, Reverend Chip <re...@gmail.com> wrote:
>>
>> There are a lot of non-tmps that were not included in the load
>> figure.  Having stopped the server and deleted tmp files, the data are
>> still using way more space than "ring" claimed -- and too much for
>> "cleanup" to work, as well:
>>
>> Filesystem            Size  Used Avail Use% Mounted on
>> /dev/mapper/flashcache
>>                      932G  723G  162G  82% /var/lib/cassandra/data
> I take it you restarted it?  Does it show the correct load now?

I didn't expect that to be helpful.  But I'll try it.

>> Given that the previous situation included incomplete replication, I
>> can't just kill the node and let it repopulate.  So I can either magick
>> up more disk space or reload the whole cluster.  :-(
> You can move the data files do a node with more space, compact there,
> and move them back, but if you don't care about the data rebuilding is
> more straightforward.

The nodes are configured identically.  What's the point in just
compacting, anyway?  Do these files contain duplicate data, then?


Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Nov 15, 2010 at 3:05 PM, Reverend Chip <re...@gmail.com> wrote:
> On 11/15/2010 12:09 PM, Jonathan Ellis wrote:
>> On Mon, Nov 15, 2010 at 1:03 PM, Reverend Chip <re...@gmail.com> wrote:
>>> I find X.21's data disk is full.  "nodetool ring" says that X.21 has a
>>> load of only 326.2 GB, but the 1T partition is full.
>> Load only tracks live data -- is the rest tmp files?
>
> No, there are a lot of non-tmps that were not included in the load
> figure.  Having stopped the server and deleted tmp files, the data are
> still using way more space than "ring" claimed -- and too much for
> "cleanup" to work, as well:
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/mapper/flashcache
>                      932G  723G  162G  82% /var/lib/cassandra/data

I take it you restarted it?  Does it show the correct load now?

> Given that the previous situation included incomplete replication, I
> can't just kill the node and let it repopulate.  So I can either magick
> up more disk space or reload the whole cluster.  :-(

You can move the data files do a node with more space, compact there,
and move them back, but if you don't care about the data rebuilding is
more straightforward.

> Is there anything about the node's data directory that you need to see?
> Or is it reload time?

I don't think so, thanks.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Reverend Chip <re...@gmail.com>.
On 11/15/2010 12:09 PM, Jonathan Ellis wrote:
> On Mon, Nov 15, 2010 at 1:03 PM, Reverend Chip <re...@gmail.com> wrote:
>> I find X.21's data disk is full.  "nodetool ring" says that X.21 has a
>> load of only 326.2 GB, but the 1T partition is full.
> Load only tracks live data -- is the rest tmp files.

No, there are a lot of non-tmps that were not included in the load
figure.  Having stopped the server and deleted tmp files, the data are
still using way more space than "ring" claimed -- and too much for
"cleanup" to work, as well:

Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/flashcache
                      932G  723G  162G  82% /var/lib/cassandra/data

Given that the previous situation included incomplete replication, I
can't just kill the node and let it repopulate.  So I can either magick
up more disk space or reload the whole cluster.  :-(
Is there anything about the node's data directory that you need to see? 
Or is it reload time?

>> Somehow repair decided it needed to triple the data usage.  I would like
>> to understand this, and I invite recovery suggestions.
> We have https://issues.apache.org/jira/browse/CASSANDRA-1674 open for
> repair space usage.

OK thanks.


>> Meanwhile, I find this interesting sequence on X.20.
>> It could indicate compaction interfered with repair.  Note presence of
>> TestAttrs-e-332 in the compaction:
>>
>>  INFO 10:56:00,283 Compacting
>> [org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-351-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-401-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-414-Data.db')]
>> ...
>>  INFO 01:26:56,175 Deleted /var/lib/cassandra/data/Attrs/TestAttrs-e-332-<>
>> ...
>>  INFO 22:42:10,032 Need to re-stream file
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
>> ERROR 22:42:16,822 Error in ThreadPoolExecutor
>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>> It looks like compaction and repair interfere with each other, and
>> compactions and repairs should stay out of each others' way.
> It looks to me like the sequence of events is:
>  1. streaming had a FD for -332
>  2. compaction deleted the file
>  3. X.21 ran out of space
>  4. the stream errored out
>  5. retrying the stream fails repeatedly

This is plausible, assuming streaming works off an fd and the
disappearance of the file would not disturb it (e.g. it doesn't call
stat() on the filename).

> Notably if one thread has a file open on linux and another
> thread/process deletes it, nothing bad happens and the file is
> unlinked when closed.  (However attempting to do this causes errors on
> Windows, so we try to avoid that.)

Indeed, though to my mind the more serious problem with the deletion is
the inability to restart the stream if the node is rebooted.
 
>> It also looks like streaming doesn't recover gracefully from file deletion.
> If 5. comes before 3. then yes, otherwise I think erroring from out of
> disk space is the most likely culprit.

Yes, it appears that streaming non-recovery was from disk full on the
target, not file removal.


Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Nov 15, 2010 at 1:03 PM, Reverend Chip <re...@gmail.com> wrote:
> I find X.21's data disk is full.  "nodetool ring" says that X.21 has a
> load of only 326.2 GB, but the 1T partition is full.

Load only tracks live data -- is the rest tmp files.

> Somehow repair decided it needed to triple the data usage.  I would like
> to understand this, and I invite recovery suggestions.

We have https://issues.apache.org/jira/browse/CASSANDRA-1674 open for
repair space usage.

> Meanwhile, I find this interesting sequence on X.20.
> It could indicate compaction interfered with repair.  Note presence of
> TestAttrs-e-332 in the compaction:
>
>  INFO 10:56:00,283 Compacting
> [org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-351-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-401-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-414-Data.db')]
> ...
>  INFO 01:26:56,175 Deleted /var/lib/cassandra/data/Attrs/TestAttrs-e-332-<>
> ...
>  INFO 22:42:10,032 Need to re-stream file
> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
> ERROR 22:42:16,822 Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.IOException: Broken pipe

> It looks like compaction and repair interfere with each other, and
> compactions and repairs should stay out of each others' way.

It looks to me like the sequence of events is:
 1. streaming had a FD for -332
 2. compaction deleted the file
 3. X.21 ran out of space
 4. the stream errored out
 5. retrying the stream fails repeatedly

Notably if one thread has a file open on linux and another
thread/process deletes it, nothing bad happens and the file is
unlinked when closed.  (However attempting to do this causes errors on
Windows, so we try to avoid that.)

IOW I think the compaction/repair race is a bug, but not a smoking gun
in this case unless I am wrong and 4. comes before 3.

> It also looks like streaming doesn't recover gracefully from file deletion.

If 5. comes before 3. then yes, otherwise I think erroring from out of
disk space is the most likely culprit.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Reverend Chip <re...@gmail.com>.
On 11/15/2010 10:30 AM, Jonathan Ellis wrote:
> Is X.20 spewing these errors constantly now?

Yes.

> Did X.21 log anything when/before the errors started on X.20?

I find X.21's data disk is full.  "nodetool ring" says that X.21 has a
load of only 326.2 GB, but the 1T partition is full.
When I tracked down the incident start on X.21, this is what I found:

 INFO 22:42:10,031 Streaming of file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
         progress=76405497856/219682197079 - 34% from
org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed:
requesting a retry.
[...]
 INFO 23:47:09,917 Streaming of file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
         progress=59782791168/219682197079 - 27% from
org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed:
requesting a retry.
ERROR 23:47:09,972 Fatal exception in thread
Thread[CompactionExecutor:1,1,main]
java.io.IOError: java.io.IOException: No space left on device
[...]

Somehow repair decided it needed to triple the data usage.  I would like
to understand this, and I invite recovery suggestions.


Meanwhile, I find this interesting sequence on X.20.
It could indicate compaction interfered with repair.  Note presence of
TestAttrs-e-332 in the compaction:

 INFO 10:56:00,283 Compacting
[org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-351-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-401-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='/var/lib/cassandra/data/Attrs/TestAttrs-e-414-Data.db')]
...
 INFO 01:26:56,175 Deleted /var/lib/cassandra/data/Attrs/TestAttrs-e-332-<>
...
 INFO 22:42:10,032 Need to re-stream file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
ERROR 22:42:16,822 Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
        at
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
        at
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more

It looks like compaction and repair interfere with each other, and
compactions and repairs should stay out of each others' way.
It also looks like streaming doesn't recover gracefully from file deletion.
Meanwhile it seems we should disable compaction manually when repair is
contemplated?


> It looks to me like you had an out-of-Cassandra's-control network
> problem (or maybe X.21 restarted) and the retry code isn't working.
>
> On Sun, Nov 14, 2010 at 7:26 PM, Chip Salzenberg <re...@gmail.com> wrote:
>> Excerpts from the log of X.20...  Hope you can make something of it.
>> -------------------------------------------------------------------------------------------------------
>>  INFO [MiscStage:1] 2010-11-14 17:19:33,105 StreamReplyVerbHandler.java
>> (line 58) Need to re-stream file
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
>> ERROR [Streaming:284690] 2010-11-14 17:19:33,105
>> DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:636)
>> Caused by: java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>         at
>> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         ... 3 more
>> ERROR [Streaming:284690] 2010-11-14 17:19:33,106
>> AbstractCassandraDaemon.java (line 90) Fatal exception in thread
>> Thread[Streaming:284690,1,main]
>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:636)
>> Caused by: java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>         at
>> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         ... 3 more
>> ERROR [Streaming:284691] 2010-11-14 17:19:33,107
>> DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>         at java.lang.Thread.run(Thread.java:636)
>> Caused by: java.io.IOException: Broken pipe
>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>>         at
>> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>>         at
>> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>>         at
>> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         ... 3 more
>>  INFO [MiscStage:1] 2010-11-14 17:19:33,171 StreamReplyVerbHandler.java
>> (line 58) Need to re-stream file
>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
>> ...
>> -------------------------------------------------------------------------------------------------------
>> On Sun, Nov 14, 2010 at 2:17 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>> What exception is causing it to fail/retry?
>>>
>>> On Sun, Nov 14, 2010 at 3:49 PM, Chip Salzenberg <re...@gmail.com>
>>> wrote:
>>>> My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
>>>> replication MUTATEs during load, so I decided to fix replication copies
>>>> with
>>>> a "nodetool repair" on one of the nodes (X.21).  The repair has been
>>>> running
>>>> for two days, and has finally gotten itself wedged into a state where it
>>>> can't proceed.
>>>> The log on X.21 continually describe the need to stream a data file,
>>>> unsuccessfully.  From other clues below I gather this is a receive
>>>> stream.
>>>> This message repeated many many times, multiple per second, but has now
>>>> stopped:
>>>> INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line
>>>> 124)
>>>> Streaming of file
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>>>          progress=90112/219682197079 - 0% from
>>>> org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed:
>>>> requesting a
>>>> retry.
>>>> Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
>>>> given stream has been attempted a few times and is still being
>>>> attempted,
>>>> but in a broken way, such that the progress percentage has gone way past
>>>> 100%.  It's now at 1096% and still rising.
>>>> I'm not rebooting so I can poke around as devs suggest.  I'm also not
>>>> sending logs to the list, at least in part because they're, well, big.
>>>>  If
>>>> any developers want them, though, I'm happy to send them.
>>>>
>>>> -------------------------------------------------------------------------------------------------------------------
>>>> Mode: Normal
>>>> Streaming to: /X.21
>>>>
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>>>          progress=2408587638800/219682197079 - 1096%
>>>>         <---- see this
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>>>>          progress=0/182528797 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>>>>          progress=0/908075169 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>>>>          progress=0/784362565 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>>>>          progress=0/896956312 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>>>>          progress=0/894019840 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>>>>          progress=0/901377643 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>>>>          progress=0/22306924 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>>>>          progress=0/888814566 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>>>>          progress=0/889095219 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>>>>          progress=0/893034298 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>>>>          progress=0/371718620 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>>>>          progress=0/14172830870 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>>>>          progress=0/8939407316 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>>>>          progress=0/897417147 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>>>>          progress=0/357220526 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>>>>          progress=0/899103394 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>>>>          progress=0/898165901 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>>>>          progress=0/13323957368 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>>>>          progress=0/892116147 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>>>>          progress=0/28216239303 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>>>>          progress=0/307921317 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>>>>          progress=0/185277927 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>>>>          progress=0/893683568 - 0%
>>>> Streaming from: /X.21
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>>>>          progress=0/176842211 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>>>>          progress=0/447272883 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>>>>          progress=0/444440243 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>>>          progress=0/14275850800 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>>>>          progress=0/31878407176 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>>>>          progress=0/446800028 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>>>>          progress=0/367116560 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>>>>          progress=0/445241132 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>>>          progress=0/4497953871 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>>>>          progress=0/449662908 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>>>>          progress=0/454101872 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>>>>          progress=0/447381444 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>>>          progress=0/208633237 - 0%
>>>> Pool Name                    Active   Pending      Completed
>>>> Commands                        n/a         0      709910236
>>>> Responses                       n/a         0      363174385
>>>>
>>>> -------------------------------------------------------------------------------------------------------------------
>>>>
>>>> Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
>>>> that are at 0% and haven't moved, AFAICT, for at least an hour:
>>>>
>>>> -------------------------------------------------------------------------------------------------------------------
>>>> Mode: Normal
>>>> Streaming to: /X.20
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>>>          progress=0/208633237 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>>>>          progress=0/367116560 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>>>          progress=0/4497953871 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>>>>          progress=0/444440243 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>>>>          progress=0/176842211 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>>>          progress=0/14275850800 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>>>>          progress=0/445241132 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>>>>          progress=0/31878407176 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>>>>          progress=0/447272883 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>>>>          progress=0/446800028 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>>>>          progress=0/449662908 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>>>>          progress=0/447381444 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>>>>          progress=0/454101872 - 0%
>>>> Streaming to: /X.22
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
>>>>          progress=0/887780227 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
>>>>          progress=0/885896138 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
>>>>          progress=0/892560053 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
>>>>          progress=0/888436251 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
>>>>          progress=0/893446845 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
>>>>          progress=0/889058842 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
>>>>          progress=0/61505031301 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
>>>>          progress=0/887620464 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
>>>>          progress=0/890820399 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
>>>>          progress=0/887535512 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
>>>>          progress=0/16876107370 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
>>>>          progress=0/893839028 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
>>>>          progress=0/891862436 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
>>>>          progress=0/886276363 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
>>>>          progress=0/388239771 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
>>>>          progress=0/907731463 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
>>>>          progress=0/893114355 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
>>>>          progress=0/888682755 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
>>>>          progress=0/885144435 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
>>>>          progress=0/890937418 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
>>>>          progress=0/33902556016 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
>>>>          progress=0/897426603 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
>>>>          progress=0/888711957 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>>>>          progress=0/208633237 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
>>>>          progress=0/893954909 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
>>>>          progress=0/897265056 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
>>>>          progress=0/888998653 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
>>>>          progress=0/901053427 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>>>>          progress=0/4497953871 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
>>>>          progress=0/891732427 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
>>>>          progress=0/888267065 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
>>>>          progress=0/889998928 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
>>>>          progress=0/888528931 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>>>>          progress=0/14275850800 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
>>>>          progress=0/893535664 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
>>>>          progress=0/891375566 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
>>>>          progress=0/897994571 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
>>>>          progress=0/897589898 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
>>>>          progress=0/891079134 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
>>>>          progress=0/892852094 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
>>>>          progress=0/885983148 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
>>>>          progress=0/886424157 - 0%
>>>>    /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
>>>>          progress=0/889222127 - 0%
>>>> Streaming from: /X.20
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>>>>          progress=0/357220526 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>>>>          progress=0/371718620 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>>>>          progress=0/888814566 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>>>>          progress=0/889095219 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>>>>          progress=0/307921317 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>>>>          progress=0/13323957368 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>>>>          progress=0/28216239303 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>>>>          progress=0/14172830870 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>>>>          progress=0/899103394 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>>>>          progress=0/894019840 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>>>>          progress=0/185277927 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>>>>          progress=0/896956312 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>>>>          progress=0/893683568 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>>>>          progress=0/182528797 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>>>>          progress=0/22306924 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>>>>          progress=0/897417147 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>>>>          progress=0/8939407316 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>>>>          progress=0/908075169 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>>>>          progress=0/892116147 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>>>>          progress=0/898165901 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>>>>          progress=0/219682197079 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>>>>          progress=0/784362565 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>>>>          progress=0/893034298 - 0%
>>>>    Attrs:
>>>> /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>>>>          progress=0/901377643 - 0%
>>>>  Nothing streaming from /X.22
>>>> Pool Name                    Active   Pending      Completed
>>>> Commands                        n/a         0      433633667
>>>> Responses                       n/a         0      402612386
>>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>
>
>


Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Jonathan Ellis <jb...@gmail.com>.
Is X.20 spewing these errors constantly now?

Did X.21 log anything when/before the errors started on X.20?

It looks to me like you had an out-of-Cassandra's-control network
problem (or maybe X.21 restarted) and the retry code isn't working.

On Sun, Nov 14, 2010 at 7:26 PM, Chip Salzenberg <re...@gmail.com> wrote:
> Excerpts from the log of X.20...  Hope you can make something of it.
> -------------------------------------------------------------------------------------------------------
>  INFO [MiscStage:1] 2010-11-14 17:19:33,105 StreamReplyVerbHandler.java
> (line 58) Need to re-stream file
> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
> ERROR [Streaming:284690] 2010-11-14 17:19:33,105
> DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>         at
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>         at
> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>         at
> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> ERROR [Streaming:284690] 2010-11-14 17:19:33,106
> AbstractCassandraDaemon.java (line 90) Fatal exception in thread
> Thread[Streaming:284690,1,main]
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>         at
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>         at
> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>         at
> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
> ERROR [Streaming:284691] 2010-11-14 17:19:33,107
> DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.IOException: Broken pipe
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>         at
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
>         at
> org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
>         at
> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
>         at
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         ... 3 more
>  INFO [MiscStage:1] 2010-11-14 17:19:33,171 StreamReplyVerbHandler.java
> (line 58) Need to re-stream file
> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
> ...
> -------------------------------------------------------------------------------------------------------
> On Sun, Nov 14, 2010 at 2:17 PM, Jonathan Ellis <jb...@gmail.com> wrote:
>>
>> What exception is causing it to fail/retry?
>>
>> On Sun, Nov 14, 2010 at 3:49 PM, Chip Salzenberg <re...@gmail.com>
>> wrote:
>> > My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
>> > replication MUTATEs during load, so I decided to fix replication copies
>> > with
>> > a "nodetool repair" on one of the nodes (X.21).  The repair has been
>> > running
>> > for two days, and has finally gotten itself wedged into a state where it
>> > can't proceed.
>> > The log on X.21 continually describe the need to stream a data file,
>> > unsuccessfully.  From other clues below I gather this is a receive
>> > stream.
>> > This message repeated many many times, multiple per second, but has now
>> > stopped:
>> > INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line
>> > 124)
>> > Streaming of file
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>> >          progress=90112/219682197079 - 0% from
>> > org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed:
>> > requesting a
>> > retry.
>> > Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
>> > given stream has been attempted a few times and is still being
>> > attempted,
>> > but in a broken way, such that the progress percentage has gone way past
>> > 100%.  It's now at 1096% and still rising.
>> > I'm not rebooting so I can poke around as devs suggest.  I'm also not
>> > sending logs to the list, at least in part because they're, well, big.
>> >  If
>> > any developers want them, though, I'm happy to send them.
>> >
>> > -------------------------------------------------------------------------------------------------------------------
>> > Mode: Normal
>> > Streaming to: /X.21
>> >
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>> >          progress=2408587638800/219682197079 - 1096%
>> >         <---- see this
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>> >          progress=0/182528797 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>> >          progress=0/908075169 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>> >          progress=0/784362565 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>> >          progress=0/896956312 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>> >          progress=0/894019840 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>> >          progress=0/901377643 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>> >          progress=0/22306924 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>> >          progress=0/888814566 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>> >          progress=0/889095219 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>> >          progress=0/893034298 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>> >          progress=0/371718620 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>> >          progress=0/14172830870 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>> >          progress=0/8939407316 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>> >          progress=0/897417147 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>> >          progress=0/357220526 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>> >          progress=0/899103394 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>> >          progress=0/898165901 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>> >          progress=0/13323957368 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>> >          progress=0/892116147 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>> >          progress=0/28216239303 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>> >          progress=0/307921317 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>> >          progress=0/185277927 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>> >          progress=0/893683568 - 0%
>> > Streaming from: /X.21
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>> >          progress=0/176842211 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>> >          progress=0/447272883 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>> >          progress=0/444440243 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>> >          progress=0/14275850800 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>> >          progress=0/31878407176 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>> >          progress=0/446800028 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>> >          progress=0/367116560 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>> >          progress=0/445241132 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>> >          progress=0/4497953871 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>> >          progress=0/449662908 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>> >          progress=0/454101872 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>> >          progress=0/447381444 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>> >          progress=0/208633237 - 0%
>> > Pool Name                    Active   Pending      Completed
>> > Commands                        n/a         0      709910236
>> > Responses                       n/a         0      363174385
>> >
>> > -------------------------------------------------------------------------------------------------------------------
>> >
>> > Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
>> > that are at 0% and haven't moved, AFAICT, for at least an hour:
>> >
>> > -------------------------------------------------------------------------------------------------------------------
>> > Mode: Normal
>> > Streaming to: /X.20
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>> >          progress=0/208633237 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>> >          progress=0/367116560 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>> >          progress=0/4497953871 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>> >          progress=0/444440243 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>> >          progress=0/176842211 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>> >          progress=0/14275850800 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>> >          progress=0/445241132 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>> >          progress=0/31878407176 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>> >          progress=0/447272883 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>> >          progress=0/446800028 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>> >          progress=0/449662908 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>> >          progress=0/447381444 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>> >          progress=0/454101872 - 0%
>> > Streaming to: /X.22
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
>> >          progress=0/887780227 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
>> >          progress=0/885896138 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
>> >          progress=0/892560053 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
>> >          progress=0/888436251 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
>> >          progress=0/893446845 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
>> >          progress=0/889058842 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
>> >          progress=0/61505031301 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
>> >          progress=0/887620464 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
>> >          progress=0/890820399 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
>> >          progress=0/887535512 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
>> >          progress=0/16876107370 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
>> >          progress=0/893839028 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
>> >          progress=0/891862436 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
>> >          progress=0/886276363 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
>> >          progress=0/388239771 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
>> >          progress=0/907731463 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
>> >          progress=0/893114355 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
>> >          progress=0/888682755 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
>> >          progress=0/885144435 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
>> >          progress=0/890937418 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
>> >          progress=0/33902556016 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
>> >          progress=0/897426603 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
>> >          progress=0/888711957 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>> >          progress=0/208633237 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
>> >          progress=0/893954909 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
>> >          progress=0/897265056 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
>> >          progress=0/888998653 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
>> >          progress=0/901053427 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>> >          progress=0/4497953871 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
>> >          progress=0/891732427 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
>> >          progress=0/888267065 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
>> >          progress=0/889998928 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
>> >          progress=0/888528931 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>> >          progress=0/14275850800 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
>> >          progress=0/893535664 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
>> >          progress=0/891375566 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
>> >          progress=0/897994571 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
>> >          progress=0/897589898 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
>> >          progress=0/891079134 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
>> >          progress=0/892852094 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
>> >          progress=0/885983148 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
>> >          progress=0/886424157 - 0%
>> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
>> >          progress=0/889222127 - 0%
>> > Streaming from: /X.20
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>> >          progress=0/357220526 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>> >          progress=0/371718620 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>> >          progress=0/888814566 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>> >          progress=0/889095219 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>> >          progress=0/307921317 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>> >          progress=0/13323957368 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>> >          progress=0/28216239303 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>> >          progress=0/14172830870 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>> >          progress=0/899103394 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>> >          progress=0/894019840 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>> >          progress=0/185277927 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>> >          progress=0/896956312 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>> >          progress=0/893683568 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>> >          progress=0/182528797 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>> >          progress=0/22306924 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>> >          progress=0/897417147 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>> >          progress=0/8939407316 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>> >          progress=0/908075169 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>> >          progress=0/892116147 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>> >          progress=0/898165901 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>> >          progress=0/219682197079 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>> >          progress=0/784362565 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>> >          progress=0/893034298 - 0%
>> >    Attrs:
>> > /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>> >          progress=0/901377643 - 0%
>> >  Nothing streaming from /10.5.5.22
>> > Pool Name                    Active   Pending      Completed
>> > Commands                        n/a         0      433633667
>> > Responses                       n/a         0      402612386
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Chip Salzenberg <re...@gmail.com>.
Excerpts from the log of X.20...  Hope you can make something of it.

-------------------------------------------------------------------------------------------------------
 INFO [MiscStage:1] 2010-11-14 17:19:33,105 StreamReplyVerbHandler.java
(line 58) Need to re-stream file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
ERROR [Streaming:284690] 2010-11-14 17:19:33,105
DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
        at
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
        at
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
ERROR [Streaming:284690] 2010-11-14 17:19:33,106
AbstractCassandraDaemon.java (line 90) Fatal exception in thread
Thread[Streaming:284690,1,main]
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
        at
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
        at
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
ERROR [Streaming:284691] 2010-11-14 17:19:33,107
DebuggableThreadPoolExecutor.java (line 103) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:456)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:557)
        at
org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:102)
        at
org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:64)
        at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
 INFO [MiscStage:1] 2010-11-14 17:19:33,171 StreamReplyVerbHandler.java
(line 58) Need to re-stream file
/var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db to /X.21
...
-------------------------------------------------------------------------------------------------------

On Sun, Nov 14, 2010 at 2:17 PM, Jonathan Ellis <jb...@gmail.com> wrote:

> What exception is causing it to fail/retry?
>
> On Sun, Nov 14, 2010 at 3:49 PM, Chip Salzenberg <re...@gmail.com>
> wrote:
> > My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
> > replication MUTATEs during load, so I decided to fix replication copies
> with
> > a "nodetool repair" on one of the nodes (X.21).  The repair has been
> running
> > for two days, and has finally gotten itself wedged into a state where it
> > can't proceed.
> > The log on X.21 continually describe the need to stream a data file,
> > unsuccessfully.  From other clues below I gather this is a receive
> stream.
> > This message repeated many many times, multiple per second, but has now
> > stopped:
> > INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line
> 124)
> > Streaming of file
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
> >          progress=90112/219682197079 - 0% from
> > org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed:
> requesting a
> > retry.
> > Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
> > given stream has been attempted a few times and is still being attempted,
> > but in a broken way, such that the progress percentage has gone way past
> > 100%.  It's now at 1096% and still rising.
> > I'm not rebooting so I can poke around as devs suggest.  I'm also not
> > sending logs to the list, at least in part because they're, well, big.
>  If
> > any developers want them, though, I'm happy to send them.
> >
> -------------------------------------------------------------------------------------------------------------------
> > Mode: Normal
> > Streaming to: /X.21
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
> >          progress=2408587638800/219682197079 - 1096%
> >         <---- see this
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
> >          progress=0/182528797 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
> >          progress=0/908075169 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
> >          progress=0/784362565 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
> >          progress=0/896956312 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
> >          progress=0/894019840 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
> >          progress=0/901377643 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
> >          progress=0/22306924 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
> >          progress=0/888814566 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
> >          progress=0/889095219 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
> >          progress=0/893034298 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
> >          progress=0/371718620 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
> >          progress=0/14172830870 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
> >          progress=0/8939407316 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
> >          progress=0/897417147 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
> >          progress=0/357220526 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
> >          progress=0/899103394 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
> >          progress=0/898165901 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
> >          progress=0/13323957368 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
> >          progress=0/892116147 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
> >          progress=0/28216239303 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
> >          progress=0/307921317 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
> >          progress=0/185277927 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
> >          progress=0/893683568 - 0%
> > Streaming from: /X.21
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
> >          progress=0/176842211 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
> >          progress=0/447272883 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
> >          progress=0/444440243 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
> >          progress=0/14275850800 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
> >          progress=0/31878407176 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
> >          progress=0/446800028 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
> >          progress=0/367116560 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
> >          progress=0/445241132 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
> >          progress=0/4497953871 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
> >          progress=0/449662908 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
> >          progress=0/454101872 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
> >          progress=0/447381444 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
> >          progress=0/208633237 - 0%
> > Pool Name                    Active   Pending      Completed
> > Commands                        n/a         0      709910236
> > Responses                       n/a         0      363174385
> >
> -------------------------------------------------------------------------------------------------------------------
> >
> > Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
> > that are at 0% and haven't moved, AFAICT, for at least an hour:
> >
> -------------------------------------------------------------------------------------------------------------------
> > Mode: Normal
> > Streaming to: /X.20
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
> >          progress=0/208633237 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
> >          progress=0/367116560 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
> >          progress=0/4497953871 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
> >          progress=0/444440243 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
> >          progress=0/176842211 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
> >          progress=0/14275850800 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
> >          progress=0/445241132 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
> >          progress=0/31878407176 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
> >          progress=0/447272883 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
> >          progress=0/446800028 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
> >          progress=0/449662908 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
> >          progress=0/447381444 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
> >          progress=0/454101872 - 0%
> > Streaming to: /X.22
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
> >          progress=0/887780227 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
> >          progress=0/885896138 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
> >          progress=0/892560053 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
> >          progress=0/888436251 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
> >          progress=0/893446845 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
> >          progress=0/889058842 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
> >          progress=0/61505031301 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
> >          progress=0/887620464 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
> >          progress=0/890820399 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
> >          progress=0/887535512 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
> >          progress=0/16876107370 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
> >          progress=0/893839028 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
> >          progress=0/891862436 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
> >          progress=0/886276363 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
> >          progress=0/388239771 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
> >          progress=0/907731463 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
> >          progress=0/893114355 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
> >          progress=0/888682755 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
> >          progress=0/885144435 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
> >          progress=0/890937418 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
> >          progress=0/33902556016 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
> >          progress=0/897426603 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
> >          progress=0/888711957 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
> >          progress=0/208633237 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
> >          progress=0/893954909 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
> >          progress=0/897265056 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
> >          progress=0/888998653 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
> >          progress=0/901053427 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
> >          progress=0/4497953871 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
> >          progress=0/891732427 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
> >          progress=0/888267065 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
> >          progress=0/889998928 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
> >          progress=0/888528931 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
> >          progress=0/14275850800 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
> >          progress=0/893535664 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
> >          progress=0/891375566 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
> >          progress=0/897994571 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
> >          progress=0/897589898 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
> >          progress=0/891079134 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
> >          progress=0/892852094 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
> >          progress=0/885983148 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
> >          progress=0/886424157 - 0%
> >    /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
> >          progress=0/889222127 - 0%
> > Streaming from: /X.20
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
> >          progress=0/357220526 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
> >          progress=0/371718620 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
> >          progress=0/888814566 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
> >          progress=0/889095219 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
> >          progress=0/307921317 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
> >          progress=0/13323957368 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
> >          progress=0/28216239303 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
> >          progress=0/14172830870 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
> >          progress=0/899103394 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
> >          progress=0/894019840 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
> >          progress=0/185277927 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
> >          progress=0/896956312 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
> >          progress=0/893683568 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
> >          progress=0/182528797 - 0%
> >    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
> >          progress=0/22306924 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
> >          progress=0/897417147 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
> >          progress=0/8939407316 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
> >          progress=0/908075169 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
> >          progress=0/892116147 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
> >          progress=0/898165901 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
> >          progress=0/219682197079 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
> >          progress=0/784362565 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
> >          progress=0/893034298 - 0%
> >    Attrs:
> > /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
> >          progress=0/901377643 - 0%
> >  Nothing streaming from /10.5.5.22
> > Pool Name                    Active   Pending      Completed
> > Commands                        n/a         0      433633667
> > Responses                       n/a         0      402612386
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Re: repair takes two days, and ends up stuck: stream at 1096% (yes, really)

Posted by Jonathan Ellis <jb...@gmail.com>.
What exception is causing it to fail/retry?

On Sun, Nov 14, 2010 at 3:49 PM, Chip Salzenberg <re...@gmail.com> wrote:
> My by-now infamous eight-node cluster running 0.7.0beta3+ dropped many
> replication MUTATEs during load, so I decided to fix replication copies with
> a "nodetool repair" on one of the nodes (X.21).  The repair has been running
> for two days, and has finally gotten itself wedged into a state where it
> can't proceed.
> The log on X.21 continually describe the need to stream a data file,
> unsuccessfully.  From other clues below I gather this is a receive stream.
> This message repeated many many times, multiple per second, but has now
> stopped:
> INFO [Thread-13877] 2010-11-14 09:17:35,207 StreamInSession.java (line 124)
> Streaming of file
> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>          progress=90112/219682197079 - 0% from
> org.apache.cassandra.streaming.StreamInSession@3ee3da2c failed: requesting a
> retry.
> Here's the best joke, though: "nodetool -h X.20 nestats" shows that the
> given stream has been attempted a few times and is still being attempted,
> but in a broken way, such that the progress percentage has gone way past
> 100%.  It's now at 1096% and still rising.
> I'm not rebooting so I can poke around as devs suggest.  I'm also not
> sending logs to the list, at least in part because they're, well, big.  If
> any developers want them, though, I'm happy to send them.
> -------------------------------------------------------------------------------------------------------------------
> Mode: Normal
> Streaming to: /X.21
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>          progress=2408587638800/219682197079 - 1096%
>         <---- see this
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>          progress=0/182528797 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>          progress=0/908075169 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>          progress=0/784362565 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>          progress=0/896956312 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>          progress=0/894019840 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>          progress=0/901377643 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>          progress=0/22306924 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>          progress=0/888814566 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>          progress=0/889095219 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>          progress=0/893034298 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>          progress=0/371718620 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>          progress=0/14172830870 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>          progress=0/8939407316 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>          progress=0/897417147 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>          progress=0/357220526 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>          progress=0/899103394 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>          progress=0/898165901 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>          progress=0/13323957368 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>          progress=0/892116147 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>          progress=0/28216239303 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>          progress=0/307921317 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>          progress=0/185277927 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>          progress=0/893683568 - 0%
> Streaming from: /X.21
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>          progress=0/176842211 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>          progress=0/447272883 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>          progress=0/444440243 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>          progress=0/14275850800 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>          progress=0/31878407176 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>          progress=0/446800028 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>          progress=0/367116560 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>          progress=0/445241132 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>          progress=0/4497953871 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>          progress=0/449662908 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>          progress=0/454101872 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>          progress=0/447381444 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>          progress=0/208633237 - 0%
> Pool Name                    Active   Pending      Completed
> Commands                        n/a         0      709910236
> Responses                       n/a         0      363174385
> -------------------------------------------------------------------------------------------------------------------
>
> Meanwhile, "nodetool -h X.21 netstats" shows a large number of transfers
> that are at 0% and haven't moved, AFAICT, for at least an hour:
> -------------------------------------------------------------------------------------------------------------------
> Mode: Normal
> Streaming to: /X.20
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>          progress=0/208633237 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-439-Data.db/(0,367116560)
>          progress=0/367116560 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>          progress=0/4497953871 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-412-Data.db/(0,444440243)
>          progress=0/444440243 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-440-Data.db/(0,176842211)
>          progress=0/176842211 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>          progress=0/14275850800 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,445241132)
>          progress=0/445241132 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-397-Data.db/(0,31878407176)
>          progress=0/31878407176 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,447272883)
>          progress=0/447272883 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-393-Data.db/(0,446800028)
>          progress=0/446800028 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-396-Data.db/(0,449662908)
>          progress=0/449662908 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,447381444)
>          progress=0/447381444 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-409-Data.db/(0,454101872)
>          progress=0/454101872 - 0%
> Streaming to: /X.22
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-350-Data.db/(0,887780227)
>          progress=0/887780227 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-341-Data.db/(0,885896138)
>          progress=0/885896138 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-368-Data.db/(0,892560053)
>          progress=0/892560053 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-358-Data.db/(0,888436251)
>          progress=0/888436251 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,893446845)
>          progress=0/893446845 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-354-Data.db/(0,889058842)
>          progress=0/889058842 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-90-Data.db/(0,61505031301)
>          progress=0/61505031301 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,887620464)
>          progress=0/887620464 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-361-Data.db/(0,890820399)
>          progress=0/890820399 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-345-Data.db/(0,887535512)
>          progress=0/887535512 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-329-Data.db/(0,16876107370)
>          progress=0/16876107370 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-364-Data.db/(0,893839028)
>          progress=0/893839028 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-356-Data.db/(0,891862436)
>          progress=0/891862436 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,886276363)
>          progress=0/886276363 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-292-Data.db/(0,388239771)
>          progress=0/388239771 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-379-Data.db/(0,907731463)
>          progress=0/907731463 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-348-Data.db/(0,893114355)
>          progress=0/893114355 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-371-Data.db/(0,888682755)
>          progress=0/888682755 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-338-Data.db/(0,885144435)
>          progress=0/885144435 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-340-Data.db/(0,890937418)
>          progress=0/890937418 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-233-Data.db/(0,33902556016)
>          progress=0/33902556016 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897426603)
>          progress=0/897426603 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,888711957)
>          progress=0/888711957 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-327-Data.db/(0,208633237)
>          progress=0/208633237 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,893954909)
>          progress=0/893954909 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,897265056)
>          progress=0/897265056 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-339-Data.db/(0,888998653)
>          progress=0/888998653 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,901053427)
>          progress=0/901053427 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-271-Data.db/(0,4497953871)
>          progress=0/4497953871 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-343-Data.db/(0,891732427)
>          progress=0/891732427 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-357-Data.db/(0,888267065)
>          progress=0/888267065 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-335-Data.db/(0,889998928)
>          progress=0/889998928 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-365-Data.db/(0,888528931)
>          progress=0/888528931 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-328-Data.db/(0,14275850800)
>          progress=0/14275850800 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-355-Data.db/(0,893535664)
>          progress=0/893535664 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-347-Data.db/(0,891375566)
>          progress=0/891375566 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,897994571)
>          progress=0/897994571 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,897589898)
>          progress=0/897589898 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-336-Data.db/(0,891079134)
>          progress=0/891079134 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,892852094)
>          progress=0/892852094 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-337-Data.db/(0,885983148)
>          progress=0/885983148 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-346-Data.db/(0,886424157)
>          progress=0/886424157 - 0%
>    /var/lib/cassandra/data/Attrs/TestAttrs-e-353-Data.db/(0,889222127)
>          progress=0/889222127 - 0%
> Streaming from: /X.20
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-385-Data.db/(0,357220526)
>          progress=0/357220526 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-389-Data.db/(0,371718620)
>          progress=0/371718620 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-369-Data.db/(0,888814566)
>          progress=0/888814566 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-374-Data.db/(0,889095219)
>          progress=0/889095219 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-330-Data.db/(0,307921317)
>          progress=0/307921317 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-331-Data.db/(0,13323957368)
>          progress=0/13323957368 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-383-Data.db/(0,28216239303)
>          progress=0/28216239303 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-319-Data.db/(0,14172830870)
>          progress=0/14172830870 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-366-Data.db/(0,899103394)
>          progress=0/899103394 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-367-Data.db/(0,894019840)
>          progress=0/894019840 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-390-Data.db/(0,185277927)
>          progress=0/185277927 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-378-Data.db/(0,896956312)
>          progress=0/896956312 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-372-Data.db/(0,893683568)
>          progress=0/893683568 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-386-Data.db/(0,182528797)
>          progress=0/182528797 - 0%
>    Attrs: /var/lib/cassandra/data/Attrs/TestAttrs-e-333-Data.db/(0,22306924)
>          progress=0/22306924 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-376-Data.db/(0,897417147)
>          progress=0/897417147 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-283-Data.db/(0,8939407316)
>          progress=0/8939407316 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-381-Data.db/(0,908075169)
>          progress=0/908075169 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-373-Data.db/(0,892116147)
>          progress=0/892116147 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-377-Data.db/(0,898165901)
>          progress=0/898165901 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-332-Data.db/(0,219682197079)
>          progress=0/219682197079 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-382-Data.db/(0,784362565)
>          progress=0/784362565 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-375-Data.db/(0,893034298)
>          progress=0/893034298 - 0%
>    Attrs:
> /var/lib/cassandra/data/Attrs/TestAttrs-e-380-Data.db/(0,901377643)
>          progress=0/901377643 - 0%
>  Nothing streaming from /10.5.5.22
> Pool Name                    Active   Pending      Completed
> Commands                        n/a         0      433633667
> Responses                       n/a         0      402612386
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com