Posted to dev@cassandra.apache.org by Todd Burruss <bb...@expedia.com> on 2011/09/23 18:40:38 UTC

Assertion

Fyi … I am seeing the exception (at end of message) using 1.0-beta1.

Notes:

- I was running 0.8.5 before dropping in 1.0-beta1
- upgraded yaml file to be 1.0
- some CFs were created in 0.8.5 and some in 1.0-beta1

A couple of observations after seeing this:

- cannot nicely kill cassandra, must use kill -9
- the CLI can connect and 'get' at ONE or QUORUM, but cannot 'set' at either CL


INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
java.lang.AssertionError
        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)


Re: Assertion

Posted by Todd Burruss <bb...@expedia.com>.
I picked up the 1.0.0-rc1 build and am testing it now, but I bet you are
correct.

On 9/24/11 6:25 PM, "Jonathan Ellis" <jb...@gmail.com> wrote:

>I bet this is https://issues.apache.org/jira/browse/CASSANDRA-3253.
>
>On Fri, Sep 23, 2011 at 6:00 PM, Todd Burruss <bb...@expedia.com>
>wrote:
>> In my last test, I cleaned everything, started from scratch, and I can still
>> repro.  The AssertionError you mention below could be unrelated to my
>> writes not working.
>>
>> This time there are _no_ exceptions or errors reported from any cassandra
>> node.  My app runs for a while before writes start failing.  I can kill my
>> app and it will happen immediately upon restarting it.
>>
>> I didn't dig as hard last time, but now I can see from nodetool tpstats that
>> 3 consecutive machines in the ring have their MutationStage "stuck" with
>> piled up requests.  Completed is not going up and pending is not going
>> down.  I have concurrent_writes = 32
>>
>> Here is a sample from one of the machines:
>>
>> Pool Name        Active   Pending   Completed   Blocked   All time blocked
>> MutationStage        32      1492     1416296         0                  0
>>
>>
>>
>>
>>
>> On 9/23/11 3:11 PM, "Brandon Williams" <dr...@gmail.com> wrote:
>>
>>>On 9/23/11, Todd Burruss <bb...@expedia.com> wrote:
>>>> INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>>>> ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>>>> java.lang.AssertionError
>>>>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>>>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>>>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>         at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>>This sounds like you have old hints 1.0 doesn't understand:
>>>
>>>                assert versionColumn != null;
>>>
>>>-Brandon
>>
>>
>
>
>
>-- 
>Jonathan Ellis
>Project Chair, Apache Cassandra
>co-founder of DataStax, the source for professional Cassandra support
>http://www.datastax.com


Re: Assertion

Posted by Jonathan Ellis <jb...@gmail.com>.
I bet this is https://issues.apache.org/jira/browse/CASSANDRA-3253.

On Fri, Sep 23, 2011 at 6:00 PM, Todd Burruss <bb...@expedia.com> wrote:
> In my last test, I cleaned everything, started from scratch, and I can still
> repro.  The AssertionError you mention below could be unrelated to my
> writes not working.
>
> This time there are _no_ exceptions or errors reported from any cassandra
> node.  My app runs for a while before writes start failing.  I can kill my
> app and it will happen immediately upon restarting it.
>
> I didn't dig as hard last time, but now I can see from nodetool tpstats that
> 3 consecutive machines in the ring have their MutationStage "stuck" with
> piled up requests.  Completed is not going up and pending is not going
> down.  I have concurrent_writes = 32
>
> Here is a sample from one of the machines:
>
> Pool Name        Active   Pending   Completed   Blocked   All time blocked
> MutationStage        32      1492     1416296         0                  0
>
>
>
>
>
> On 9/23/11 3:11 PM, "Brandon Williams" <dr...@gmail.com> wrote:
>
>>On 9/23/11, Todd Burruss <bb...@expedia.com> wrote:
>>> INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>>> ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>>> java.lang.AssertionError
>>>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>         at java.lang.Thread.run(Thread.java:662)
>>
>>
>>This sounds like you have old hints 1.0 doesn't understand:
>>
>>                assert versionColumn != null;
>>
>>-Brandon
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Assertion

Posted by Todd Burruss <bb...@expedia.com>.
Yep, attached

On 9/23/11 4:21 PM, "Brandon Williams" <dr...@gmail.com> wrote:

>On Fri, Sep 23, 2011 at 6:00 PM, Todd Burruss <bb...@expedia.com>
>wrote:
>> Here is a sample from one of the machines:
>>
>> Pool Name        Active   Pending   Completed   Blocked   All time blocked
>> MutationStage        32      1492     1416296         0                  0
>
>Can you get a thread dump to see what they are blocked on?
>
>-Brandon


Re: Assertion

Posted by Brandon Williams <dr...@gmail.com>.
On Fri, Sep 23, 2011 at 6:00 PM, Todd Burruss <bb...@expedia.com> wrote:
> Here is a sample from one of the machines:
>
> Pool Name        Active   Pending   Completed   Blocked   All time blocked
> MutationStage        32      1492     1416296         0                  0

Can you get a thread dump to see what they are blocked on?

-Brandon
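
For anyone following along: a thread dump of a running JVM is usually taken
externally with `jstack <pid>`. The same information can also be collected
from inside a JVM through the standard java.lang.management API. The sketch
below is only an illustration of that API, not something posted in this
thread; the class name ThreadDumpSketch is made up, and it dumps the threads
of whatever JVM it runs in rather than attaching to a Cassandra node.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Cassandra.
final class ThreadDumpSketch {
    /** Returns a text dump of all live threads in the current JVM. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // true, true: also collect locked monitors and ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState());
            // getLockName() is non-null when the thread is blocked or waiting on a lock.
            if (info.getLockName() != null) {
                sb.append(" on ").append(info.getLockName());
            }
            sb.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("        at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

In a dump like this, the lines of interest would be the MutationStage
threads and the lock each one reports after "on".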

Re: Assertion

Posted by Todd Burruss <bb...@expedia.com>.
In my last test, I cleaned everything, started from scratch, and I can still
repro.  The AssertionError you mention below could be unrelated to my
writes not working.

This time there are _no_ exceptions or errors reported from any cassandra
node.  My app runs for a while before writes start failing.  I can kill my
app and it will happen immediately upon restarting it.

I didn't dig as hard last time, but now I can see from nodetool tpstats that
3 consecutive machines in the ring have their MutationStage "stuck" with
piled up requests.  Completed is not going up and pending is not going
down.  I have concurrent_writes = 32

Here is a sample from one of the machines:

Pool Name        Active   Pending   Completed   Blocked   All time blocked
MutationStage        32      1492     1416296         0                  0
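
The "completed is not going up and pending is not going down" check can be
made mechanical by comparing two tpstats samples taken some time apart. The
sketch below is a minimal illustration under assumptions: the class
StageSample and its methods are hypothetical (not part of Cassandra or
nodetool), and it assumes each data row has the column order shown above
(pool name, active, pending, completed).

```java
// Hypothetical helper, not part of Cassandra or nodetool.
final class StageSample {
    final String pool;
    final long active;
    final long pending;
    final long completed;

    StageSample(String pool, long active, long pending, long completed) {
        this.pool = pool;
        this.active = active;
        this.pending = pending;
        this.completed = completed;
    }

    /** Parses one data row in the layout shown in this message. */
    static StageSample parse(String row) {
        String[] f = row.trim().split("\\s+");
        if (f.length < 4) {
            throw new IllegalArgumentException("unexpected row: " + row);
        }
        return new StageSample(f[0], Long.parseLong(f[1]),
                               Long.parseLong(f[2]), Long.parseLong(f[3]));
    }

    /** True when work is queued but nothing completed between the two samples. */
    static boolean looksStalled(StageSample earlier, StageSample later) {
        return later.pending > 0
            && later.completed == earlier.completed
            && later.pending >= earlier.pending;
    }
}
```

Two samples with identical numbers, as described above, would make
looksStalled return true; any increase in the completed count makes it
return false.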





On 9/23/11 3:11 PM, "Brandon Williams" <dr...@gmail.com> wrote:

>On 9/23/11, Todd Burruss <bb...@expedia.com> wrote:
>> INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>> ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>> java.lang.AssertionError
>>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>         at java.lang.Thread.run(Thread.java:662)
>
>
>This sounds like you have old hints 1.0 doesn't understand:
>
>                assert versionColumn != null;
>
>-Brandon


Re: Assertion

Posted by Brandon Williams <dr...@gmail.com>.
On 9/23/11, Todd Burruss <bb...@expedia.com> wrote:
> INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
> ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
> java.lang.AssertionError
>         at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>         at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>         at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)


This sounds like you have old hints 1.0 doesn't understand:

                assert versionColumn != null;

-Brandon
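
As background on why the log shows a bare java.lang.AssertionError with no
message: a Java `assert` without a detail expression throws an
AssertionError whose message is null, and only when assertions are enabled
(-ea). The sketch below illustrates that behavior under loud assumptions:
the class HintVersionCheckSketch, the map standing in for a stored hint, and
the column name "version" are all hypothetical and are not taken from the
Cassandra source. The check is written as an explicit throw so it behaves
the same whether or not -ea is set.

```java
import java.util.Map;

// Hypothetical illustration, not Cassandra code.
final class HintVersionCheckSketch {
    /**
     * Looks up the version column of a hint and fails if it is missing.
     * hintColumns is a made-up stand-in: column name -> value.
     */
    static String requireVersion(Map<String, String> hintColumns) {
        String versionColumn = hintColumns.get("version"); // name is an assumption
        // Mirrors `assert versionColumn != null;` with assertions enabled:
        // no detail message, so the log shows only the exception class.
        if (versionColumn == null) {
            throw new AssertionError();
        }
        return versionColumn;
    }
}
```

A hint written without such a column would fail this check every time
delivery is attempted, which would fit an error that recurs on hinted
handoff.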

Re: Assertion

Posted by Todd Burruss <bb...@expedia.com>.
No

On 9/23/11 3:04 PM, "Jonathan Ellis" <jb...@gmail.com> wrote:

>New errors in the log?
>
>On Fri, Sep 23, 2011 at 4:45 PM, Todd Burruss <bb...@expedia.com>
>wrote:
>> More information ... My cluster is in the state where I can read, but not
>> write, again.  I used CLI to drop and recreate my keyspace and CFs.  I
>> then started inserting my data.  Before inserting all my test data I
>> started getting TimedOutException on the client (Hector), then I realized
>> it was in the state where writes are no longer working.  During all my
>> writes I am reading as well.
>>
>> Nodetool reports all nodes are "up", thrift is running, gossip is active,
>> and all are 1.0.0-beta1
>>
>>
>>
>> On 9/23/11 9:40 AM, "Todd Burruss" <bb...@expedia.com> wrote:
>>
>>>Fyi … I am seeing the exception (at end of message) using 1.0-beta1.
>>>
>>>Notes:
>>>
>>>- I was running 0.8.5 before dropping in 1.0-beta1
>>>- upgraded yaml file to be 1.0
>>>- some CFs were created in 0.8.5 and some in 1.0-beta1
>>>
>>>A couple of observations after seeing this:
>>>
>>>- cannot nicely kill cassandra, must use kill -9
>>>- the CLI can connect and 'get' at ONE or QUORUM, but cannot 'set' at
>>>either CL
>>>
>>>
>>>INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>>>ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>>>java.lang.AssertionError
>>>        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>>        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>>        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>>        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:662)
>>>
>>>
>>
>>
>
>
>
>-- 
>Jonathan Ellis
>Project Chair, Apache Cassandra
>co-founder of DataStax, the source for professional Cassandra support
>http://www.datastax.com


Re: Assertion

Posted by Jonathan Ellis <jb...@gmail.com>.
New errors in the log?

On Fri, Sep 23, 2011 at 4:45 PM, Todd Burruss <bb...@expedia.com> wrote:
> More information ... My cluster is in the state where I can read, but not
> write, again.  I used CLI to drop and recreate my keyspace and CFs.  I
> then started inserting my data.  Before inserting all my test data I
> started getting TimedOutException on the client (Hector), then I realized
> it was in the state where writes are no longer working.  During all my
> writes I am reading as well.
>
> Nodetool reports all nodes are "up", thrift is running, gossip is active,
> and all are 1.0.0-beta1
>
>
>
> On 9/23/11 9:40 AM, "Todd Burruss" <bb...@expedia.com> wrote:
>
>>Fyi … I am seeing the exception (at end of message) using 1.0-beta1.
>>
>>Notes:
>>
>>- I was running 0.8.5 before dropping in 1.0-beta1
>>- upgraded yaml file to be 1.0
>>- some CFs were created in 0.8.5 and some in 1.0-beta1
>>
>>A couple of observations after seeing this:
>>
>>- cannot nicely kill cassandra, must use kill -9
>>- the CLI can connect and 'get' at ONE or QUORUM, but cannot 'set' at
>>either CL
>>
>>
>>INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>>ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>>java.lang.AssertionError
>>        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>>        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>>        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>>        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>        at java.lang.Thread.run(Thread.java:662)
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Assertion

Posted by Todd Burruss <bb...@expedia.com>.
More information ... My cluster is in the state where I can read, but not
write, again.  I used CLI to drop and recreate my keyspace and CFs.  I
then started inserting my data.  Before inserting all my test data I
started getting TimedOutException on the client (Hector), then I realized
it was in the state where writes are no longer working.  During all my
writes I am reading as well.

Nodetool reports all nodes are "up", thrift is running, gossip is active,
and all are 1.0.0-beta1



On 9/23/11 9:40 AM, "Todd Burruss" <bb...@expedia.com> wrote:

>Fyi … I am seeing the exception (at end of message) using 1.0-beta1.
>
>Notes:
>
>- I was running 0.8.5 before dropping in 1.0-beta1
>- upgraded yaml file to be 1.0
>- some CFs were created in 0.8.5 and some in 1.0-beta1
>
>A couple of observations after seeing this:
>
>- cannot nicely kill cassandra, must use kill -9
>- the CLI can connect and 'get' at ONE or QUORUM, but cannot 'set' at
>either CL
>
>
>INFO [HintedHandoff:4] 2011-09-22 22:59:27,626 HintedHandOffManager.java (line 259) Started hinted handoff for token: 150124573641590498586782915043427152112 with IP: /10.185.35.39
>ERROR [HintedHandoff:4] 2011-09-22 22:59:27,648 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:4,5,main]
>java.lang.AssertionError
>        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
>        at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
>        at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:333)
>        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>        at java.lang.Thread.run(Thread.java:662)
>