Posted to user@cassandra.apache.org by Dathan Pattishall <da...@gmail.com> on 2010/07/21 05:59:06 UTC
what causes a cassandra to block and throw a null exception
Type 'help' or '?' for help. Type 'quit' or 'exit' to quit.
cassandra> connect cass01/9160
cassandra> get TimeFrameClicks.Standard2['test_cassandra_alive']
Exception null
The data exists and I can grab the data after I restart all the nodes,
but once the cluster runs for a few minutes I cannot grab this
specific key or random other keys. It takes about 3 seconds until the
Exception null message. My storage-conf.xml is very simple:
<Keyspace Name="TimeFrameClicks">
<ColumnFamily Name="Standard2" CompareWith="UTF8Type" />
....
Now my data set is very small, about 20 GB across 4 servers. Writes
consistently remain fast, yet reads fail like crazy. I hope it's
something that I am doing wrong, because
nodetool cfstats
says that the read latency for the keyspace and this specific column
family is less than 0.3 ms, which means that something is lying to me.
To head off some questions:
CPU utilization is very low.
There is hardly any I/O on the boxes.
The servers are all the same class of enterprise box.
There is 12 GB of RAM per server.
Each server uses a local RAID.
Nothing in any of the system logs indicates that there is any problem.
Additionally, is there a stat or series of stats that I can look up to
determine the health of read performance?
Re: what causes a cassandra to block and throw a null exception
Posted by Dathan Pattishall <da...@gmail.com>.
Just sent one of the nodes back.
Pool Name                 Active   Pending   Completed
STREAM-STAGE                   0         0           0
RESPONSE-STAGE                 0         0      151071
ROW-READ-STAGE                 0         0      100398
LB-OPERATIONS                  0         0           0
MESSAGE-DESERIALIZER-POOL      0         0      281268
GMFD                           0         0         935
LB-TARGET                      0         0           0
CONSISTENCY-MANAGER            0         0       59545
ROW-MUTATION-STAGE             0         0       71453
MESSAGE-STREAMING-POOL         0         0           0
LOAD-BALANCER-STAGE            0         0           0
FLUSH-SORTER-POOL              0         0           0
MEMTABLE-POST-FLUSHER          0         0           1
FLUSH-WRITER-POOL              0         0           1
AE-SERVICE-STAGE               0         0           0
HINTED-HANDOFF-POOL            0         0           3
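
For anyone who wants to scan output like the above automatically, here is a minimal sketch in Python. It assumes each data row is a pool name followed by three whitespace-separated integers (Active, Pending, Completed), as in the paste above. The helper names parse_tpstats and backed_up are made up for illustration and are not part of nodetool. It only flags pools with a nonzero Active or Pending count; it does not diagnose anything by itself.

```python
def parse_tpstats(text):
    """Parse tpstats-style rows into {pool: (active, pending, completed)}.

    Assumption: a data row is exactly a pool name followed by three
    non-negative integers. Header and blank lines are skipped.
    """
    pools = {}
    for line in text.splitlines():
        parts = line.split()
        # The header "Pool Name Active Pending Completed" has five tokens
        # and no digits, so it fails this check and is ignored.
        if len(parts) != 4 or not all(p.isdigit() for p in parts[1:]):
            continue
        name, active, pending, completed = parts
        pools[name] = (int(active), int(pending), int(completed))
    return pools


def backed_up(pools):
    """Return the sorted names of pools with nonzero Active or Pending."""
    return sorted(
        name
        for name, (active, pending, _completed) in pools.items()
        if active or pending
    )


# A few rows copied from the output above, used as sample input.
SAMPLE = """\
Pool Name Active Pending Completed
RESPONSE-STAGE 0 0 151071
ROW-READ-STAGE 0 0 100398
ROW-MUTATION-STAGE 0 0 71453
"""

if __name__ == "__main__":
    print(backed_up(parse_tpstats(SAMPLE)))
```

On the rows above, every pool shows 0 active and 0 pending, so nothing is flagged.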
On Tue, Jul 20, 2010 at 9:03 PM, Chris Goffinet <cg...@chrisgoffinet.com>
wrote:
> Can you provide the output from `nodetool tpstats`.
>
> -Chris
Re: what causes a cassandra to block and throw a null exception
Posted by Chris Goffinet <cg...@chrisgoffinet.com>.
Can you provide the output from `nodetool tpstats`.
-Chris