Posted to user@cassandra.apache.org by Alex Li <al...@gmail.com> on 2010/04/22 09:40:06 UTC
Periodically hiccups
Hello,
We recently deployed a cluster of 5 Cassandra nodes into production and
ran into big problems with periodic hiccups (an individual node goes
down, high CPU, client connection timeouts). It was terrible with 0.5
(one hiccup every 5-10 minutes). Today we upgraded to 0.6.1, and it
happens less frequently now (roughly once every 30 minutes). But it is
still quite frustrating.
We use ReplicationFactor=3 for all column families. The 5 nodes are behind
haproxy, and the Java client goes through haproxy. The most obvious
behavior is: as soon as one node goes down, the number of connections
between haproxy and the Cassandra nodes shoots up to 1000 (normally it is
stable at 40) and does not come down for quite a while. Meanwhile the Java
clients get all kinds of TimeoutException and keep retrying.
Eventually we have to restart haproxy, and then things go back to normal.
Each node has a 5GB max heap, a powerful enough CPU (quad-core), and a
software RAID mirror. We are definitely NOT putting much load on it yet,
mostly 20-50 concurrent requests to Cassandra, but it is not holding up!
Please help, we are on the verge of giving up on Cassandra after 5 days of
periodic "outages".
A couple of observations:
- cfstats shows significant read latency on the "system" keyspace, almost 5s
(see below)
- RecentReadLatencyMicros and RecentWriteLatencyMicros are very high
for StorageProxy, as well as for every column family in JMX: up to 43 s and
9 s (see screenshot). However, in cfstats they are quite small.
- Every second we see 5-10 DigestMismatchException in the log:
INFO [pool-1-thread-15857] 2010-04-22 00:37:37,887 StorageProxy.java
(line 499) DigestMismatchException: Mismatch for key 1068022523
(d41d8cd98f00b204e9800998ecf8427e vs 0dd4cdaeeb1a334ae133c6955e109629)
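One detail in the log line above may be worth checking: the first digest, d41d8cd98f00b204e9800998ecf8427e, is the well-known MD5 hash of empty input. Assuming the digests being compared are MD5 hashes of each replica's response (an assumption about Cassandra internals, not something the log states), this would be consistent with one replica returning no data for that key. A minimal Python sketch of the check, with hypothetical helper names:

```python
import hashlib

# Digest copied from the DigestMismatchException log line above.
LOGGED_DIGEST = "d41d8cd98f00b204e9800998ecf8427e"


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def is_empty_digest(hex_digest: str) -> bool:
    """True if hex_digest equals the MD5 of zero bytes of input."""
    return hex_digest.lower() == md5_hex(b"")


if __name__ == "__main__":
    # Assumption: the logged digests are plain MD5 hex strings.
    print(is_empty_digest(LOGGED_DIGEST))
```

If the check is true for most of the mismatches, that may point at replicas that are missing rows rather than replicas holding different versions of them.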
Please advise. Thank you!
storage-conf, cfstats, tpstats are listed below:
root@cdb-006:/glass/sfw/cassandra# cat conf/storage-conf.xml
<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing,
~ software distributed under the License is distributed on an
~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
~ KIND, either express or implied. See the License for the
~ specific language governing permissions and limitations
~ under the License.
-->
<Storage>
<!--======================================================================-->
<!-- Basic Configuration -->
<!--======================================================================-->
<!--
~ The name of this cluster. This is mainly used to prevent machines in
~ one logical cluster from joining another.
-->
<ClusterName>Test Cluster</ClusterName>
<!--
~ Turn on to make new [non-seed] nodes automatically migrate the right data
~ to themselves. (If no InitialToken is specified, they will pick one
~ such that they will get half the range of the most-loaded node.)
~ If a node starts up without bootstrapping, it will mark itself bootstrapped
~ so that you can't subsequently accidentally bootstrap a node with
~ data on it. (You can reset this by wiping your data and commitlog
~ directories.)
~
~ Off by default so that new clusters and upgraders from 0.4 don't
~ bootstrap immediately. You should turn this on when you start adding
~ new nodes to a cluster that already has data on it. (If you are upgrading
~ from 0.4, start your cluster with it off once before changing it to true.
~ Otherwise, no data will be lost but you will incur a lot of unnecessary
~ I/O before your cluster starts up.)
-->
<AutoBootstrap>false</AutoBootstrap>
<!--
~ Keyspaces and ColumnFamilies:
~ A ColumnFamily is the Cassandra concept closest to a relational
~ table. Keyspaces are separate groups of ColumnFamilies. Except in
~ very unusual circumstances you will have one Keyspace per application.
~ There is an implicit keyspace named 'system' for Cassandra internals.
-->
<Keyspaces>
<Keyspace Name="Pandora">
<ColumnFamily CompareWith="BytesType" Name="Standard1"/>
<ColumnFamily CompareWith="UTF8Type" Name="Standard2"/>
<ColumnFamily CompareWith="TimeUUIDType"
Name="StandardByUUID1"/>
<ColumnFamily CompareWith="TimeUUIDType" Name="Folder1"/>
<ColumnFamily CompareWith="UTF8Type" Name="FolderInfo"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="User"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="Message"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="Folder"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="Attachment"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>3</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
<Keyspace Name="Titan">
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="Club"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="Payment"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="User"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily ColumnType="Super"
CompareWith="UTF8Type"
CompareSubcolumnsWith="UTF8Type"
Name="FbUser"
Comment="A column family with supercolumns,
whose column and subcolumn names are UTF8 strings"/>
<ColumnFamily CompareWith="UTF8Type" Name="RandomUsers"/>
<ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
<ReplicationFactor>3</ReplicationFactor>
<EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
</Keyspace>
</Keyspaces>
<!--
~ Authenticator: any IAuthenticator may be used, including your own as long
~ as it is on the classpath. Out of the box, Cassandra provides
~ org.apache.cassandra.auth.AllowAllAuthenticator and,
~ org.apache.cassandra.auth.SimpleAuthenticator
~ (SimpleAuthenticator uses access.properties and passwd.properties by
~ default).
~
~ If you don't specify an authenticator, AllowAllAuthenticator is used.
-->
<Authenticator>org.apache.cassandra.auth.AllowAllAuthenticator</Authenticator>
<!--
~ Partitioner: any IPartitioner may be used, including your own as long
~ as it is on the classpath. Out of the box, Cassandra provides
~ org.apache.cassandra.dht.RandomPartitioner,
~ org.apache.cassandra.dht.OrderPreservingPartitioner, and
~ org.apache.cassandra.dht.CollatingOrderPreservingPartitioner.
~ (CollatingOPP collates according to EN,US rules, not naive byte
~ ordering. Use this as an example if you need locale-aware collation.)
~ Range queries require using an order-preserving partitioner.
~
~ Achtung! Changing this parameter requires wiping your data
~ directories, since the partitioner can modify the sstable on-disk
~ format.
-->
<Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>
<!--
~ If you are using an order-preserving partitioner and you know your key
~ distribution, you can specify the token for this node to use. (Keys
~ are sent to the node with the "closest" token, so distributing your
~ tokens equally along the key distribution space will spread keys
~ evenly across your cluster.) This setting is only checked the first
~ time a node is started.
~ This can also be useful with RandomPartitioner to force equal spacing
~ of tokens around the hash space, especially for clusters with a small
~ number of nodes.
-->
<InitialToken></InitialToken>
<!--
~ Directories: Specify where Cassandra should store different data on
~ disk. Keep the data disks and the CommitLog disks separate for best
~ performance
-->
<CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
<DataFileDirectories>
<DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
</DataFileDirectories>
<!--
~ Addresses of hosts that are deemed contact points. Cassandra nodes
~ use this list of hosts to find each other and learn the topology of
~ the ring. You must change this if you are running multiple nodes!
-->
<Seeds>
<Seed>10.10.104.11</Seed>
<Seed>10.10.104.13</Seed>
<Seed>10.10.104.15</Seed>
</Seeds>
<!-- Miscellaneous -->
<!-- Time to wait for a reply from other nodes before failing the command -->
<RpcTimeoutInMillis>10000</RpcTimeoutInMillis>
<!-- Size to allow commitlog to grow to before creating a new segment -->
<CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
<!-- Local hosts and ports -->
<!--
~ Address to bind to and tell other nodes to connect to. You _must_
~ change this if you want multiple nodes to be able to communicate!
~
~ Leaving it blank leaves it up to InetAddress.getLocalHost(). This
~ will always do the Right Thing *if* the node is properly configured
~ (hostname, name resolution, etc), and the Right Thing is to use the
~ address associated with the hostname (it might not be).
-->
<ListenAddress></ListenAddress>
<!-- internal communications port -->
<StoragePort>7000</StoragePort>
<!--
~ The address to bind the Thrift RPC service to. Unlike ListenAddress
~ above, you *can* specify 0.0.0.0 here if you want Thrift to listen on
~ all interfaces.
~
~ Leaving this blank has the same effect it does for ListenAddress,
~ (i.e. it will be based on the configured hostname of the node).
-->
<ThriftAddress>0.0.0.0</ThriftAddress>
<!-- Thrift RPC port (the port clients connect to). -->
<ThriftPort>9160</ThriftPort>
<!--
~ Whether or not to use a framed transport for Thrift. If this option
~ is set to true then you must also use a framed transport on the
~ client-side, (framed and non-framed transports are not compatible).
-->
<ThriftFramedTransport>false</ThriftFramedTransport>
<!--======================================================================-->
<!-- Memory, Disk, and Performance -->
<!--======================================================================-->
<!--
~ Access mode. mmapped i/o is substantially faster, but only practical on
~ a 64bit machine (which notably does not include EC2 "small" instances)
~ or relatively small datasets. "auto", the safe choice, will enable
~ mmapping on a 64bit JVM. Other values are "mmap", "mmap_index_only"
~ (which may allow you to get part of the benefits of mmap on a 32bit
~ machine by mmapping only index files) and "standard".
~ (The buffer size settings that follow only apply to standard,
~ non-mmapped i/o.)
-->
<DiskAccessMode>auto</DiskAccessMode>
<!--
~ Size of compacted row above which to log a warning. (If compacted
~ rows do not fit in memory, Cassandra will crash. This is explained
~ in http://wiki.apache.org/cassandra/CassandraLimitations and is
~ scheduled to be fixed in 0.7.)
-->
<RowWarningThresholdInMB>512</RowWarningThresholdInMB>
<!--
~ Buffer size to use when performing contiguous column slices. Increase
~ this to the size of the column slices you typically perform.
~ (Name-based queries are performed with a buffer size of
~ ColumnIndexSizeInKB.)
-->
<SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
<!--
~ Buffer size to use when flushing memtables to disk. (Only one
~ memtable is ever flushed at a time.) Increase (decrease) the index
~ buffer size relative to the data buffer if you have few (many)
~ columns per key. Bigger is only better _if_ your memtables get large
~ enough to use the space. (Check in your data directory after your
~ app has been running long enough.) -->
<FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
<FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
<!--
~ Add column indexes to a row after its contents reach this size.
~ Increase if your column values are large, or if you have a very large
~ number of columns. The competing causes are, Cassandra has to
~ deserialize this much of the row to read a single column, so you want
~ it to be small - at least if you do many partial-row reads - but all
~ the index data is read for each access, so you don't want to generate
~ that wastefully either.
-->
<ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
<!--
~ Flush memtable after this much data has been inserted, including
~ overwritten data. There is one memtable per column family, and
~ this threshold is based solely on the amount of data stored, not
~ actual heap memory usage (there is some overhead in indexing the
~ columns).
-->
<MemtableThroughputInMB>64</MemtableThroughputInMB>
<!--
~ Throughput setting for Binary Memtables. Typically these are
~ used for bulk load so you want them to be larger.
-->
<BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
<!--
~ The maximum number of columns in millions to store in memory per
~ ColumnFamily before flushing to disk. This is also a per-memtable
~ setting. Use with MemtableThroughputInMB to tune memory usage.
-->
<MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>
<!--
~ The maximum time to leave a dirty memtable unflushed.
~ (While any affected columnfamilies have unflushed data from a
~ commit log segment, that segment cannot be deleted.)
~ This needs to be large enough that it won't cause a flush storm
~ of all your memtables flushing at once because none has hit
~ the size or count thresholds yet. For production, a larger
~ value such as 1440 is recommended.
-->
<MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
<!--
~ Unlike most systems, in Cassandra writes are faster than reads, so
~ you can afford more of those in parallel. A good rule of thumb is 2
~ concurrent reads per processor core. Increase ConcurrentWrites to
~ the number of clients writing at once if you enable CommitLogSync +
~ CommitLogSyncDelay. -->
<ConcurrentReads>8</ConcurrentReads>
<ConcurrentWrites>32</ConcurrentWrites>
<!--
~ CommitLogSync may be either "periodic" or "batch." When in batch
~ mode, Cassandra won't ack writes until the commit log has been
~ fsynced to disk. It will wait up to CommitLogSyncBatchWindowInMS
~ milliseconds for other writes, before performing the sync.
~ This is less necessary in Cassandra than in traditional databases
~ since replication reduces the odds of losing data from a failure
~ after writing the log entry but before it actually reaches the disk.
~ So the other option is "periodic," where writes may be acked immediately
~ and the CommitLog is simply synced every CommitLogSyncPeriodInMS
~ milliseconds.
-->
<CommitLogSync>periodic</CommitLogSync>
<!--
~ Interval at which to perform syncs of the CommitLog in periodic mode.
~ Usually the default of 10000ms is fine; increase it if your i/o
~ load is such that syncs are taking excessively long times.
-->
<CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
<!--
~ Delay (in milliseconds) during which additional commit log entries
~ may be written before fsync in batch mode. This will increase
~ latency slightly, but can vastly improve throughput where there are
~ many writers. Set to zero to disable (each entry will be synced
~ individually). Reasonable values range from a minimal 0.1 to 10 or
~ even more if throughput matters more than latency.
-->
<!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS> -->
<!--
~ Time to wait before garbage-collection deletion markers. Set this to
~ a large enough value that you are confident that the deletion marker
~ will be propagated to all replicas by the time this many seconds has
~ elapsed, even in the face of hardware failures. The default value is
~ ten days.
-->
<GCGraceSeconds>864000</GCGraceSeconds>
</Storage>
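The InitialToken comment in the configuration above mentions that explicit tokens can force equal spacing around the hash space with RandomPartitioner, and InitialToken is left empty here. A minimal Python sketch of that calculation for a 5-node cluster follows. It assumes the RandomPartitioner token space runs from 0 to 2**127, which is an assumption about this Cassandra version and is not stated in the file; the function name is hypothetical.

```python
from typing import List

# Assumption: RandomPartitioner tokens lie in the range [0, 2**127).
ASSUMED_RING_SIZE = 2 ** 127


def evenly_spaced_tokens(node_count: int,
                         ring_size: int = ASSUMED_RING_SIZE) -> List[int]:
    """Return node_count tokens spread evenly around the ring."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Integer arithmetic keeps the tokens exact; floats would lose precision.
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node_index, token in enumerate(evenly_spaced_tokens(5)):
        print(node_index, token)
```

Each printed value would be a candidate InitialToken for one node. As the comment in the file notes, the setting is only checked the first time a node is started.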
root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost cfstats
Keyspace: system
Read Count: 878
Read Latency: 5752.042634396355 ms.
Write Count: 2260398
Write Latency: 0.014567047926957996 ms.
Pending Tasks: 0
Column Family: LocationInfo
SSTable count: 2
Space used (live): 3569
Space used (total): 3569
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 1
Read Count: 1
Read Latency: NaN ms.
Write Count: 6
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 2
Key cache size: 1
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: HintsColumnFamily
SSTable count: 2
Space used (live): 70272035
Space used (total): 70272035
Memtable Columns Count: 56264
Memtable Data Size: 486854
Memtable Switch Count: 21
Read Count: 877
Read Latency: 13614.412 ms.
Write Count: 2260392
Write Latency: 0.142 ms.
Pending Tasks: 0
Key cache capacity: 2
Key cache size: 2
Key cache hit rate: 0.25
Row cache: disabled
Compacted row minimum size: 78567
Compacted row maximum size: 39561901
Compacted row mean size: 27878603
----------------
Keyspace: Titan
Read Count: 8948702
Read Latency: 7.949136100185256 ms.
Write Count: 3393490
Write Latency: 0.19255415398306758 ms.
Pending Tasks: 0
Column Family: FbUser
SSTable count: 6
Space used (live): 3675014807
Space used (total): 3675014807
Memtable Columns Count: 250055
Memtable Data Size: 9146339
Memtable Switch Count: 9
Read Count: 6591406
Read Latency: 8.361 ms.
Write Count: 343030
Write Latency: 0.078 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 200000
Key cache hit rate: 0.6341912478864628
Row cache: disabled
Compacted row minimum size: 320
Compacted row maximum size: 586
Compacted row mean size: 479
Column Family: Payment
SSTable count: 3
Space used (live): 2473
Space used (total): 2473
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 27728
Read Latency: 0.059 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: Club
SSTable count: 6
Space used (live): 363609256
Space used (total): 477968226
Memtable Columns Count: 124134
Memtable Data Size: 38144903
Memtable Switch Count: 140
Read Count: 63668
Read Latency: 29.187 ms.
Write Count: 1966098
Write Latency: 0.184 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 63751
Key cache hit rate: 0.6543073524771221
Row cache: disabled
Compacted row minimum size: 274
Compacted row maximum size: 49784
Compacted row mean size: 7711
Column Family: RandomUsers
SSTable count: 1
Space used (live): 1329464
Space used (total): 1329464
Memtable Columns Count: 170433
Memtable Data Size: 5525529
Memtable Switch Count: 8
Read Count: 58236
Read Latency: 6.215 ms.
Write Count: 143339
Write Latency: 0.575 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 21
Key cache hit rate: 1.0
Row cache: disabled
Compacted row minimum size: 1244196
Compacted row maximum size: 1327409
Compacted row mean size: 1287929
Column Family: User
SSTable count: 4
Space used (live): 583812728
Space used (total): 583812728
Memtable Columns Count: 248833
Memtable Data Size: 7909687
Memtable Switch Count: 12
Read Count: 2207667
Read Latency: 8.413 ms.
Write Count: 941029
Write Latency: 0.107 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 94940
Key cache hit rate: 0.9769078238427732
Row cache: disabled
Compacted row minimum size: 259
Compacted row maximum size: 174340
Compacted row mean size: 3916
----------------
Keyspace: Pandora
Read Count: 1475530
Read Latency: 5.249669284257182 ms.
Write Count: 856550
Write Latency: 0.16070848053236822 ms.
Pending Tasks: 0
Column Family: Folder
SSTable count: 7
Space used (live): 942926793
Space used (total): 942926793
Memtable Columns Count: 150119
Memtable Data Size: 814194
Memtable Switch Count: 21
Read Count: 604605
Read Latency: 7.539 ms.
Write Count: 553015
Write Latency: 0.173 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 111884
Key cache hit rate: 0.9770064886992118
Row cache: disabled
Compacted row minimum size: 234
Compacted row maximum size: 1644541
Compacted row mean size: 1238
Column Family: Attachment
SSTable count: 5
Space used (live): 754692822
Space used (total): 754692822
Memtable Columns Count: 548
Memtable Data Size: 7110
Memtable Switch Count: 8
Read Count: 22708
Read Latency: 6.950 ms.
Write Count: 29835
Write Latency: 0.025 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 8299
Key cache hit rate: 0.017857142857142856
Row cache: disabled
Compacted row minimum size: 269
Compacted row maximum size: 274
Compacted row mean size: 272
Column Family: Message
SSTable count: 7
Space used (live): 3689251440
Space used (total): 3689251440
Memtable Columns Count: 50399
Memtable Data Size: 1361773
Memtable Switch Count: 10
Read Count: 163630
Read Latency: 16.981 ms.
Write Count: 257843
Write Latency: 0.113 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 52427
Key cache hit rate: 0.5041365046535677
Row cache: disabled
Compacted row minimum size: 272
Compacted row maximum size: 2754
Compacted row mean size: 752
Column Family: FolderInfo
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 8704
Read Latency: 0.008 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: StandardByUUID1
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 8704
Read Latency: 0.005 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: User
SSTable count: 4
Space used (live): 120121309
Space used (total): 120121309
Memtable Columns Count: 1786
Memtable Data Size: 50657
Memtable Switch Count: 8
Read Count: 641067
Read Latency: 2.728 ms.
Write Count: 15857
Write Latency: 0.148 ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 45965
Key cache hit rate: 0.957460499034818
Row cache: disabled
Compacted row minimum size: 271
Compacted row maximum size: 506
Compacted row mean size: 499
Column Family: Standard1
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 8704
Read Latency: 0.005 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: Standard2
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 8704
Read Latency: 0.006 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
Column Family: Folder1
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch Count: 0
Read Count: 8704
Read Latency: 0.005 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Tasks: 0
Key cache capacity: 200000
Key cache size: 0
Key cache hit rate: NaN
Row cache: disabled
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0
----------------
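With this many column families, the outliers in the cfstats output above are easy to miss. A minimal Python sketch that scans cfstats text and lists column families whose read latency exceeds a threshold is shown here. The function name and threshold are hypothetical, the parser only assumes the "Keyspace:", "Column Family:" and "Read Latency:" line shapes seen in the output above, and the sample is a shortened excerpt of that output.

```python
import math
from typing import List, Optional, Tuple


def slow_column_families(cfstats_text: str,
                         threshold_ms: float) -> List[Tuple[str, str, float]]:
    """Return (keyspace, column family, read latency in ms) above threshold_ms."""
    results = []
    keyspace: Optional[str] = None
    column_family: Optional[str] = None
    for raw_line in cfstats_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Keyspace:"):
            keyspace = line.split(":", 1)[1].strip()
            column_family = None  # keyspace-level totals come before any CF
        elif line.startswith("Column Family:"):
            column_family = line.split(":", 1)[1].strip()
        elif line.startswith("Read Latency:") and column_family is not None:
            value = float(line.split(":", 1)[1].split()[0])
            # "NaN" parses to a float NaN; skip it explicitly.
            if not math.isnan(value) and value > threshold_ms:
                results.append((keyspace, column_family, value))
    return results


# Shortened excerpt of the cfstats output above.
SAMPLE = """\
Keyspace: system
Read Count: 878
Read Latency: 5752.042634396355 ms.
Column Family: LocationInfo
Read Count: 1
Read Latency: NaN ms.
Column Family: HintsColumnFamily
Read Count: 877
Read Latency: 13614.412 ms.
----------------
Keyspace: Titan
Read Count: 8948702
Read Latency: 7.949136100185256 ms.
Column Family: Club
Read Count: 63668
Read Latency: 29.187 ms.
"""

if __name__ == "__main__":
    for row in slow_column_families(SAMPLE, threshold_ms=1000.0):
        print(row)
```

On the excerpt, only HintsColumnFamily in the system keyspace is above one second, which matches the high "system" read latency mentioned at the top of this message.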
root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost tpstats
Pool Name                    Active   Pending      Completed
FILEUTILS-DELETE-POOL             0         0             70
STREAM-STAGE                      0         0              0
RESPONSE-STAGE                    0         0       18298968
ROW-READ-STAGE                    1         0       10168985
LB-OPERATIONS                     0         0              0
MESSAGE-DESERIALIZER-POOL         0         0       24960681
GMFD                              0         0          93647
LB-TARGET                         0         0              0
CONSISTENCY-MANAGER               0         0              0
ROW-MUTATION-STAGE                0         0        4280703
MESSAGE-STREAMING-POOL            0         0             12
LOAD-BALANCER-STAGE               0         0              0
FLUSH-SORTER-POOL                 0         0              0
MEMTABLE-POST-FLUSHER             0         0            224
FLUSH-WRITER-POOL                 0         0            224
AE-SERVICE-STAGE                  0         0             22
HINTED-HANDOFF-POOL               1         5             73
Re: Periodically hiccups
Posted by Alex Li <al...@gmail.com>.
Here is some system load data I collected.
top - 00:43:43 up 1 day, 20:15, 2 users, load average: 2.96, 2.74, 2.37
Tasks: 160 total, 1 running, 159 sleeping, 0 stopped, 0 zombie
Cpu0 : 10.9%us,  2.6%sy,  0.0%ni, 83.2%id,  3.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1 :  9.6%us,  2.2%sy,  0.0%ni, 85.4%id,  2.8%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2 : 37.1%us,  4.2%sy,  0.0%ni, 23.3%id, 33.8%wa,  1.2%hi,  0.3%si,  0.0%st
Cpu3 : 18.5%us,  3.9%sy,  0.0%ni, 62.6%id,  9.5%wa,  4.1%hi,  1.4%si,  0.0%st
Cpu4 : 11.1%us,  2.6%sy,  0.0%ni, 80.1%id,  3.2%wa,  2.9%hi,  0.2%si,  0.0%st
Cpu5 : 10.1%us,  2.7%sy,  0.0%ni, 84.4%id,  2.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6 : 20.1%us,  2.6%sy,  0.0%ni, 62.6%id, 14.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7 : 18.9%us,  4.4%sy,  0.0%ni, 69.0%id,  7.7%wa,  0.0%hi,  0.1%si,  0.0%st
Mem: 8188108k total, 8135168k used, 52940k free, 12968k buffers
Swap: 62500792k total, 277484k used, 62223308k free, 2381704k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
18000 root      20   0 15.6g 6.7g 1.5g S   64 85.4 552:54.13 java
    1 root      20   0 19316  672  412 S    0  0.0   0:03.36 init
root@cdb-006:/glass/sfw/cassandra# vmstat 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo    in    cs us sy id wa
 5  1 277984  50180  12888 2382292   84   67   880   127    10     4 17  4 69 10
 3  1 278000  51536  12856 2381316    4   14  2556    37  8660 10714  7  4 82  8
 2  0 277968  52084  12768 2380156    7    2   891   618  6287  8865  6  3 87  3
 3  0 277872  52020  12764 2379412   24    7  1070    22  5492 10111  8  3 84  4
 0  1 277528  50764  12760 2379176    2   10  1530   696  8773 13661  8  4 83  5
Alex Li wrote:
> Hello,
>
> We recently deployed a cluster of 5 Cassandra nodes into production,
> and ran into big problems with periodically hiccups (individual node
> goes down, high CPU, client connection timeout). It was terrible with
> 0.5 (one hiccups every 5-10 minutes), today we upgraded to 0.6.1, it
> happens less frequently now (likely once every 30 minutes or so). But
> it is still quite frustrating.
>
> We used ReplicationFactor=3 for all column families. 5 nodes are
> behind haproxy. Java client goes through haproxy. The most obvious
> behavior is: as soon as one node goes down, the connections between
> haproxy and Cassandra nodes just shoot up to 1000 (in normal case it
> is stable at 40), and the connections don't go down for quite a while.
> Meanwhile Java clients just get all kind of TimeoutException, then
> kept on retrying. Eventually we have to restart haproxy, then things
> go back to normal.
>
> Each node has 5GB max heap, powerful enough CPU (quad-core), software
> RAID mirror. We are definitely NOT putting lots of load yet, mostly
> 20-50 concurrent requests to Cassandra, but it is not holding up!
> Please help, we are on the verge of giving up Cassandra after 5 days
> of periodic "outage".
>
> Couple observations:
>
> - cfstats shows significant read latency on "system" keyspace, almost
> 5s (see below)
>
> - RecentReadLatencyMicros and RecentWriteLatencyMicros are super high
> for StorageProxy, as well as every column family in JMX: up to
> 152676.92 and 6950 (they are in ms, right?). However, in cfstats, they
> are quite small.
>
> - Every second we see 5-10 DigestMismatchException in the log:
>
> INFO [pool-1-thread-15857] 2010-04-22 00:37:37,887 StorageProxy.java
> (line 499) DigestMismatchException: Mismatch for key 1068022523
> (d41d8cd98f00b204e9800998ecf8427e vs 0dd4cdaeeb1a334ae133c6955e109629)
>
> Please advice. Thank you!
>
> storage-conf, cfstats, tpstats are listed below:
>
> root@cdb-006:/glass/sfw/cassandra# cat conf/storage-conf.xml
> <!--
> ~ Licensed to the Apache Software Foundation (ASF) under one
> ~ or more contributor license agreements. See the NOTICE file
> ~ distributed with this work for additional information
> ~ regarding copyright ownership. The ASF licenses this file
> ~ to you under the Apache License, Version 2.0 (the
> ~ "License"); you may not use this file except in compliance
> ~ with the License. You may obtain a copy of the License at
> ~
> ~ http://www.apache.org/licenses/LICENSE-2.0
> ~
> ~ Unless required by applicable law or agreed to in writing,
> ~ software distributed under the License is distributed on an
> ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> ~ KIND, either express or implied. See the License for the
> ~ specific language governing permissions and limitations
> ~ under the License.
> -->
> <Storage>
>
> <!--======================================================================-->
>
> <!-- Basic
> Configuration -->
>
> <!--======================================================================-->
>
>
> <!--
> ~ The name of this cluster. This is mainly used to prevent
> machines in
> ~ one logical cluster from joining another.
> -->
> <ClusterName>Test Cluster</ClusterName>
>
> <!--
> ~ Turn on to make new [non-seed] nodes automatically migrate the
> right data
> ~ to themselves. (If no InitialToken is specified, they will
> pick one
> ~ such that they will get half the range of the most-loaded node.)
> ~ If a node starts up without bootstrapping, it will mark itself
> bootstrapped
> ~ so that you can't subsequently accidently bootstrap a node with
> ~ data on it. (You can reset this by wiping your data and
> commitlog
> ~ directories.)
> ~
> ~ Off by default so that new clusters and upgraders from 0.4 don't
> ~ bootstrap immediately. You should turn this on when you start
> adding
> ~ new nodes to a cluster that already has data on it. (If you
> are upgrading
> ~ from 0.4, start your cluster with it off once before changing
> it to true.
> ~ Otherwise, no data will be lost but you will incur a lot of
> unnecessary
> ~ I/O before your cluster starts up.)
> -->
> <AutoBootstrap>false</AutoBootstrap>
>
> <!--
> ~ Keyspaces and ColumnFamilies:
> ~ A ColumnFamily is the Cassandra concept closest to a relational
> ~ table. Keyspaces are separate groups of ColumnFamilies.
> Except in
> ~ very unusual circumstances you will have one Keyspace per
> application.
>
> ~ There is an implicit keyspace named 'system' for Cassandra
> internals.
> -->
> <Keyspaces>
> <Keyspace Name="Pandora">
> <ColumnFamily CompareWith="BytesType" Name="Standard1"/>
> <ColumnFamily CompareWith="UTF8Type" Name="Standard2"/>
> <ColumnFamily CompareWith="TimeUUIDType"
> Name="StandardByUUID1"/>
> <ColumnFamily CompareWith="TimeUUIDType" Name="Folder1"/>
> <ColumnFamily CompareWith="UTF8Type" Name="FolderInfo"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="User"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="Message"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="Folder"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="Attachment"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
>
>
> <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
>
> <ReplicationFactor>3</ReplicationFactor>
>
> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
>
> </Keyspace>
>
> <Keyspace Name="Titan">
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="Club"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="Payment"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="User"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily ColumnType="Super"
> CompareWith="UTF8Type"
> CompareSubcolumnsWith="UTF8Type"
> Name="FbUser"
> Comment="A column family with supercolumns,
> whose column and subcolumn names are UTF8 strings"/>
> <ColumnFamily CompareWith="UTF8Type" Name="RandomUsers"/>
>
>
> <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>
>
> <ReplicationFactor>3</ReplicationFactor>
>
> <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
>
> </Keyspace>
> </Keyspaces>
>
> <!--
> ~ Authenticator: any IAuthenticator may be used, including your own as long
> ~ as it is on the classpath. Out of the box, Cassandra provides
> ~ org.apache.cassandra.auth.AllowAllAuthenticator and
> ~ org.apache.cassandra.auth.SimpleAuthenticator
> ~ (SimpleAuthenticator uses access.properties and passwd.properties by
> ~ default).
> ~
> ~ If you don't specify an authenticator, AllowAllAuthenticator is used.
> -->
>
> <Authenticator>org.apache.cassandra.auth.AllowAllAuthenticator</Authenticator>
>
>
> <!--
> ~ Partitioner: any IPartitioner may be used, including your own as long
> ~ as it is on the classpath. Out of the box, Cassandra provides
> ~ org.apache.cassandra.dht.RandomPartitioner,
> ~ org.apache.cassandra.dht.OrderPreservingPartitioner, and
> ~ org.apache.cassandra.dht.CollatingOrderPreservingPartitioner.
> ~ (CollatingOPP collates according to EN,US rules, not naive byte
> ~ ordering. Use this as an example if you need locale-aware collation.)
> ~ Range queries require using an order-preserving partitioner.
> ~
> ~ Achtung! Changing this parameter requires wiping your data
> ~ directories, since the partitioner can modify the sstable on-disk
> ~ format.
> -->
> <Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>
>
> <!--
> ~ If you are using an order-preserving partitioner and you know your key
> ~ distribution, you can specify the token for this node to use. (Keys
> ~ are sent to the node with the "closest" token, so distributing your
> ~ tokens equally along the key distribution space will spread keys
> ~ evenly across your cluster.) This setting is only checked the first
> ~ time a node is started.
>
> ~ This can also be useful with RandomPartitioner to force equal spacing
> ~ of tokens around the hash space, especially for clusters with a small
> ~ number of nodes.
> -->
> <InitialToken></InitialToken>
>
> <!--
> ~ Directories: Specify where Cassandra should store different data on
> ~ disk. Keep the data disks and the CommitLog disks separate for best
> ~ performance
> -->
> <CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
> <DataFileDirectories>
> <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
> </DataFileDirectories>
>
>
> <!--
> ~ Addresses of hosts that are deemed contact points. Cassandra nodes
> ~ use this list of hosts to find each other and learn the topology of
> ~ the ring. You must change this if you are running multiple nodes!
> -->
> <Seeds>
> <Seed>10.10.104.11</Seed>
> <Seed>10.10.104.13</Seed>
> <Seed>10.10.104.15</Seed>
> </Seeds>
>
>
> <!-- Miscellaneous -->
>
> <!-- Time to wait for a reply from other nodes before failing the command -->
> <RpcTimeoutInMillis>10000</RpcTimeoutInMillis>
> <!-- Size to allow commitlog to grow to before creating a new segment -->
> <CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>
>
>
> <!-- Local hosts and ports -->
>
> <!--
> ~ Address to bind to and tell other nodes to connect to. You _must_
> ~ change this if you want multiple nodes to be able to communicate!
> ~
> ~ Leaving it blank leaves it up to InetAddress.getLocalHost(). This
> ~ will always do the Right Thing *if* the node is properly configured
> ~ (hostname, name resolution, etc), and the Right Thing is to use the
> ~ address associated with the hostname (it might not be).
> -->
> <ListenAddress></ListenAddress>
> <!-- internal communications port -->
> <StoragePort>7000</StoragePort>
>
> <!--
> ~ The address to bind the Thrift RPC service to. Unlike ListenAddress
> ~ above, you *can* specify 0.0.0.0 here if you want Thrift to listen on
> ~ all interfaces.
> ~
> ~ Leaving this blank has the same effect it does for ListenAddress,
> ~ (i.e. it will be based on the configured hostname of the node).
> -->
> <ThriftAddress>0.0.0.0</ThriftAddress>
> <!-- Thrift RPC port (the port clients connect to). -->
> <ThriftPort>9160</ThriftPort>
> <!--
> ~ Whether or not to use a framed transport for Thrift. If this option
> ~ is set to true then you must also use a framed transport on the
> ~ client-side, (framed and non-framed transports are not compatible).
> -->
> <ThriftFramedTransport>false</ThriftFramedTransport>
>
>
>
> <!--======================================================================-->
> <!-- Memory, Disk, and Performance -->
> <!--======================================================================-->
>
>
> <!--
> ~ Access mode. mmapped i/o is substantially faster, but only practical on
> ~ a 64bit machine (which notably does not include EC2 "small" instances)
> ~ or relatively small datasets. "auto", the safe choice, will enable
> ~ mmapping on a 64bit JVM. Other values are "mmap", "mmap_index_only"
> ~ (which may allow you to get part of the benefits of mmap on a 32bit
> ~ machine by mmapping only index files) and "standard".
> ~ (The buffer size settings that follow only apply to standard,
> ~ non-mmapped i/o.)
> -->
> <DiskAccessMode>auto</DiskAccessMode>
>
> <!--
> ~ Size of compacted row above which to log a warning. (If compacted
> ~ rows do not fit in memory, Cassandra will crash. This is explained
> ~ in http://wiki.apache.org/cassandra/CassandraLimitations and is
> ~ scheduled to be fixed in 0.7.)
> -->
> <RowWarningThresholdInMB>512</RowWarningThresholdInMB>
>
> <!--
> ~ Buffer size to use when performing contiguous column slices. Increase
> ~ this to the size of the column slices you typically perform.
> ~ (Name-based queries are performed with a buffer size of
> ~ ColumnIndexSizeInKB.)
> -->
> <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>
>
> <!--
> ~ Buffer size to use when flushing memtables to disk. (Only one
> ~ memtable is ever flushed at a time.) Increase (decrease) the index
> ~ buffer size relative to the data buffer if you have few (many)
> ~ columns per key. Bigger is only better _if_ your memtables get large
> ~ enough to use the space. (Check in your data directory after your
> ~ app has been running long enough.) -->
> <FlushDataBufferSizeInMB>32</FlushDataBufferSizeInMB>
> <FlushIndexBufferSizeInMB>8</FlushIndexBufferSizeInMB>
>
> <!--
> ~ Add column indexes to a row after its contents reach this size.
> ~ Increase if your column values are large, or if you have a very large
> ~ number of columns. The competing causes are, Cassandra has to
> ~ deserialize this much of the row to read a single column, so you want
> ~ it to be small - at least if you do many partial-row reads - but all
> ~ the index data is read for each access, so you don't want to generate
> ~ that wastefully either.
> -->
> <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>
>
> <!--
> ~ Flush memtable after this much data has been inserted, including
> ~ overwritten data. There is one memtable per column family, and
> ~ this threshold is based solely on the amount of data stored, not
> ~ actual heap memory usage (there is some overhead in indexing the
> ~ columns).
> -->
> <MemtableThroughputInMB>64</MemtableThroughputInMB>
> <!--
> ~ Throughput setting for Binary Memtables. Typically these are
> ~ used for bulk load so you want them to be larger.
> -->
> <BinaryMemtableThroughputInMB>256</BinaryMemtableThroughputInMB>
> <!--
> ~ The maximum number of columns in millions to store in memory per
> ~ ColumnFamily before flushing to disk. This is also a per-memtable
> ~ setting. Use with MemtableThroughputInMB to tune memory usage.
> -->
> <MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>
> <!--
> ~ The maximum time to leave a dirty memtable unflushed.
> ~ (While any affected columnfamilies have unflushed data from a
> ~ commit log segment, that segment cannot be deleted.)
> ~ This needs to be large enough that it won't cause a flush storm
> ~ of all your memtables flushing at once because none has hit
> ~ the size or count thresholds yet. For production, a larger
> ~ value such as 1440 is recommended.
> -->
> <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>
>
> <!--
> ~ Unlike most systems, in Cassandra writes are faster than reads, so
> ~ you can afford more of those in parallel. A good rule of thumb is 2
> ~ concurrent reads per processor core. Increase ConcurrentWrites to
> ~ the number of clients writing at once if you enable CommitLogSync +
> ~ CommitLogSyncDelay. -->
> <ConcurrentReads>8</ConcurrentReads>
> <ConcurrentWrites>32</ConcurrentWrites>
>
> <!--
> ~ CommitLogSync may be either "periodic" or "batch." When in batch
> ~ mode, Cassandra won't ack writes until the commit log has been
> ~ fsynced to disk. It will wait up to CommitLogSyncBatchWindowInMS
> ~ milliseconds for other writes, before performing the sync.
>
> ~ This is less necessary in Cassandra than in traditional databases
> ~ since replication reduces the odds of losing data from a failure
> ~ after writing the log entry but before it actually reaches the disk.
> ~ So the other option is "periodic," where writes may be acked immediately
> ~ and the CommitLog is simply synced every CommitLogSyncPeriodInMS
> ~ milliseconds.
> -->
> <CommitLogSync>periodic</CommitLogSync>
> <!--
> ~ Interval at which to perform syncs of the CommitLog in periodic mode.
> ~ Usually the default of 10000ms is fine; increase it if your i/o
> ~ load is such that syncs are taking excessively long times.
> -->
> <CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
> <!--
> ~ Delay (in milliseconds) during which additional commit log entries
> ~ may be written before fsync in batch mode. This will increase
> ~ latency slightly, but can vastly improve throughput where there are
> ~ many writers. Set to zero to disable (each entry will be synced
> ~ individually). Reasonable values range from a minimal 0.1 to 10 or
> ~ even more if throughput matters more than latency.
> -->
> <!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS> -->
>
> <!--
> ~ Time to wait before garbage-collecting deletion markers. Set this to
> ~ a large enough value that you are confident that the deletion marker
> ~ will be propagated to all replicas by the time this many seconds have
> ~ elapsed, even in the face of hardware failures. The default value is
> ~ ten days.
> -->
> <GCGraceSeconds>864000</GCGraceSeconds>
> </Storage>
> root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost cfstats
> Keyspace: system
> Read Count: 878
> Read Latency: 5752.042634396355 ms.
> Write Count: 2260398
> Write Latency: 0.014567047926957996 ms.
> Pending Tasks: 0
> Column Family: LocationInfo
> SSTable count: 2
> Space used (live): 3569
> Space used (total): 3569
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 1
> Read Count: 1
> Read Latency: NaN ms.
> Write Count: 6
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 2
> Key cache size: 1
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: HintsColumnFamily
> SSTable count: 2
> Space used (live): 70272035
> Space used (total): 70272035
> Memtable Columns Count: 56264
> Memtable Data Size: 486854
> Memtable Switch Count: 21
> Read Count: 877
> Read Latency: 13614.412 ms.
> Write Count: 2260392
> Write Latency: 0.142 ms.
> Pending Tasks: 0
> Key cache capacity: 2
> Key cache size: 2
> Key cache hit rate: 0.25
> Row cache: disabled
> Compacted row minimum size: 78567
> Compacted row maximum size: 39561901
> Compacted row mean size: 27878603
>
> ----------------
> Keyspace: Titan
> Read Count: 8948702
> Read Latency: 7.949136100185256 ms.
> Write Count: 3393490
> Write Latency: 0.19255415398306758 ms.
> Pending Tasks: 0
> Column Family: FbUser
> SSTable count: 6
> Space used (live): 3675014807
> Space used (total): 3675014807
> Memtable Columns Count: 250055
> Memtable Data Size: 9146339
> Memtable Switch Count: 9
> Read Count: 6591406
> Read Latency: 8.361 ms.
> Write Count: 343030
> Write Latency: 0.078 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 200000
> Key cache hit rate: 0.6341912478864628
> Row cache: disabled
> Compacted row minimum size: 320
> Compacted row maximum size: 586
> Compacted row mean size: 479
>
> Column Family: Payment
> SSTable count: 3
> Space used (live): 2473
> Space used (total): 2473
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 27728
> Read Latency: 0.059 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: Club
> SSTable count: 6
> Space used (live): 363609256
> Space used (total): 477968226
> Memtable Columns Count: 124134
> Memtable Data Size: 38144903
> Memtable Switch Count: 140
> Read Count: 63668
> Read Latency: 29.187 ms.
> Write Count: 1966098
> Write Latency: 0.184 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 63751
> Key cache hit rate: 0.6543073524771221
> Row cache: disabled
> Compacted row minimum size: 274
> Compacted row maximum size: 49784
> Compacted row mean size: 7711
>
> Column Family: RandomUsers
> SSTable count: 1
> Space used (live): 1329464
> Space used (total): 1329464
> Memtable Columns Count: 170433
> Memtable Data Size: 5525529
> Memtable Switch Count: 8
> Read Count: 58236
> Read Latency: 6.215 ms.
> Write Count: 143339
> Write Latency: 0.575 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 21
> Key cache hit rate: 1.0
> Row cache: disabled
> Compacted row minimum size: 1244196
> Compacted row maximum size: 1327409
> Compacted row mean size: 1287929
>
> Column Family: User
> SSTable count: 4
> Space used (live): 583812728
> Space used (total): 583812728
> Memtable Columns Count: 248833
> Memtable Data Size: 7909687
> Memtable Switch Count: 12
> Read Count: 2207667
> Read Latency: 8.413 ms.
> Write Count: 941029
> Write Latency: 0.107 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 94940
> Key cache hit rate: 0.9769078238427732
> Row cache: disabled
> Compacted row minimum size: 259
> Compacted row maximum size: 174340
> Compacted row mean size: 3916
>
> ----------------
> Keyspace: Pandora
> Read Count: 1475530
> Read Latency: 5.249669284257182 ms.
> Write Count: 856550
> Write Latency: 0.16070848053236822 ms.
> Pending Tasks: 0
> Column Family: Folder
> SSTable count: 7
> Space used (live): 942926793
> Space used (total): 942926793
> Memtable Columns Count: 150119
> Memtable Data Size: 814194
> Memtable Switch Count: 21
> Read Count: 604605
> Read Latency: 7.539 ms.
> Write Count: 553015
> Write Latency: 0.173 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 111884
> Key cache hit rate: 0.9770064886992118
> Row cache: disabled
> Compacted row minimum size: 234
> Compacted row maximum size: 1644541
> Compacted row mean size: 1238
>
> Column Family: Attachment
> SSTable count: 5
> Space used (live): 754692822
> Space used (total): 754692822
> Memtable Columns Count: 548
> Memtable Data Size: 7110
> Memtable Switch Count: 8
> Read Count: 22708
> Read Latency: 6.950 ms.
> Write Count: 29835
> Write Latency: 0.025 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 8299
> Key cache hit rate: 0.017857142857142856
> Row cache: disabled
> Compacted row minimum size: 269
> Compacted row maximum size: 274
> Compacted row mean size: 272
>
> Column Family: Message
> SSTable count: 7
> Space used (live): 3689251440
> Space used (total): 3689251440
> Memtable Columns Count: 50399
> Memtable Data Size: 1361773
> Memtable Switch Count: 10
> Read Count: 163630
> Read Latency: 16.981 ms.
> Write Count: 257843
> Write Latency: 0.113 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 52427
> Key cache hit rate: 0.5041365046535677
> Row cache: disabled
> Compacted row minimum size: 272
> Compacted row maximum size: 2754
> Compacted row mean size: 752
>
> Column Family: FolderInfo
> SSTable count: 0
> Space used (live): 0
> Space used (total): 0
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 8704
> Read Latency: 0.008 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: StandardByUUID1
> SSTable count: 0
> Space used (live): 0
> Space used (total): 0
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 8704
> Read Latency: 0.005 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: User
> SSTable count: 4
> Space used (live): 120121309
> Space used (total): 120121309
> Memtable Columns Count: 1786
> Memtable Data Size: 50657
> Memtable Switch Count: 8
> Read Count: 641067
> Read Latency: 2.728 ms.
> Write Count: 15857
> Write Latency: 0.148 ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 45965
> Key cache hit rate: 0.957460499034818
> Row cache: disabled
> Compacted row minimum size: 271
> Compacted row maximum size: 506
> Compacted row mean size: 499
>
> Column Family: Standard1
> SSTable count: 0
> Space used (live): 0
> Space used (total): 0
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 8704
> Read Latency: 0.005 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: Standard2
> SSTable count: 0
> Space used (live): 0
> Space used (total): 0
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 8704
> Read Latency: 0.006 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> Column Family: Folder1
> SSTable count: 0
> Space used (live): 0
> Space used (total): 0
> Memtable Columns Count: 0
> Memtable Data Size: 0
> Memtable Switch Count: 0
> Read Count: 8704
> Read Latency: 0.005 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Tasks: 0
> Key cache capacity: 200000
> Key cache size: 0
> Key cache hit rate: NaN
> Row cache: disabled
> Compacted row minimum size: 0
> Compacted row maximum size: 0
> Compacted row mean size: 0
>
> ----------------
> root@cdb-006:/glass/sfw/cassandra# bin/nodetool -h localhost tpstats
> Pool Name                   Active   Pending   Completed
> FILEUTILS-DELETE-POOL            0         0          70
> STREAM-STAGE                     0         0           0
> RESPONSE-STAGE                   0         0    18298968
> ROW-READ-STAGE                   1         0    10168985
> LB-OPERATIONS                    0         0           0
> MESSAGE-DESERIALIZER-POOL        0         0    24960681
> GMFD                             0         0       93647
> LB-TARGET                        0         0           0
> CONSISTENCY-MANAGER              0         0           0
> ROW-MUTATION-STAGE               0         0     4280703
> MESSAGE-STREAMING-POOL           0         0          12
> LOAD-BALANCER-STAGE              0         0           0
> FLUSH-SORTER-POOL                0         0           0
> MEMTABLE-POST-FLUSHER            0         0         224
> FLUSH-WRITER-POOL                0         0         224
> AE-SERVICE-STAGE                 0         0          22
> HINTED-HANDOFF-POOL              1         5          73
>
>
> ------------------------------------------------------------------------
RE: Periodically hiccups
Posted by "Dr. Martin Grabmüller" <Ma...@eleven.de>.
Hello Alex,
Unfortunately I cannot help with your problem, but here is one hint:
> - RecentReadLatencyMicros and RecentWriteLatencyMicros are super high
> for StorageProxy, as well as every column family in JMX: up to 43 s and
> 9s (see screenshot). However, in cfstats, they are quite small.
Remember that the latency values in JMX are in microseconds, whereas
cfstats reports in ms. This was changed between 0.5 and 0.6, IIRC.
Greetings,
Martin
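
The unit difference Martin points out (JMX in microseconds, cfstats in milliseconds) can be sketched in a few lines of Python. This is only an illustration: the function name and the sample reading are hypothetical and are not taken from the screenshot in the original post.

```python
def micros_to_millis(latency_micros: float) -> float:
    """Convert a latency in microseconds (the unit JMX uses for the
    Recent*LatencyMicros attributes) to milliseconds (the unit cfstats
    prints)."""
    return latency_micros / 1000.0


# Hypothetical JMX reading, chosen only for illustration.
recent_read_latency_micros = 43000.0

# Express the JMX value in the same unit that cfstats uses.
print(micros_to_millis(recent_read_latency_micros), "ms")
```

Under this assumption, a JMX value of 43000 corresponds to 43 ms rather than 43 s, which is the same order of magnitude as the per-column-family read latencies in the cfstats output above.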