Posted to user@cassandra.apache.org by Bryce Godfrey <Br...@azaleos.com> on 2012/03/14 20:33:42 UTC

Large hints column family

The system HintsColumnFamily seems large in my cluster, and I want to track down why that is.  I tried invoking "listEndpointsPendingHints()" for o.a.c.db.HintedHandoffManager, but it never returns and also freezes the node that it's invoked against.  It's a 3-node cluster, and all nodes have been up and running without issue for a while.  Any help on where to start with this?

               Column Family: HintsColumnFamily
                SSTable count: 11
                Space used (live): 11271669539
                Space used (total): 11271669539
                Number of Keys (estimate): 1408
                Memtable Columns Count: 338
                Memtable Data Size: 0
                Memtable Switch Count: 1
                Read Count: 3
                Read Latency: 4354.669 ms.
                Write Count: 848
                Write Latency: 0.029 ms.
                Pending Tasks: 0
                Bloom Filter False Postives: 0
                Bloom Filter False Ratio: 0.00000
                Bloom Filter Space Used: 12656
                Key cache capacity: 14
                Key cache size: 11
                Key cache hit rate: 0.6666666666666666
                Row cache: disabled
                Compacted row minimum size: 105779
                Compacted row maximum size: 7152383774
                Compacted row mean size: 590818614

Thanks,
Bryce
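
The raw byte counts in the cfstats output are hard to read at a glance. A minimal sketch that converts them to binary units and compares the largest compacted row with the live space; the numbers are copied from the output above, and the helper name `human` is only illustrative:

```python
# Byte counts copied from the cfstats output above.
stats = {
    "Space used (live)": 11271669539,
    "Compacted row minimum size": 105779,
    "Compacted row maximum size": 7152383774,
    "Compacted row mean size": 590818614,
}


def human(n_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    size = float(n_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.2f} {unit}"
        size /= 1024


for name, value in stats.items():
    print(f"{name}: {human(value)}")

# How much of the column family's live space the single largest row accounts for.
largest_row_share = stats["Compacted row maximum size"] / stats["Space used (live)"]
print(f"Largest row as a share of live space: {largest_row_share:.0%}")
```

By this arithmetic the largest compacted row is about 6.66 GiB, roughly 63% of the 10.50 GiB of live space.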

RE: Large hints column family

Posted by Bryce Godfrey <Br...@azaleos.com>.
I took the "reset the world" approach; things are much better now, and the hints table is staying empty.  It's a bit disconcerting that it could get so large and not be able to recover by itself, but at least there was a solution.  Thanks.


Re: Large hints column family

Posted by aaron morton <aa...@thelastpickle.com>.
These messages make it look like the node is having trouble delivering hints. 
> INFO [HintedHandoff:1] 2012-03-13 16:13:34,188 HintedHandOffManager.java (line 284) Endpoint /192.168.20.4 died before hint delivery, aborting
> INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 HintedHandOffManager.java (line 354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
 
Take another look at the logs on this machine and on 20.4 and 20.3. 

I would be looking into why so many hints are being stored. GC? Are there also logs about dropped messages?

If you want to reset the world, make sure the nodes have all run repair and then drop the hints, either via JMX or by stopping the node and deleting the files on disk.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
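
Before deleting hint files by hand as suggested above, it can help to see exactly which files would be affected. The following is a dry-run sketch only: it lists files and sums their sizes, and deletes nothing. The data directory path and the `HintsColumnFamily-` file name prefix are assumptions about a default 1.0.x layout, not something confirmed in this thread, so check `data_file_directories` in cassandra.yaml first. The function name `hint_sstable_files` is hypothetical.

```python
from pathlib import Path


def hint_sstable_files(system_dir, prefix="HintsColumnFamily-"):
    """Return (files, total_bytes) for files in system_dir whose names start with prefix."""
    files = sorted(p for p in Path(system_dir).glob(prefix + "*") if p.is_file())
    return files, sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    # ASSUMPTION: default data directory for the system keyspace; adjust to your install.
    system_dir = "/var/lib/cassandra/data/system"
    files, total = hint_sstable_files(system_dir)
    for path in files:
        print(path)
    print(f"{len(files)} files, {total} bytes (dry run, nothing was deleted)")
```

The node should be stopped, and repair (for example with `nodetool repair`) should have completed on all nodes, before any of the listed files are actually removed.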


RE: Large hints column family

Posted by Bryce Godfrey <Br...@azaleos.com>.
We were having occasional memory pressure issues, but we added more RAM to the nodes a few days ago and things are running more smoothly now. In general, nodes have not been going up and down.

I tried to do a "list HintsColumnFamily" from cassandra-cli, and it locks up my Cassandra node and never returns, forcing me to kill the Cassandra process and restart it to get the node back.

Here are my settings, which I believe are the defaults since I don't remember changing them:

hinted_handoff_enabled: true
max_hint_window_in_ms: 3600000 # one hour
hinted_handoff_throttle_delay_in_ms: 50

Grepping for "Hinted" in the system log, I get these:
INFO [HintedHandoff:1] 2012-03-13 16:13:22,215 HintedHandOffManager.java (line 373) Finished hinted handoff of 852703 rows to endpoint /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 16:13:34,188 HintedHandOffManager.java (line 284) Endpoint /192.168.20.4 died before hint delivery, aborting
INFO [ScheduledTasks:1] 2012-03-13 16:15:32,569 StatusLogger.java (line 65) HintedHandoff                     1         1         0
INFO [HintedHandoff:1] 2012-03-13 16:15:44,362 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 16:21:37,266 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-13 16:23:07,662 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:25:49,330 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:30:52,503 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-13 16:42:22,202 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 HintedHandOffManager.java (line 354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-HintsColumnFamily@661547256(34298224/74465815 serialized/live bytes, 78808 ops)
INFO [HintedHandoff:1] 2012-03-13 17:11:00,098 HintedHandOffManager.java (line 373) Finished hinted handoff of 44160 rows to endpoint /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 17:11:36,596 HintedHandOffManager.java (line 296) Started hinted handoff for token: 56713727820156407428984779325531226112 with IP: /192.168.20.4
INFO [ScheduledTasks:1] 2012-03-13 17:12:25,248 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-13 18:47:56,151 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-13 18:50:24,326 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:12:48,177 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:13:57,685 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:14:57,258 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:14:58,260 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:15:59,093 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:16:59,428 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:18:01,862 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:18:01,898 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:19:04,527 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:19:04,541 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:20:07,712 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [ScheduledTasks:1] 2012-03-14 12:20:08,332 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-14 12:27:13,033 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
INFO [ScheduledTasks:1] 2012-03-15 15:05:00,954 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:06:07,750 HintedHandOffManager.java (line 354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
INFO [ScheduledTasks:1] 2012-03-15 15:06:07,802 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:06:07,809 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-HintsColumnFamily@254668880(103911/8312880 serialized/live bytes, 63877 ops)
INFO [ScheduledTasks:1] 2012-03-15 15:07:13,503 StatusLogger.java (line 65) HintedHandoff                     1         2         0
INFO [HintedHandoff:1] 2012-03-15 15:15:43,842 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
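
To see at a glance which endpoints have trouble receiving hints, the HintedHandOffManager messages can be tallied per endpoint. A rough sketch: the regular expressions are written against the message wording in the excerpt above and may not match other Cassandra versions, and only five sample lines are embedded here, so in practice you would feed it the full system log.

```python
import re
from collections import Counter

# A few lines copied from the log excerpt above.
LOG = """\
INFO [HintedHandoff:1] 2012-03-13 16:13:22,215 HintedHandOffManager.java (line 373) Finished hinted handoff of 852703 rows to endpoint /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 16:13:34,188 HintedHandOffManager.java (line 284) Endpoint /192.168.20.4 died before hint delivery, aborting
INFO [HintedHandoff:1] 2012-03-13 16:15:44,362 HintedHandOffManager.java (line 296) Started hinted handoff for token: 113427455640312814857969558651062452224 with IP: /192.168.20.3
INFO [HintedHandoff:1] 2012-03-13 17:03:50,986 HintedHandOffManager.java (line 354) Timed out replaying hints to /192.168.20.3; aborting further deliveries
INFO [HintedHandoff:1] 2012-03-13 17:11:00,098 HintedHandOffManager.java (line 373) Finished hinted handoff of 44160 rows to endpoint /192.168.20.3
"""

# One pattern per event type; group 1 captures the endpoint address.
PATTERNS = {
    "started": re.compile(r"Started hinted handoff for token: \d+ with IP: /(\S+)"),
    "finished": re.compile(r"Finished hinted handoff of \d+ rows to endpoint /(\S+)"),
    "timed_out": re.compile(r"Timed out replaying hints to /([^;\s]+)"),
    "died": re.compile(r"Endpoint /(\S+) died before hint delivery"),
}


def tally(lines):
    """Count hinted handoff events keyed by (endpoint, event)."""
    counts = Counter()
    for line in lines:
        for event, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[(match.group(1), event)] += 1
                break
    return counts


counts = tally(LOG.splitlines())
for (endpoint, event), n in sorted(counts.items()):
    print(endpoint, event, n)
```

Endpoints with many "started" or "timed_out" events and few "finished" ones are the ones whose logs are worth reading first.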



Re: Large hints column family

Posted by aaron morton <aa...@thelastpickle.com>.
Is there anything going on in the logs? Are nodes going up and down? Can you see any messages about delivering hints?

If the query to read the hints fails, it will log "HintsCF getEPPendingHints timed out" at INFO level.

Also, just checking: do the hinted_handoff_* settings in cassandra.yaml have their default values?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com


RE: Large hints column family

Posted by Bryce Godfrey <Br...@azaleos.com>.
Forgot to mention that this is on Cassandra 1.0.8.
