Posted to user@cassandra.apache.org by Chip Salzenberg <re...@gmail.com> on 2010/11/12 22:19:21 UTC

Gossip yoyo under write load

After rebooting my 0.7.0beta3+ cluster to increase threads (read=100
write=200 ... they're beefy machines) and putting the nodes under load again,
I find gossip reporting yo-yo up-down-up-down status for the other nodes.
Anyone know what this is a symptom of, and/or how to avoid it?  I haven't
seen any particular symptoms other than the log messages; and I suppose I'm
also dropping replication MUTATEs, which had been happening already anyway.
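
For reference, by "increase threads" I mean the stage sizes in cassandra.yaml;
as far as I know the relevant 0.7 knobs are concurrent_reads and
concurrent_writes, so roughly:

    concurrent_reads: 100
    concurrent_writes: 200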

cas001   INFO [ScheduledTasks:1] 2010-11-12 13:00:02,891 Gossiper.java (line
195) InetAddress /X.20 is now dead.
cas001   INFO [GossipStage:1] 2010-11-12 13:00:07,567 Gossiper.java (line
569) InetAddress /X.20 is now UP
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:00:53,662 Gossiper.java (line
195) InetAddress /X.21 is now dead.
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:00:56,967 GCInspector.java
(line 133) GC for ParNew: 255 ms, 135668944 reclaimed leaving 18375966648
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:01:01,135
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289595661135.log
cas001   INFO [GossipStage:1] 2010-11-12 13:01:08,148 Gossiper.java (line
569) InetAddress /X.21 is now UP
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:01:56,753 GCInspector.java
(line 133) GC for ParNew: 268 ms, 132609096 reclaimed leaving 20102566032
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:01:57,771 GCInspector.java
(line 133) GC for ParNew: 274 ms, 115223104 reclaimed leaving 20214228560
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:02:14,746
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289595734746.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:03:02,868 GCInspector.java
(line 133) GC for ParNew: 297 ms, 62163960 reclaimed leaving 22200082216
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:03:29,123
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289595809123.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:04:09,626 GCInspector.java
(line 133) GC for ParNew: 321 ms, 125585880 reclaimed leaving 24138058936
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:04:44,852
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289595884852.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:05:13,695 GCInspector.java
(line 133) GC for ParNew: 242 ms, 126754312 reclaimed leaving 26019407576
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:06:01,941
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289595961941.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:06:25,617 GCInspector.java
(line 133) GC for ParNew: 307 ms, 134631824 reclaimed leaving 11283839952
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:06:37,032 Gossiper.java (line
195) InetAddress /X.18 is now dead.
cas001   INFO [GossipStage:1] 2010-11-12 13:06:38,666 Gossiper.java (line
569) InetAddress /X.18 is now UP
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:07:23,417
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289596043417.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:07:33,034 GCInspector.java
(line 133) GC for ParNew: 231 ms, 108391848 reclaimed leaving 13146098816
used; max is 34557919232
cas001   INFO [MutationStage:169] 2010-11-12 13:08:12,548
ColumnFamilyStore.java (line 580) switching in a fresh Memtable for
TestAttrs at
CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1289596043417.log',
position=84947614)
cas001   INFO [MutationStage:169] 2010-11-12 13:08:12,549
ColumnFamilyStore.java (line 879) Enqueuing flush of
Memtable-TestAttrs@1721243764(293461476 bytes, 8388614 operations)
cas001   INFO [FlushWriter:1] 2010-11-12 13:08:12,549 Memtable.java (line
155) Writing Memtable-TestAttrs@1721243764(293461476 bytes, 8388614
operations)
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:08:40,628 GCInspector.java
(line 133) GC for ParNew: 278 ms, 135521080 reclaimed leaving 15121172544
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:08:42,349
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289596122349.log
cas001   INFO [FlushWriter:1] 2010-11-12 13:09:20,586 Memtable.java (line
162) Completed flushing
/var/lib/cassandra/data/Attrs/TestAttrs-e-305-Data.db (954244537 bytes)
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:09:31,873 GCInspector.java
(line 133) GC for ParNew: 228 ms, 201874960 reclaimed leaving 16512318064
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:09:47,127 GCInspector.java
(line 133) GC for ParNew: 266 ms, 123236640 reclaimed leaving 17009227136
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:10:01,895
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289596201895.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:10:48,377 GCInspector.java
(line 133) GC for ParNew: 296 ms, 124419344 reclaimed leaving 18796530912
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:10:54,391 Gossiper.java (line
195) InetAddress /X.21 is now dead.
cas001   INFO [GossipStage:1] 2010-11-12 13:10:57,688 Gossiper.java (line
569) InetAddress /X.21 is now UP
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:11:17,968
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289596277968.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:11:47,678 GCInspector.java
(line 133) GC for ParNew: 281 ms, 128569912 reclaimed leaving 20548892048
used; max is 34557919232
cas001   INFO [COMMIT-LOG-WRITER] 2010-11-12 13:12:33,745
CommitLogSegment.java (line 50) Creating new commitlog segment
/var/lib/cassandra/commitlog/CommitLog-1289596353745.log
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:12:59,242 GCInspector.java
(line 133) GC for ParNew: 253 ms, 109047624 reclaimed leaving 22601534128
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:13:30,583 GCInspector.java
(line 133) GC for ParNew: 201 ms, 202760624 reclaimed leaving 23439605592
used; max is 34557919232
cas001   INFO [ScheduledTasks:1] 2010-11-12 13:13:32,586 Gossiper.java (line
195) InetAddress /X.14 is now dead.
cas001   INFO [GossipStage:1] 2010-11-12 13:13:33,979 Gossiper.java (line
569) InetAddress /X.14 is now UP

Re: Gossip yoyo under write load

Posted by Jonathan Ellis <jb...@gmail.com>.
On Mon, Nov 15, 2010 at 2:08 PM, Reverend Chip <re...@gmail.com> wrote:
> None of those messages has appeared in either output.log or system.log.  ?

This was only added to 0.6/0.7 branches a couple days ago.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: Gossip yoyo under write load

Posted by Reverend Chip <re...@gmail.com>.
On 11/15/2010 12:13 PM, Rob Coli wrote:
> On 11/15/10 12:08 PM, Reverend Chip wrote:
>>> "
>>> logger_.warn("Unable to lock JVM memory (ENOMEM)."
>>> or
>>> logger.warn("Unknown mlockall error " + errno(e));
>>> "
>>>
>>> Trunk also logs if it is successful :
>>> "
>>> logger.info("JNA mlockall successful");
>>
>> None of those messages has appeared in either output.log or
>> system.log.  ?
>
> They are at WARN and INFO, you would have to explicitly enable INFO
> logging in log4j to see the INFO one. The lack of the WARN ones in
> your logs is relatively promising. :)
>
> =Rob

My code base is too old... though I do log INFO and WARN, so I will see
them later.


Re: Gossip yoyo under write load

Posted by Rob Coli <rc...@digg.com>.
On 11/15/10 12:08 PM, Reverend Chip wrote:
>> "
>> logger_.warn("Unable to lock JVM memory (ENOMEM)."
>> or
>> logger.warn("Unknown mlockall error " + errno(e));
>> "
>>
>> Trunk also logs if it is successful :
>> "
>> logger.info("JNA mlockall successful");
>
> None of those messages has appeared in either output.log or system.log.  ?

They are at WARN and INFO, you would have to explicitly enable INFO 
logging in log4j to see the INFO one. The lack of the WARN ones in your 
logs is relatively promising. :)
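
(Concretely, "enable INFO logging" just means the root logger threshold in
your log4j properties file is INFO or lower -- e.g. a line roughly like

    log4j.rootLogger=INFO,stdout,R

where the appender names after the level are whatever your config defines.)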

=Rob

Re: Gossip yoyo under write load

Posted by Reverend Chip <re...@gmail.com>.
On 11/15/2010 11:34 AM, Rob Coli wrote:
> On 11/13/10 11:59 AM, Reverend Chip wrote:
>> Swapping could conceivably be a
>> factor; the JVM is 32G out of 72G, but the machine is 2.5G into swap
>> anyway.  I'm going to disable swap and see if the gossip issues resolve.
>
> Are you using JNA/memlock to prevent the JVM's heap from being swapped?

I didn't take any extra steps in that direction; though having disabled
swap, I doubt it's an issue I should worry about now.

>
> There's a log message about it at startup time if it fails, which
> looks like :
>
> "
> logger_.warn("Unable to lock JVM memory (ENOMEM)."
> or
> logger.warn("Unknown mlockall error " + errno(e));
> "
>
> Trunk also logs if it is successful :
> "
> logger.info("JNA mlockall successful");

None of those messages has appeared in either output.log or system.log.  ?

> See here for background :
>
> https://issues.apache.org/jira/browse/CASSANDRA-1214
>
> =Rob
>


Re: Gossip yoyo under write load

Posted by Rob Coli <rc...@digg.com>.
On 11/13/10 11:59 AM, Reverend Chip wrote:
>Swapping could conceivably be a
> factor; the JVM is 32G out of 72G, but the machine is 2.5G into swap
> anyway.  I'm going to disable swap and see if the gossip issues resolve.

Are you using JNA/memlock to prevent the JVM's heap from being swapped?

There's a log message about it at startup time if it fails, which looks 
like :

"
logger_.warn("Unable to lock JVM memory (ENOMEM)."
or
logger.warn("Unknown mlockall error " + errno(e));
"

Trunk also logs if it is successful :
"
logger.info("JNA mlockall successful");
"

See here for background :

https://issues.apache.org/jira/browse/CASSANDRA-1214

=Rob


Re: Gossip yoyo under write load

Posted by Reverend Chip <re...@gmail.com>.
On 11/12/2010 6:46 PM, Jonathan Ellis wrote:
> On Fri, Nov 12, 2010 at 3:19 PM, Chip Salzenberg <re...@gmail.com> wrote:
>> After I rebooted my 0.7.0beta3+ cluster to increase threads (read=100
>> write=200 ... they're beefy machines), and putting them under load again, I
>> find gossip reporting yoyo up-down-up-down status for the other nodes.
>>  Anyone know what this is a symptom of, and/or how to avoid it?
> It means "the system is too overloaded to process gossip data in a
> timely manner."  Usually this means GC storming, but that does not look like
> the problem here.  Swapping is a less frequent offender.

The system is not overloaded in the sense of load average; but disk I/O
was and is heavy (write load then, repair now).  Two nodes are streaming
(because one is repairing), and there are some compactions, but the
cluster is almost idle otherwise.  Swapping could conceivably be a
factor; the JVM is 32G out of 72G, but the machine is 2.5G into swap
anyway.  I'm going to disable swap and see if the gossip issues resolve.
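
(By "disable swap" I just mean the usual Linux routine, something like

    sudo swapoff -a              # force swapped pages back into RAM and stop swapping
    sudo sysctl vm.swappiness=0  # discourage swapping if swap ever comes back

with the exact commands varying by distro.)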

Perhaps 200 is a bit too high on the threads, despite the presence of
eight fast true cores plus hyperthreading?

>   Since you
> are seeing this after bumping to extremely high thread counts I would
> guess context switching might be a factor.
>
> What are tpstats?

I ran the thread count up because the number of pending mutate events was
very high -- that, I assumed, was what led to the dropped mutates.  It did
help; the tpstats are staying low now.  For example, the node that's
repairing has this:

Pool Name                    Active   Pending      Completed
ReadStage                         0         0              4
Request_responseStage             0         0      394392313
MutationStage                     0         0      422750725
ReadRepair                        0         0              0
GossipStage                       0         0         291951
AntientropyStage                  0         0              5
MigrationStage                    0         0              0
MemtablePostFlusher               0         0             61
StreamStage                       0         0              0
Internal_responseStage            0         0              0
FlushWriter                       0         0             61
FILEUTILS-DELETE-POOL             0         0            728
MiscStage                         0         0             14
FlushSorter                       0         0              0
HintedHandoff                     1         1             18

(The HintedHandoff numbers are nonzero on at least two nodes and are not
resolving, even though all nodes are up.  Odd, but probably harmless?)

>>  I haven't
>> seen any particular symptoms other than the log messages; and I suppose I'm
>> also dropping replication MUTATEs which had been happening already, anyway.
> I don't see any WARN lines about that, did you elide them?

No; this part of my message is badly written, sorry.  The dropped
MUTATES were the motivation for increasing the thread count, and are
gone AFAICT.


Re: Gossip yoyo under write load

Posted by Jonathan Ellis <jb...@gmail.com>.
On Fri, Nov 12, 2010 at 3:19 PM, Chip Salzenberg <re...@gmail.com> wrote:
> After I rebooted my 0.7.0beta3+ cluster to increase threads (read=100
> write=200 ... they're beefy machines), and putting them under load again, I
> find gossip reporting yoyo up-down-up-down status for the other nodes.
>  Anyone know what this is a symptom of, and/or how to avoid it?

It means "the system is too overloaded to process gossip data in a
timely manner."  Usually this means GC storming but that does not like
the problem here.  Swapping is a less frequent offender.  Since you
are seeing this after bumping to extremely high thread counts I would
guess context switching might be a factor.
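
The intuition, much simplified (the real detector is phi-accrual, and the
class below is made up purely for illustration): liveness is judged by how
recently a peer's gossip was actually processed, so anything that stalls
processing -- GC, swap, CPU starvation -- makes peers look dead until the
backlog clears and they flip back to up.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class NaiveFailureDetector
    {
        private final Map<String, Long> lastProcessed = new ConcurrentHashMap<String, Long>();
        private final long timeoutMillis;

        public NaiveFailureDetector(long timeoutMillis)
        {
            this.timeoutMillis = timeoutMillis;
        }

        // called when a gossip heartbeat from this endpoint is actually processed
        public void report(String endpoint)
        {
            lastProcessed.put(endpoint, System.currentTimeMillis());
        }

        // called periodically; a peer "dies" once processing stalls past the timeout
        public boolean isAlive(String endpoint)
        {
            Long last = lastProcessed.get(endpoint);
            return last != null && System.currentTimeMillis() - last < timeoutMillis;
        }
    }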

What are tpstats?

>  I haven't
> seen any particular symptoms other than the log messages; and I suppose I'm
> also dropping replication MUTATEs which had been happening already, anyway.

I don't see any WARN lines about that, did you elide them?

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com