Posted to commits@cassandra.apache.org by "Robert Coli (JIRA)" <ji...@apache.org> on 2012/07/18 23:31:35 UTC
[jira] [Created] (CASSANDRA-4446) nodetool drain sometimes doesn't
mark commitlog fully flushed
Robert Coli created CASSANDRA-4446:
--------------------------------------
Summary: nodetool drain sometimes doesn't mark commitlog fully flushed
Key: CASSANDRA-4446
URL: https://issues.apache.org/jira/browse/CASSANDRA-4446
Project: Cassandra
Issue Type: New Feature
Affects Versions: 1.0.10
Environment: ubuntu 10.04 64bit
Linux HOSTNAME 2.6.32-345-ec2 #48-Ubuntu SMP Wed May 2 19:29:55 UTC 2012 x86_64 GNU/Linux
sun JVM
cassandra 1.0.10 installed from apache deb
Reporter: Robert Coli
I recently wiped a customer's QA cluster. I drained each node and verified that they were drained. When I restarted the nodes, I saw the commitlog replay create a memtable and then flush it. I have attached a sanitized log snippet from a representative node at the time.
It appears to show the following :
1) Drain begins
2) Drain triggers flush
3) Flush triggers compaction
4) StorageService logs DRAINED message
5) compaction thread excepts
6) on restart, same CF creates a memtable
7) and then flushes it [1]
The columnfamily involved in the replay in 7) is the CF for which the compaction thread excepted in 5). This seems to suggest a timing issue whereby the exception in 5) prevents the flush in 3) from marking all the segments flushed, causing them to replay after restart.
[1] Isn't commitlog replay not supposed to automatically trigger a flush in modern cassandra?
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-4446) nodetool drain sometimes doesn't
mark commitlog fully flushed
Posted by "Robert Coli (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Coli updated CASSANDRA-4446:
-----------------------------------
Issue Type: Bug (was: New Feature)
> nodetool drain sometimes doesn't mark commitlog fully flushed
> -------------------------------------------------------------
>
> Key: CASSANDRA-4446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4446
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.0.10
> Environment: ubuntu 10.04 64bit
> Linux HOSTNAME 2.6.32-345-ec2 #48-Ubuntu SMP Wed May 2 19:29:55 UTC 2012 x86_64 GNU/Linux
> sun JVM
> cassandra 1.0.10 installed from apache deb
> Reporter: Robert Coli
> Attachments: cassandra.1.0.10.replaying.log.after.exception.during.drain.txt
>
>
> I recently wiped a customer's QA cluster. I drained each node and verified that they were drained. When I restarted the nodes, I saw the commitlog replay create a memtable and then flush it. I have attached a sanitized log snippet from a representative node at the time.
> It appears to show the following :
> 1) Drain begins
> 2) Drain triggers flush
> 3) Flush triggers compaction
> 4) StorageService logs DRAINED message
> 5) compaction thread excepts
> 6) on restart, same CF creates a memtable
> 7) and then flushes it [1]
> The columnfamily involved in the replay in 7) is the CF for which the compaction thread excepted in 5). This seems to suggest a timing issue whereby the exception in 5) prevents the flush in 3) from marking all the segments flushed, causing them to replay after restart.
> [1] Isn't commitlog replay not supposed to automatically trigger a flush in modern cassandra?
[jira] [Commented] (CASSANDRA-4446) nodetool drain sometimes
doesn't mark commitlog fully flushed
Posted by "Karl Mueller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471960#comment-13471960 ]
Karl Mueller commented on CASSANDRA-4446:
-----------------------------------------
Also seeing this in an upgrade from 1.0.xx to 1.1.15:
INFO 16:29:17,486 completed pre-loading (3 keys) key cache.
INFO 16:29:17,495 Replaying /data2/commit-cassandra/CommitLog-1349727956484.log
INFO 16:29:17,503 Replaying /data2/commit-cassandra/CommitLog-1349727956484.log
INFO 16:29:18,495 GC for ParNew: 3506 ms for 4 collections, 1963062320 used; max is 17095983104
INFO 16:29:18,498 Finished reading /data2/commit-cassandra/CommitLog-1349727956484.log
INFO 16:29:18,499 Log replay complete, 0 replayed mutations
This is a standard upgrade process, which includes a drain.
[jira] [Updated] (CASSANDRA-4446) nodetool drain sometimes doesn't
mark commitlog fully flushed
Posted by "Robert Coli (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Coli updated CASSANDRA-4446:
-----------------------------------
Description:
I recently wiped a customer's QA cluster. I drained each node and verified that they were drained. When I restarted the nodes, I saw the commitlog replay create a memtable and then flush it. I have attached a sanitized log snippet from a representative node at the time.
It appears to show the following :
1) Drain begins
2) Drain triggers flush
3) Flush triggers compaction
4) StorageService logs DRAINED message
5) compaction thread excepts
6) on restart, same CF creates a memtable
7) and then flushes it [1]
The columnfamily involved in the replay in 7) is the CF for which the compaction thread excepted in 5). This seems to suggest a timing issue whereby the exception in 5) prevents the flush in 3) from marking all the segments flushed, causing them to replay after restart.
In case it might be relevant, I did an online change of compaction strategy from Leveled to SizeTiered during the uptime period preceding this drain.
[1] Isn't commitlog replay not supposed to automatically trigger a flush in modern cassandra?
was:
I recently wiped a customer's QA cluster. I drained each node and verified that they were drained. When I restarted the nodes, I saw the commitlog replay create a memtable and then flush it. I have attached a sanitized log snippet from a representative node at the time.
It appears to show the following :
1) Drain begins
2) Drain triggers flush
3) Flush triggers compaction
4) StorageService logs DRAINED message
5) compaction thread excepts
6) on restart, same CF creates a memtable
7) and then flushes it [1]
The columnfamily involved in the replay in 7) is the CF for which the compaction thread excepted in 5). This seems to suggest a timing issue whereby the exception in 5) prevents the flush in 3) from marking all the segments flushed, causing them to replay after restart.
[1] Isn't commitlog replay not supposed to automatically trigger a flush in modern cassandra?
[jira] [Commented] (CASSANDRA-4446) nodetool drain sometimes
doesn't mark commitlog fully flushed
Posted by "Tamar Fraenkel (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501722#comment-13501722 ]
Tamar Fraenkel commented on CASSANDRA-4446:
-------------------------------------------
I had the same experience when I upgraded my cluster from 1.0.9 to 1.0.11. I ran drain before the upgrade; the upgrade on the node finished and the node restarted at 2012-11-20 10:20:58, but the logs then show a replay of the commit log:
{quote}
INFO [main] 2012-11-20 09:41:13,918 CommitLog.java (line 172) Replaying /raid0/cassandra/commitlog/CommitLog-1353402218337.log
INFO [main] 2012-11-20 09:41:20,360 CommitLog.java (line 179) Log replay complete, 0 replayed mutations
INFO [main] 2012-11-20 10:11:35,635 CommitLog.java (line 167) No commitlog files found; skipping replay
INFO [main] 2012-11-20 10:21:11,631 CommitLog.java (line 172) Replaying /raid0/cassandra/commitlog/CommitLog-1353404473899.log
INFO [main] 2012-11-20 10:21:18,119 CommitLog.java (line 179) Log replay complete, 6413 replayed mutations
INFO [main] 2012-11-20 10:55:46,435 CommitLog.java (line 172) Replaying /raid0/cassandra/commitlog/CommitLog-1353406871619.log
INFO [main] 2012-11-20 10:55:54,139 CommitLog.java (line 179) Log replay complete, 3 replayed mutations
{quote}
This caused counters to be over-incremented.
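One way to spot this after a restart is to scan the log for replay summary lines with a non-zero count. The following is a minimal sketch, assuming the log line format shown in the quote above; the function name `nonzero_replays` is made up for illustration and is not part of Cassandra or nodetool.

```python
import re

# Matches the replay summary lines quoted above, for example:
# "... Log replay complete, 6413 replayed mutations"
REPLAY_RE = re.compile(r"Log replay complete, (\d+) replayed mutations")


def nonzero_replays(log_lines):
    """Return (line, count) pairs for replay summaries whose count is > 0."""
    hits = []
    for line in log_lines:
        match = REPLAY_RE.search(line)
        if match and int(match.group(1)) > 0:
            hits.append((line.rstrip("\n"), int(match.group(1))))
    return hits
```

Passing it the lines of a node's log after a drain and restart returns only the replays that actually re-applied mutations; an empty result is consistent with a clean drain, though it does not prove one.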
[jira] [Updated] (CASSANDRA-4446) nodetool drain sometimes doesn't
mark commitlog fully flushed
Posted by "Robert Coli (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Coli updated CASSANDRA-4446:
-----------------------------------
Attachment: cassandra.1.0.10.replaying.log.after.exception.during.drain.txt
[jira] [Commented] (CASSANDRA-4446) nodetool drain sometimes
doesn't mark commitlog fully flushed
Posted by "Omid Aladini (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476213#comment-13476213 ]
Omid Aladini commented on CASSANDRA-4446:
-----------------------------------------
I also experience this every time I drain and restart (up to the latest 1.1.6), and I get this message in the log:
{quote}
2012-10-12_15:50:36.92191 INFO 15:50:36,921 Log replay complete, N replayed mutations
{quote}
with N being non-zero. I wonder if this is a cause of double-counts for Counter mutations.
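To illustrate why a repeated replay would matter for counters in particular, here is a toy model. It is only a sketch: the mutation tuples and the `apply_mutations` helper are hypothetical and are not how Cassandra stores or replays data. It shows that re-applying an overwrite is harmless while re-applying an increment is not.

```python
def apply_mutations(state, mutations):
    """Apply (kind, key, value) mutations to a dict and return it."""
    for kind, key, value in mutations:
        if kind == "set":
            # Overwrite: applying it twice leaves the same value.
            state[key] = value
        elif kind == "incr":
            # Increment: applying it twice adds the delta twice.
            state[key] = state.get(key, 0) + value
    return state


log = [("set", "name", "a"), ("incr", "hits", 1), ("incr", "hits", 2)]

applied_once = apply_mutations({}, log)
# Simulates mutations that were already flushed being replayed again.
applied_twice = apply_mutations(apply_mutations({}, log), log)
```

In this model `applied_once["hits"]` is 3 and `applied_twice["hits"]` is 6, while `name` is unchanged by the second pass.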
[jira] [Resolved] (CASSANDRA-4446) nodetool drain sometimes doesn't
mark commitlog fully flushed
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis resolved CASSANDRA-4446.
---------------------------------------
Resolution: Won't Fix
This is going to stand as a known limitation with 1.0.x; so far it looks like it is fixed in latest 1.1.
[jira] [Commented] (CASSANDRA-4446) nodetool drain sometimes
doesn't mark commitlog fully flushed
Posted by "Peter Schuller (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451241#comment-13451241 ]
Peter Schuller commented on CASSANDRA-4446:
-------------------------------------------
In general, nodetool drain never seems to completely eliminate on-startup log replay. I observe this all the time on all clusters. It certainly cuts down the amount of replay done, but either never or fairly seldom eliminates it completely - at least not based on log messages indicating replay.
Never had time to investigate.
[jira] [Comment Edited] (CASSANDRA-4446) nodetool drain sometimes
doesn't mark commitlog fully flushed
Posted by "Omid Aladini (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476213#comment-13476213 ]
Omid Aladini edited comment on CASSANDRA-4446 at 10/18/12 12:52 PM:
--------------------------------------------------------------------
I also experience this every time I drain and restart (on versions before 1.1.6, but no longer on 1.1.6 itself), and I get this message in the log:
{quote}
2012-10-12_15:50:36.92191 INFO 15:50:36,921 Log replay complete, N replayed mutations
{quote}
with N being non-zero. I wonder if this is a cause of double-counts for Counter mutations.
was (Author: omid):
I also experience this every time I drain / restart (up until latest 1.1.6) and getting this message in log:
{quote}
2012-10-12_15:50:36.92191 INFO 15:50:36,921 Log replay complete, N replayed mutations
{quote}
with N being non-zero. I wonder if this is a cause of double-counts for Counter mutations.