Posted to commits@cassandra.apache.org by "Aaron Morton (JIRA)" <ji...@apache.org> on 2011/06/27 03:55:47 UTC

[jira] [Created] (CASSANDRA-2829) always flush memtables

always flush memtables
----------------------

                 Key: CASSANDRA-2829
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7.6
            Reporter: Aaron Morton
            Assignee: Aaron Morton
            Priority: Minor


Only dirty memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed.

Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data loaded and a running application working against the cluster. Did a rolling restart and then kicked off a repair. One node filled up the commit log volume with 7GB+ of log data; there were about 20 hours of log files.

{noformat}
$ sudo ls -lah commitlog/
total 6.9G
drwx------ 2 cassandra cassandra  12K 2011-06-24 20:38 .
drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
-rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
-rw------- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308876643288.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
-rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308877711517.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
-rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308879395824.log.header
...
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
-rw-r--r-- 1 cassandra cassandra   36 2011-06-24 20:47 CommitLog-1308946745380.log.header
-rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
-rw-r--r-- 1 cassandra cassandra   44 2011-06-24 20:47 CommitLog-1308947888397.log.header
{noformat}
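
As a rough check on the "about 20 hours" figure, the sketch below assumes that the number in each segment file name is the segment's creation time in epoch milliseconds. That is an assumption, although it is consistent with the dates in the listing. The helper names segment_millis and span_hours are made up for illustration.

```python
import re

# Oldest and newest segment names copied from the listing above.
OLDEST = "CommitLog-1308876643288.log"
NEWEST = "CommitLog-1308947888397.log"


def segment_millis(name):
    """Return the numeric id embedded in a segment (or header) file name.

    Assumption: the id is the segment creation time in epoch milliseconds.
    """
    match = re.fullmatch(r"CommitLog-(\d+)\.log(?:\.header)?", name)
    if match is None:
        raise ValueError("not a commit log segment name: " + name)
    return int(match.group(1))


def span_hours(oldest, newest):
    """Hours elapsed between the creation of two segments."""
    return (segment_millis(newest) - segment_millis(oldest)) / 3_600_000


print(round(span_hours(OLDEST, NEWEST), 1))  # prints 19.8
```

Under that assumption the listing covers a little under 20 hours of segments, which matches the description.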

The user KS has 2 CFs with 60 minute flush times. The system KS had the default setting, which is 24 hours. Will create another ticket to see if these can be reduced or if it's something users should do; in this case it would not have mattered.

I grabbed the log headers and used the tool in CASSANDRA-2828, and most of the segments had the system CFs marked as dirty.

{noformat}
$ bin/logtool dirty /tmp/logs/commitlog/

Not connected to a server, Keyspace and Column Family names are not available.

/tmp/logs/commitlog/CommitLog-1308876643288.log.header
Keyspace Unknown:
	Cf id 0: 444
/tmp/logs/commitlog/CommitLog-1308877711517.log.header
Keyspace Unknown:
	Cf id 1: 68848763
...
/tmp/logs/commitlog/CommitLog-1308944451460.log.header
Keyspace Unknown:
	Cf id 1: 61074
/tmp/logs/commitlog/CommitLog-1308945597471.log.header
Keyspace Unknown:
	Cf id 1000: 43175492
	Cf id 1: 108483
/tmp/logs/commitlog/CommitLog-1308946745380.log.header
Keyspace Unknown:
	Cf id 1000: 239223
	Cf id 1: 172211

/tmp/logs/commitlog/CommitLog-1308947888397.log.header
Keyspace Unknown:
	Cf id 1001: 57595560
	Cf id 1: 816960
	Cf id 1000: 0
{noformat}

CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops.

I was able to repro a case where flushing the CFs did not mark the log segments as obsolete (see the attached unit-test patch). The steps are:

1. Write to cf1 and flush.
2. The current log segment is marked as dirty at the CL position when the flush started (CommitLog.discardCompletedSegmentsInternal()).
3. Do not write to cf1 again.
4. Roll the log; my test does this manually.
5. Write to cf2 and flush.
6. Only cf2 is flushed because it is the only dirty CF. cfs.maybeSwitchMemtable() is not called for cf1, so log segment 1 is still marked as dirty from cf1.

Step 5 is not essential; it just matches what I thought was happening. I thought SystemTable.updateToken() was called, which does not flush, and that this was the last thing that happened.

The expired memtable thread created by Table uses the same cfs.forceFlush(), which is a no-op if the cf and its secondary indexes are clean.
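
The repro steps can be walked through with a toy model. This is only a sketch of the behaviour described in this report, not Cassandra code: Segment, ToyCommitLog and ToyStore are hypothetical names, and positions are simple counters rather than byte offsets.

```python
class Segment:
    """One commit log segment with a toy header: cf id -> dirty position."""

    def __init__(self, name):
        self.name = name
        self.dirty = {}


class ToyCommitLog:
    def __init__(self):
        self.created = 1
        self.segments = [Segment("segment-1")]
        self.position = 0  # next write position in the current segment

    def write(self, cf_id):
        # Mark the cf dirty in the current segment unless it already is.
        self.segments[-1].dirty.setdefault(cf_id, self.position)
        self.position += 1

    def roll(self):
        self.created += 1
        self.segments.append(Segment("segment-%d" % self.created))
        self.position = 0

    def context(self):
        return self.segments[-1], self.position

    def discard_completed(self, cf_id, context):
        flushed_segment, flushed_position = context
        for segment in list(self.segments):
            if segment is flushed_segment:
                # Not marked clean: writes may have raced with the flush,
                # so the cf stays dirty from the flush position onward.
                segment.dirty[cf_id] = flushed_position
                break
            segment.dirty.pop(cf_id, None)
            if not segment.dirty:
                self.segments.remove(segment)


class ToyStore:
    """Stand-in for a column family store with a single memtable."""

    def __init__(self, cf_id, log):
        self.cf_id, self.log, self.memtable_dirty = cf_id, log, False

    def write(self):
        self.log.write(self.cf_id)
        self.memtable_dirty = True

    def force_flush(self):
        if not self.memtable_dirty:
            return False  # clean memtable: no flush, no segment cleanup
        context = self.log.context()
        self.memtable_dirty = False
        self.log.discard_completed(self.cf_id, context)
        return True


def reproduce():
    log = ToyCommitLog()
    cf1, cf2 = ToyStore(1, log), ToyStore(2, log)
    cf1.write()
    cf1.force_flush()  # steps 1-2
    log.roll()         # step 4 (step 3: no further writes to cf1)
    cf2.write()
    cf2.force_flush()  # step 5
    cf1_flushed = cf1.force_flush()  # no-op, cf1's memtable is clean
    return cf1_flushed, {s.name: dict(s.dirty) for s in log.segments}


print(reproduce())
```

In this model segment-1 is never discarded: cf1 still has an entry in its header, and nothing will clear it until cf1 is written to and flushed again.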
    
I think the same problem would exist in 0.8. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2829:
--------------------------------------

    Fix Version/s:     (was: 0.7.8)
                   0.8.2
         Assignee: Jonathan Ellis  (was: Aaron Morton)

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073307#comment-13073307 ] 

Jonathan Ellis commented on CASSANDRA-2829:
-------------------------------------------

This can be kept purely in-memory.  No need to sync anything.  (BTW there is no header per se post CASSANDRA-2419.)

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch

[jira] [Updated] (CASSANDRA-2829) always flush memtables

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2829:
--------------------------------------

             Priority: Major  (was: Minor)
    Affects Version/s:     (was: 0.7.6)
                       0.7.0
        Fix Version/s: 0.7.8

> always flush memtables
> ----------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>             Fix For: 0.7.8
>
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Zhu Han (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073180#comment-13073180 ] 

Zhu Han commented on CASSANDRA-2829:
------------------------------------

{quote}Let's add that information and fix that. {quote}

Does it mean that every time a RowMutation is appended to the log, the log header should be fsynced again? That adds at least one extra disk seek on a critical path.

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch

[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2829:
----------------------------------------

    Attachment: 2829.patch

Houston, we have a problem.

In 0.8, we have a much bigger problem related to the commit log. Turns out we don't even turnOn the isDirty flag on writes. This means that typically if we fill a segment (with writes to different cfs), start a new one, and flush (one cf, say cf1), the previous segment will be removed even though it may be full of dirty writes for cf != cf1.

Attaching a patch that fixes this issue as well as the original issue of this ticket (as it is not really more complicated). It adds two unit tests, one for each issue (both fail in current 0.8). Bumping the priority of this too.
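
The 0.8 scenario described above can be illustrated with a small model. This is a sketch under stated assumptions, not the attached patch: Segment and run are hypothetical names, and the mark_dirty_on_write switch stands in for whether writes set the dirty flag.

```python
class Segment:
    def __init__(self, name):
        self.name = name
        self.dirty = set()  # ids of cfs with unflushed writes in this segment


def run(mark_dirty_on_write):
    """Write cf1 and cf2 into one segment, roll, then flush only cf1.

    Returns the names of the segments that survive the flush.
    """
    old, new = Segment("old"), Segment("new")
    segments = [old]

    for cf in ("cf1", "cf2"):
        # The write itself is not modelled, only its effect on the flag.
        if mark_dirty_on_write:
            old.dirty.add(cf)

    segments.append(new)  # the old segment is full, start a new one

    # Flush cf1: every segment older than the current one is clean for cf1
    # and is removed once no cf is marked dirty in it.
    for segment in list(segments[:-1]):
        segment.dirty.discard("cf1")
        if not segment.dirty:
            segments.remove(segment)
    return [s.name for s in segments]


print(run(mark_dirty_on_write=False))  # prints ['new']
print(run(mark_dirty_on_write=True))   # prints ['old', 'new']
```

With the flag never set, the old segment is removed while it still holds cf2's unflushed writes, so those writes could not be replayed after a crash. With the flag set on writes, the old segment is kept until cf2 is flushed too.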

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch, 2829.patch

[jira] [Commented] (CASSANDRA-2829) always flush memtables

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062653#comment-13062653 ] 

Jonathan Ellis commented on CASSANDRA-2829:
-------------------------------------------

Good detective work finding this!

I'm not sure about the proposed fix, though -- I think this reasoning still applies:
{noformat}
                // we can't just mark the segment where the flush happened clean,
                // since there may have been writes to it between when the flush
                // started and when it finished.
{noformat}

... the memtable may have been clean when the flush started, but we don't block writes while the flush is running, so some may have completed in between (so the CL may have writes for this segment now).

(I don't have a better fix yet, this is a tough one.)
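
To make the race concrete, here is a minimal sketch of one way to tell the two cases apart: compare the position of the cf's last write in the segment with the position captured when the flush started. This is an illustration only and not necessarily what the attached patches do; replay_needed and the position values are hypothetical.

```python
def replay_needed(last_write_pos, flush_start_pos):
    """Return True if the segment still holds writes the flush did not cover.

    last_write_pos: position of the cf's most recent write in the segment,
        or None if the cf never wrote to it.
    flush_start_pos: commit log position captured when the flush started.
    """
    return last_write_pos is not None and last_write_pos >= flush_start_pos


# Case 1: a write arrives while the flush is running.
writes = [0, 100]    # writes to the cf before the flush
flush_start = 200    # position captured at flush start
writes.append(200)   # a write that raced with the flush
raced = replay_needed(max(writes), flush_start)

# Case 2: the cf is never written to again after the flush starts.
quiet = replay_needed(100, flush_start)

print(raced, quiet)  # prints True False
```

In the first case the segment has to stay dirty for the cf, which is the concern raised in the code comment quoted above. In the second case nothing was written at or after the flush position, so the segment could be treated as clean for that cf.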

> always flush memtables
> ----------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.6
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>            Priority: Minor
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch
>
>


[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Aaron Morton (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Morton updated CASSANDRA-2829:
------------------------------------

    Attachment: 0002-2829-v08.patch
                0001-2829-unit-test-v08.patch

I took another look at this tonight on the 0.8 trunk and ported the unit test to 0.8. 

The 0002-2829-v08 patch was my second attempt. It changes CFS.forceFlush() to always flush and trusts that maybeSwitchMemtable() will only flush non-clean CFs. 

There are no changes to CommitLog.discardCompletedSegmentsInternal(). The CF will be turned off in any segment that is not the context segment, and it will always be turned on in the current / context segment. I think this gives the correct behaviour, i.e. the CF can never have dirty changes in a segment that is not current, and the CF may have changes in the segment that is current. It is a bit sloppy though, as clean CFs will mark segments as dirty, which may delay them being cleaned up. 
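
The behaviour described above can be modelled with a small Java sketch. Everything in it is hypothetical: DiscardSketch, Segment, roll(), write() and discardCompleted() are illustrative names that do not mirror the real CommitLog code. On each flush the CF is turned off in every segment before the context segment and turned on in the context segment. The main method replays the reproduction steps from the issue description.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical model, not Cassandra code.
class DiscardSketch {
    static class Segment {
        final Set<Integer> dirtyCfs = new HashSet<>();
    }

    final Deque<Segment> segments = new ArrayDeque<>();

    DiscardSketch() {
        roll();
    }

    // Start a new current segment.
    void roll() {
        segments.addLast(new Segment());
    }

    // A write marks the CF dirty in the current segment.
    void write(int cfId) {
        segments.getLast().dirtyCfs.add(cfId);
    }

    // Called when cfId is flushed; the context is the current segment.
    void discardCompleted(int cfId) {
        Segment context = segments.getLast();
        Iterator<Segment> it = segments.iterator();
        while (it.hasNext()) {
            Segment s = it.next();
            if (s == context) {
                s.dirtyCfs.add(cfId);      // always turned on in the context segment
                break;
            }
            s.dirtyCfs.remove(cfId);       // turned off in older segments
            if (s.dirtyCfs.isEmpty()) {
                it.remove();               // nothing dirty left, segment can go
            }
        }
    }

    public static void main(String[] args) {
        DiscardSketch log = new DiscardSketch();
        log.write(1);
        log.discardCompleted(1);   // steps 1-2: cf1 flushed, segment 1 stays dirty for cf1
        log.roll();                // step 4: roll the log
        log.write(2);
        log.discardCompleted(2);   // steps 5-6: only the dirty CF is flushed
        System.out.println(log.segments.size()); // 2, segment 1 is stuck

        log.discardCompleted(1);   // "always flush": cf1 is flushed even though it is clean
        System.out.println(log.segments.size()); // 1, segment 1 is released
        // The current segment is now marked dirty for cf1 despite no new writes to it.
        System.out.println(log.segments.getLast().dirtyCfs.contains(1)); // true
    }
}
```

The last line shows the sloppiness mentioned above: a clean CF ends up marked dirty in the current segment.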


I also think there is a theoretical risk of a race condition with access to the segments Deque. The iterator runs in the postFlushExecutor, while segments are added on the appropriate commit log executor service.



> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch
>
>


[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2829:
----------------------------------------

    Priority: Critical  (was: Major)

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch, 2829.patch
>
>


[jira] [Updated] (CASSANDRA-2829) always flush memtables

Posted by "Aaron Morton (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Morton updated CASSANDRA-2829:
------------------------------------

    Attachment: 0001-2829-unit-test.patch
                0002-2829.patch

2829-unit-test contains the unit test for the problem. 2829 is the fix. 

> always flush memtables
> ----------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.6
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>            Priority: Minor
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch
>
>


[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073574#comment-13073574 ] 

Hudson commented on CASSANDRA-2829:
-----------------------------------

Integrated in Cassandra-0.8 #248 (See [https://builds.apache.org/job/Cassandra-0.8/248/])
    fix bug where dirty commit logs were removed (and avoid keeping segment with no post-flush activity permanently dirty)
patch by slebresne; reviewed by jbellis for CASSANDRA-2829

slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1152793
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/db/CommitLogTest.java
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/commitlog/CommitLog.java


> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Sylvain Lebresne
>            Priority: Critical
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch, 2829.patch
>
>


[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Zhu Han (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073320#comment-13073320 ] 

Zhu Han commented on CASSANDRA-2829:
------------------------------------

{quote}This can be kept purely in-memory{quote}

OK. So these log segments might not be ignored during log replay. Maybe not a problem at all.

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch
>
>


[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13073539#comment-13073539 ] 

Jonathan Ellis commented on CASSANDRA-2829:
-------------------------------------------

+1

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>            Priority: Critical
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch, 2829.patch
>
>
> Only dirty Memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed.  
> Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre 0.7 data loaded and a running application working against the cluster. Did a rolling restart and then kicked off a repair. One node filled up the commit log volume with 7GB+ of log data; there were about 20 hours of log files. 
> {noformat}
> $ sudo ls -lah commitlog/
> total 6.9G
> drwx------ 2 cassandra cassandra  12K 2011-06-24 20:38 .
> drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
> -rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
> -rw------- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308876643288.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308877711517.log.header
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
> -rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308879395824.log.header
> ...
> -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
> -rw-r--r-- 1 cassandra cassandra   36 2011-06-24 20:47 CommitLog-1308946745380.log.header
> -rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
> -rw-r--r-- 1 cassandra cassandra   44 2011-06-24 20:47 CommitLog-1308947888397.log.header
> {noformat}
> The user KS has 2 CF's with 60 minute flush times. System KS had the default settings which is 24 hours. Will create another ticket see if these can be reduced or if it's something users should do, in this case it would not have mattered. 
> I grabbed the log headers and used the tool in CASSANDRA-2828 and most of the segments had the system CF's marked as dirty.
> {noformat}
> $ bin/logtool dirty /tmp/logs/commitlog/
> Not connected to a server, Keyspace and Column Family names are not available.
> /tmp/logs/commitlog/CommitLog-1308876643288.log.header
> Keyspace Unknown:
> 	Cf id 0: 444
> /tmp/logs/commitlog/CommitLog-1308877711517.log.header
> Keyspace Unknown:
> 	Cf id 1: 68848763
> ...
> /tmp/logs/commitlog/CommitLog-1308944451460.log.header
> Keyspace Unknown:
> 	Cf id 1: 61074
> /tmp/logs/commitlog/CommitLog-1308945597471.log.header
> Keyspace Unknown:
> 	Cf id 1000: 43175492
> 	Cf id 1: 108483
> /tmp/logs/commitlog/CommitLog-1308946745380.log.header
> Keyspace Unknown:
> 	Cf id 1000: 239223
> 	Cf id 1: 172211
> /tmp/logs/commitlog/CommitLog-1308947888397.log.header
> Keyspace Unknown:
> 	Cf id 1001: 57595560
> 	Cf id 1: 816960
> 	Cf id 1000: 0
> {noformat}
> CF 0 is the Status / LocationInfo CF and 1 is the HintedHandof CF. I dont have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops. 
> I was able to repo a case where flushing the CF's did not mark the log segments as obsolete (attached unit-test patch). Steps are:
> 1. Write to cf1 and flush.
> 2. Current log segment is marked as dirty at the CL position when the flush started, CommitLog.discardCompletedSegmentsInternal()
> 3. Do not write to cf1 again.
> 4. Roll the log, my test does this manually. 
> 5. Write to CF2 and flush.
> 6. Only CF2 is flushed because it is the only dirty CF. cfs.maybeSwitchMemtable() is not called for cf1 and so log segment 1 is still marked as dirty from cf1.
> Step 5 is not essential, just matched what I thought was happening. I thought SystemTable.updateToken() was called which does not flush, and this was the last thing that happened.  
> The expired memtable thread created by Table uses the same cfs.forceFlush() which is a no-op if the cf or it's secondary indexes are clean. 
>     
> I think the same problem would exist in 0.8. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069011#comment-13069011 ] 

Jonathan Ellis commented on CASSANDRA-2829:
-------------------------------------------

bq. I also think there is a theoretical risk of a race condition with access to the segments Deque. The iterator runs in the postFlushExecutor

discardCompletedSegments actually does the real work in a task on the CL executor. Unless that's not what you're thinking of, I think we're ok here.
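
As a rough illustration of why routing the work through one executor thread avoids that kind of race, here is a minimal sketch. The class and method names are hypothetical and this is not the Cassandra code; it only assumes that every read and write of the segments Deque is submitted to the same single-threaded executor.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model: the deque is only ever touched from the executor thread,
// so it needs no locking even though callers run on other threads.
class SerializedSegments
{
    private final Deque<String> segments = new ArrayDeque<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Called from any thread; the mutation itself runs on the executor thread.
    void add(String name)
    {
        executor.execute(() -> segments.add(name));
    }

    // Called from any thread (e.g. a post-flush thread); the iteration and
    // removal run on the executor thread, after all previously queued tasks.
    int discardOlderThan(String keep)
    {
        Future<Integer> result = executor.submit(() -> {
            int removed = 0;
            while (!segments.isEmpty() && !segments.peekFirst().equals(keep))
            {
                segments.removeFirst();
                removed++;
            }
            return removed;
        });
        try
        {
            return result.get();
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }

    void shutdown()
    {
        executor.shutdown();
    }
}
```

The caller only schedules a task; the task that actually walks and mutates the Deque is queued behind any earlier writes on the same thread.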

bq. It changes CFS.forceFlush() to always flush and trusts maybeSwitchMemtable() will only flush non clean CF's

Hmm.  Interesting.

Part of me thinks it can't be that simple but I don't see a problem with it. :)

Sylvain?


> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch

        

[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068931#comment-13068931 ] 

Sylvain Lebresne commented on CASSANDRA-2829:
---------------------------------------------

bq. It feels like we need to add a "most recent write at" information as well as the "oldest write/replay position at" one. This would not need to be persisted to disk.

Agreed, I think this is the right fix too.

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.2
>
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

        

[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2829:
--------------------------------------

    Affects Version/s:     (was: 0.7.0)
              Summary: memtable with no post-flush activity can leave commitlog permanently dirty   (was: always flush memtables)

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>             Fix For: 0.7.8
>
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

        

[jira] [Commented] (CASSANDRA-2829) always flush memtables

Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064643#comment-13064643 ] 

Jonathan Ellis commented on CASSANDRA-2829:
-------------------------------------------

Thinking out loud here.

CLHeader has a structure to tell us "do we need to replay, and if so, from where?"

{code}
    private Map<Integer, Integer> cfDirtiedAt; // position at which each CF was last flushed

    boolean isDirty(Integer cfId)
    {
        return cfDirtiedAt.containsKey(cfId);
    }
{code}

This is set in two places.  One is during a write:

{code}
                    if (!header.isDirty(id))
                    {
                        header.turnOn(id, logWriter.getFilePointer());
                        writeHeader();
                    }
{code}

The other is post-flush, as described above:

{code}
            if (segment.equals(context.getSegment()))
            {
                // we can't just mark the segment where the flush happened clean,
                // since there may have been writes to it between when the flush
                // started and when it finished. so mark the flush position as
                // the replay point for this CF, instead.
                if (logger.isDebugEnabled())
                    logger.debug("Marking replay position " + context.position + " on commit log " + segment);
                header.turnOn(id, context.position);
                segment.writeHeader();
                break;
            }
{code}

It feels like we need to add a "most recent write at" information as well as the "oldest write/replay position at" one.  This would not need to be persisted to disk.

(I thought that this is what 0.6 did, but looking at it that is not the case.  So this bug is present there as well, but I think at this point it just needs to be a known bug there.  Maybe even for 0.7.)
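
To make the behaviour concrete, here is a small self-contained Java model of it. This is only a sketch: CommitLogModel and Segment are hypothetical names, not the Cassandra classes, and it assumes nothing beyond the two rules quoted above (a write marks the CF dirty in the current segment, and a completed flush cleans older segments but re-marks the flush segment at the context position).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the dirty tracking discussed in this ticket.
class CommitLogModel
{
    static class Segment
    {
        // cfId -> position from which replay would be needed
        final Map<Integer, Integer> cfDirtiedAt = new HashMap<>();
        int length = 0;
    }

    final List<Segment> segments = new ArrayList<>();

    CommitLogModel()
    {
        roll();
    }

    Segment current()
    {
        return segments.get(segments.size() - 1);
    }

    void roll()
    {
        segments.add(new Segment());
    }

    // Write path: mark the CF dirty the first time it is written to this segment.
    void write(int cfId, int size)
    {
        Segment segment = current();
        if (!segment.cfDirtiedAt.containsKey(cfId))
            segment.cfDirtiedAt.put(cfId, segment.length);
        segment.length += size;
    }

    // Post-flush path: clean older segments, re-mark the flush segment.
    void discardCompleted(int cfId, Segment contextSegment, int contextPosition)
    {
        Iterator<Segment> iter = segments.iterator();
        while (iter.hasNext())
        {
            Segment segment = iter.next();
            if (segment == contextSegment)
            {
                // Writes may have arrived after the flush started, so the
                // segment is kept dirty from the flush position onwards.
                segment.cfDirtiedAt.put(cfId, contextPosition);
                break;
            }
            segment.cfDirtiedAt.remove(cfId);
            if (segment.cfDirtiedAt.isEmpty())
                iter.remove(); // stands in for deleting the segment file
        }
    }
}
```

Following the reproduction steps from the description with this model (write to CF 1, flush it, roll, then write to and flush CF 2) leaves the first segment in the list, still marked dirty for CF 1, even though all of CF 1's data was flushed.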

> always flush memtables
> ----------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7.0
>            Reporter: Aaron Morton
>            Assignee: Aaron Morton
>             Fix For: 0.7.8
>
>         Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

        

[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty

Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069070#comment-13069070 ] 

Sylvain Lebresne commented on CASSANDRA-2829:
---------------------------------------------

I think this kind of works, in that we won't keep commit logs forever, but it still keeps commit logs for much longer than necessary because:
# it relies on forceFlush being called, which, unless client-triggered, will only happen after the memtable expires, and quite a lot of commit log could pile up during that time. Quite potentially enough to be a problem (if the commit logs fill up your hard drive, it doesn't matter much that "it would have been deleted in 5 hours"). I think we can do much better with not too much effort.
# when we do flush the expired memtable, we'll call maybeSwitchMemtable() with potentially clean memtables. This doesn't sound like a good use of resources: we'll grab the write lock, create a latch, create a new memtable, increment the memtable switch number, and push an almost no-op job on the flush executor.

I think we should fix the real problem. The problem is that when we discard segments, we always keep the current segment dirty because we don't know if there were any writes since we grabbed the context. Let's add that information and fix that. This would get commit logs deleted much more quickly, even if we don't consider the corner case of a column family that suddenly has no writes anymore, because CFs like the system ones, which have very low update volume, can retain the logs longer than is really needed.

As for the fix, because the CL executor is single-threaded, this is fairly easy: let's have an in-memory map of cfId->lastPositionWritten, and compare that to the context position in discardCompletedSegmentsInternal (we could probably even just use a set of cfIds that would mean: dirty since last getContext).
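
A minimal sketch of that idea, for illustration only: SegmentDirtyTracker and its methods are hypothetical names rather than the actual patch, and it assumes positions are the start offsets of writes within a single segment and that all calls come from the one CL executor thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for one segment's header plus the proposed in-memory map.
class SegmentDirtyTracker
{
    // cfId -> position from which replay must start (the part persisted in the header)
    private final Map<Integer, Integer> cfDirtiedAt = new HashMap<>();
    // cfId -> start position of the most recent write (in memory only)
    private final Map<Integer, Integer> lastPositionWritten = new HashMap<>();

    void recordWrite(int cfId, int position)
    {
        if (!cfDirtiedAt.containsKey(cfId))
            cfDirtiedAt.put(cfId, position);
        lastPositionWritten.put(cfId, position);
    }

    // Called when a flush of cfId completes; contextPosition is the log
    // position captured when that flush started.
    void onFlushComplete(int cfId, int contextPosition)
    {
        Integer last = lastPositionWritten.get(cfId);
        if (last == null || last < contextPosition)
        {
            // No write since the context was taken: everything is flushed.
            cfDirtiedAt.remove(cfId);
            lastPositionWritten.remove(cfId);
        }
        else
        {
            // Writes arrived during the flush: replay from the context position.
            cfDirtiedAt.put(cfId, contextPosition);
        }
    }

    boolean isDirty(int cfId)
    {
        return cfDirtiedAt.containsKey(cfId);
    }

    boolean isSafeToDelete()
    {
        return cfDirtiedAt.isEmpty();
    }
}
```

With this comparison, a CF that receives no writes after its flush context is taken is marked clean immediately, so the segment no longer has to wait for further activity on that CF.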

> memtable with no post-flush activity can leave commitlog permanently dirty 
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2829
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Aaron Morton
>            Assignee: Jonathan Ellis
>             Fix For: 0.8.3
>
>         Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch