You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Vijay (JIRA)" <ji...@apache.org> on 2013/08/23 09:17:52 UTC

[jira] [Commented] (CASSANDRA-5911) Commit logs are not removed after nodetool flush or nodetool drain

    [ https://issues.apache.org/jira/browse/CASSANDRA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748365#comment-13748365 ] 

Vijay commented on CASSANDRA-5911:
----------------------------------

1) Even though the logs show Replaying, we will not replay anything since END_OF_SEGMENT_MARKER is placed in the beginning of the file. 

- We can improve the logging, so we don't print CL which we are skipping after reading first 4 bytes. 

2) Only the active segment is replayed, even if we Flush the CL.... since we have not recycled. 

- One way I can think of to avoid replaying on active segment, with a performance hit is.... to have a metadata file which might hold the info on CF dirty writes if any (similar to CommitLogSegment#cfLastWrite, write for the first write on segment and remove for flush). 
                
> Commit logs are not removed after nodetool flush or nodetool drain
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-5911
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5911
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: J.B. Langston
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 2.0.1
>
>
> Commit logs are not removed after nodetool flush or nodetool drain. This can lead to unnecessary commit log replay during startup.  I've reproduced this on Apache Cassandra 1.2.8.  Usually this isn't much of an issue but on a Solr-indexed column family in DSE, each replayed mutation has to be reindexed which can make startup take a long time (on the order of 20-30 min).
> Reproduction follows:
> {code}
> jblangston:bin jblangston$ ./cassandra > /dev/null
> jblangston:bin jblangston$ ../tools/bin/cassandra-stress -n 20000000 > /dev/null
> jblangston:bin jblangston$ du -h ../commitlog
> 576M	../commitlog
> jblangston:bin jblangston$ nodetool flush
> jblangston:bin jblangston$ du -h ../commitlog
> 576M	../commitlog
> jblangston:bin jblangston$ nodetool drain
> jblangston:bin jblangston$ du -h ../commitlog
> 576M	../commitlog
> jblangston:bin jblangston$ pkill java
> jblangston:bin jblangston$ du -h ../commitlog
> 576M	../commitlog
> jblangston:bin jblangston$ ./cassandra -f | grep Replaying
>  INFO 10:03:42,915 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log, /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
>  INFO 10:03:42,922 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566761.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566762.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566763.log
>  INFO 10:03:43,907 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566764.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566765.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566766.log
>  INFO 10:03:43,908 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566767.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566768.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566769.log
>  INFO 10:03:43,909 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566770.log
>  INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566771.log
>  INFO 10:03:43,910 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566772.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566773.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566774.log
>  INFO 10:03:43,911 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566775.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566776.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566777.log
>  INFO 10:03:43,912 Replaying /opt/apache-cassandra-1.2.8/commitlog/CommitLog-2-1377096566778.log
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira