Posted to commits@cassandra.apache.org by "Cathy Daw (JIRA)" <ji...@apache.org> on 2011/07/23 02:59:09 UTC
[jira] [Created] (CASSANDRA-2942) If you drop a CF when one node is
down the files are orphaned on the downed node
If you drop a CF when one node is down the files are orphaned on the downed node
--------------------------------------------------------------------------------
Key: CASSANDRA-2942
URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
Project: Cassandra
Issue Type: Bug
Reporter: Cathy Daw
Priority: Minor
* Bring up 3 node cluster
* From node1: Run Stress Tool
{code} stress --num-keys=10 --columns=10 --consistency-level=ALL --average-size-values --replication-factor=3 --nodes=node1,node2 {code}
* Shutdown node3
* From node1: drop the Standard1 CF in Keyspace1
* Shutdown node2 and node3
* Bring up node1 and node2. Check that the Standard1 files are gone.
{code}
ls -al /var/lib/cassandra/data/Keyspace1/
{code}
* Bring up node3. The log file shows the drop column family occurs
{code}
INFO 00:51:25,742 Applying migration 9a76f880-b4c5-11e0-0000-8901a7c5c9ce Drop column family: Keyspace1.Standard1
{code}
* Restart node3 to clear out dropped tables from the filesystem
{code}
root@cathy3:~/cass-0.8/bin# ls -al /var/lib/cassandra/data/Keyspace1/
total 36
drwxr-xr-x 3 root root 4096 Jul 23 00:51 .
drwxr-xr-x 6 root root 4096 Jul 23 00:48 ..
-rw-r--r-- 1 root root 0 Jul 23 00:51 Standard1-g-1-Compacted
-rw-r--r-- 2 root root 5770 Jul 23 00:51 Standard1-g-1-Data.db
-rw-r--r-- 2 root root 32 Jul 23 00:51 Standard1-g-1-Filter.db
-rw-r--r-- 2 root root 120 Jul 23 00:51 Standard1-g-1-Index.db
-rw-r--r-- 2 root root 4276 Jul 23 00:51 Standard1-g-1-Statistics.db
drwxr-xr-x 3 root root 4096 Jul 23 00:51 snapshots
{code}
*Bug: The files for Standard1 are orphaned on node3*
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2942) If you drop a CF when one node
is down the files are orphaned on the downed node
Posted by "Hudson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092955#comment-13092955 ]
Hudson commented on CASSANDRA-2942:
-----------------------------------
Integrated in Cassandra #1054 (See [https://builds.apache.org/job/Cassandra/1054/])
reduce window where dropped CF sstables may not be deleted
patch by jbellis; reviewed by slebresne for CASSANDRA-2942
jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1162849
Files :
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/DeletionService.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/util/FileUtils.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
> If you drop a CF when one node is down the files are orphaned on the downed node
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Cathy Daw
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 1.0
>
> Attachments: 2942.txt
[jira] [Commented] (CASSANDRA-2942) If you drop a CF when one node
is down the files are orphaned on the downed node
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089239#comment-13089239 ]
Jonathan Ellis commented on CASSANDRA-2942:
-------------------------------------------
bq. May be fixed by CASSANDRA-2521
2521 makes it substantially better, but there's still a window where you can miss deletes.
The "real" fix is to commitlog-ify schema changes, but that's outside our scope for the foreseeable future.
Adding "wait for outstanding SSTableDeletingTasks" to our jvm shutdown hook would be almost as good.
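A minimal sketch of the shutdown-hook idea, not the actual patch: on JVM shutdown, stop accepting new work on the executor that runs deletion tasks and wait a bounded time for the queued ones to finish. `DeletionDrain`, `drain`, `installShutdownHook`, and the 30-second timeout are hypothetical names and values chosen for illustration; the `ExecutorService` stands in for whatever actually runs the SSTable deletion tasks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

final class DeletionDrain {
    /**
     * Stops accepting new tasks and waits up to the given timeout for
     * already-submitted tasks to finish. Returns true if all tasks
     * completed, false if the wait timed out or was interrupted.
     */
    static boolean drain(ExecutorService tasks, long timeout, TimeUnit unit) {
        tasks.shutdown(); // queued tasks still run; new submissions are rejected
        try {
            boolean finished = tasks.awaitTermination(timeout, unit);
            if (!finished) {
                // Files whose deletion did not run stay on disk until a later cleanup.
                System.err.println("INFO: deletion tasks still pending at shutdown");
            }
            return finished;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    /** Registers a JVM shutdown hook that drains the executor (timeout is an assumption). */
    static void installShutdownHook(final ExecutorService tasks) {
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> drain(tasks, 30, TimeUnit.SECONDS)));
    }
}
```

This only narrows the window: a hard kill or a timeout can still leave files behind, which is why it is "almost as good" rather than a complete fix.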
> If you drop a CF when one node is down the files are orphaned on the downed node
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Reporter: Cathy Daw
> Assignee: Sylvain Lebresne
> Priority: Minor
[jira] [Assigned] (CASSANDRA-2942) If you drop a CF when one node
is down the files are orphaned on the downed node
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis reassigned CASSANDRA-2942:
-----------------------------------------
Assignee: Sylvain Lebresne
You should be able to reproduce this even on a single node -- just drop a CF, then restart. It only cleans out marked-for-delete files from known CFs.
May be fixed by CASSANDRA-2521. Otherwise we can add "go ahead and clear out marked-for-delete files, even if they don't belong to an active CF" logic.
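The fallback logic could look roughly like the following sketch. It assumes, based on the directory listing in the report, that a zero-length `<name>-Compacted` file marks the sstable components sharing its `<name>-` prefix as deletable. `OrphanSweep` and `sweep` are hypothetical names, not Cassandra APIs, and the scan deliberately ignores whether the column family is still defined.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class OrphanSweep {
    // Assumed marker suffix, taken from "Standard1-g-1-Compacted" in the listing.
    static final String MARKER_SUFFIX = "-Compacted";

    /**
     * Deletes every file in dataDir that shares a prefix with a marker file,
     * regardless of which column family it belongs to. Returns the names of
     * the deleted files.
     */
    static List<String> sweep(Path dataDir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(dataDir)) {
            for (Path p : ds) {
                if (Files.isRegularFile(p)) files.add(p); // skips e.g. snapshots/
            }
        }
        List<String> deleted = new ArrayList<>();
        for (Path marker : files) {
            String markerName = marker.getFileName().toString();
            if (!markerName.endsWith(MARKER_SUFFIX)) continue;
            // "Standard1-g-1-Compacted" -> "Standard1-g-1-"; the trailing '-'
            // keeps generation 1 from matching generation 10.
            String prefix = markerName.substring(
                0, markerName.length() - MARKER_SUFFIX.length()) + "-";
            for (Path f : files) {
                String name = f.getFileName().toString();
                if (name.startsWith(prefix) && !name.equals(markerName)
                        && Files.deleteIfExists(f)) {
                    deleted.add(name);
                }
            }
            // Remove the marker last so an interrupted sweep can be retried.
            if (Files.deleteIfExists(marker)) deleted.add(markerName);
        }
        return deleted;
    }
}
```

Deleting the marker only after its components means a crash partway through leaves the marker in place, so the next startup can finish the job.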
> If you drop a CF when one node is down the files are orphaned on the downed node
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Reporter: Cathy Daw
> Assignee: Sylvain Lebresne
> Priority: Minor
[jira] [Updated] (CASSANDRA-2942) Dropped columnfamilies can leave
orphaned data files that do not get cleared on restart
Posted by "Jonathan Ellis (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2942:
--------------------------------------
Summary: Dropped columnfamilies can leave orphaned data files that do not get cleared on restart (was: If you drop a CF when one node is down the files are orphaned on the downed node)
> Dropped columnfamilies can leave orphaned data files that do not get cleared on restart
> ---------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Cathy Daw
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 1.0.0
>
> Attachments: 2942.txt
[jira] [Updated] (CASSANDRA-2942) If you drop a CF when one node is
down the files are orphaned on the downed node
Posted by "Jonathan Ellis (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-2942:
--------------------------------------
Attachment: 2942.txt
patch to wait for StorageService.tasks on shutdown. Also moves CL segment deletion there.
> If you drop a CF when one node is down the files are orphaned on the downed node
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Cathy Daw
> Assignee: Sylvain Lebresne
> Priority: Minor
> Fix For: 1.0
>
> Attachments: 2942.txt
[jira] [Commented] (CASSANDRA-2942) If you drop a CF when one node
is down the files are orphaned on the downed node
Posted by "Sylvain Lebresne (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/CASSANDRA-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092884#comment-13092884 ]
Sylvain Lebresne commented on CASSANDRA-2942:
---------------------------------------------
nit: we could log an info message when awaitTermination returns false. It also looks like DeletionService could just go away with this.
But otherwise, +1. I agree it is good enough and not worth going for more complicated.
> If you drop a CF when one node is down the files are orphaned on the downed node
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-2942
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2942
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Cathy Daw
> Assignee: Jonathan Ellis
> Priority: Minor
> Fix For: 1.0
>
> Attachments: 2942.txt