Posted to commits@cassandra.apache.org by "PJ (JIRA)" <ji...@apache.org> on 2014/05/03 16:17:16 UTC

[jira] [Created] (CASSANDRA-7145) FileNotFoundException during compaction

PJ created CASSANDRA-7145:
-----------------------------

             Summary: FileNotFoundException during compaction
                 Key: CASSANDRA-7145
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7145
             Project: Cassandra
          Issue Type: Bug
         Environment: CentOS 6.3, Datastax Enterprise 4.0.1 (Cassandra 2.0.5), Java 1.7.0_55
            Reporter: PJ
            Priority: Blocker
         Attachments: compaction - FileNotFoundException.txt, repair - RuntimeException.txt, startup - AssertionError.txt

I can't finish any compaction because my nodes always throw a "FileNotFoundException". I've already tried the following, but nothing helped:

1. nodetool flush
2. nodetool repair (ends with RuntimeException; see attachment)
3. node restart (via dse cassandra-stop)

Somewhere near the end of the startup process, another type of exception is logged (see attachment), but the nodes are still able to finish starting up and eventually come online.

My questions now are:
1. Have I already lost data? I'm in the middle of migrating 4.8 billion rows from MySQL and I'd like to know whether I should abort and start over
2. What caused the sstable files to go missing?
3. How can I proceed with compaction and repair? Obviously, not being able to run them would eventually lead to serious performance and data issues

Related StackOverflow question (mine): http://stackoverflow.com/questions/23435847/filenotfoundexception-during-compaction

Notes:
1. I didn't drop and recreate the keyspace (so probably not related to CASSANDRA-4857)
2. I use sstableloader for the migration. However, since it is designed to wait for the secondary index build to complete before exiting, the overall throughput becomes unacceptable. To work around this, I devised a mechanism that kills the sstableloader process and cancels the secondary index build once the bulk-loading total progress reaches 100%. So far, I've done this more than 100 times
3. There were times when I had to restart the nodes because the OS load reached high levels. It's possible that compactions were in progress when I restarted them
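For context, the kill-at-100% mechanism from note 2 could be sketched roughly as follows. This is an illustrative Python sketch, not the actual script I use: the progress-line pattern (`total: NN%`) is an assumption, since sstableloader's progress output format varies by version, and the function names here are made up for the example.

```python
import re
import signal
import subprocess

# Assumed progress format -- sstableloader's actual output varies by
# version, so adjust this regex to match what your build prints.
TOTAL_RE = re.compile(r"total:\s*(\d+)%")

def total_percent(line):
    """Return the overall progress percentage from a line, or None."""
    m = TOTAL_RE.search(line)
    return int(m.group(1)) if m else None

def load_and_kill_at_100(cmd):
    """Run sstableloader, terminating it once total progress hits 100%.

    Killing the process at this point skips waiting for the secondary
    index build, which is the workaround described in note 2 above.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    for line in proc.stdout:
        pct = total_percent(line)
        if pct is not None and pct >= 100:
            proc.send_signal(signal.SIGTERM)
            break
    proc.wait()
```

Whether interrupting sstableloader this way (and cancelling the index build) can leave sstable references in an inconsistent state is exactly what I'd like to understand, given the FileNotFoundException above.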



--
This message was sent by Atlassian JIRA
(v6.2#6252)