Posted to commits@cassandra.apache.org by "Nicholas Telford (JIRA)" <ji...@apache.org> on 2010/08/19 10:43:17 UTC
[jira] Commented: (CASSANDRA-1409) What should happen when cassandra loses permissions to it's data directory and unable to compact?

    [ https://issues.apache.org/jira/browse/CASSANDRA-1409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900217#action_12900217 ] 

Nicholas Telford commented on CASSANDRA-1409:
---------------------------------------------

In most cases, Cassandra seems to adhere to the "crash early" principle. The "Exception handling" section of http://wiki.apache.org/cassandra/CodeStyle implies this quite heavily by requiring exceptions to be propagated up.

I disagree that a node could be "read-only". I suppose that, provided the commit logs are writable, durability could still be guaranteed, but what happens when the memtable reaches its threshold, as in the stack trace from the issue description?

I think I'd rather have Cassandra crash the second it detects that any of its directories is no longer writable (during writes to the commit log, compaction, memtable flushes etc.) - this sort of issue will never be intentional, and having the node go into a "read-only" mode will only hide the problem and make it harder to discover and debug.
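
To make the proposal concrete, here is a minimal sketch of a fail-fast writability check. It is not Cassandra's actual code: the class name DirectoryGuard and both of its methods are hypothetical, and the choice to probe by creating and deleting a temporary file is an assumption made for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, for illustration only; not part of Cassandra. */
public final class DirectoryGuard {
    private DirectoryGuard() {}

    /**
     * Returns true if a probe file can be created in and deleted from dir.
     * Actually attempting a write catches permission changes that a check
     * made only at startup would miss.
     */
    public static boolean isWritable(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try {
            Path probe = Files.createTempFile(dir, "write-probe", ".tmp");
            Files.delete(probe);
            return true;
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    /**
     * Throws IllegalStateException for the first directory that is not
     * writable, so the failure propagates up instead of being swallowed.
     */
    public static void requireWritable(Path... dirs) {
        for (Path dir : dirs) {
            if (!isWritable(dir)) {
                throw new IllegalStateException("Directory is not writable: " + dir);
            }
        }
    }
}
```

Under this sketch, the write paths (commit log append, memtable flush, compaction) would call requireWritable on their directories when an I/O error occurs and let the resulting exception terminate the process, rather than logging it and carrying on.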

> What should happen when cassandra loses permissions to it's data directory and unable to compact?
> -------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1409
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1409
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ran Tavory
>
> Due to an administrative error, one of the hosts in the cluster lost permission to write to its data directory.
> So I started seeing errors in the log; however, the server continued serving traffic. It wasn't able to compact or perform other write operations, but it didn't crash.
> I was wondering whether that's by design and, if so, whether it is a good design... I guess I want to know if really bad things can happen to my cluster...
> The logs look like this:
>  INFO [FLUSH-TIMER] 2010-08-11 07:53:14,683 ColumnFamilyStore.java (line 357) KvAds has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/outbrain/cassandra/commitlog/CommitLog-1281505164614.log', position=88521163)
>  INFO [FLUSH-TIMER] 2010-08-11 07:53:14,683 ColumnFamilyStore.java (line 609) Enqueuing flush of Memtable(KvAds)@851225759
>  INFO [FLUSH-WRITER-POOL:1] 2010-08-11 07:53:14,684 Memtable.java (line 148) Writing Memtable(KvAds)@851225759
> ERROR [FLUSH-WRITER-POOL:1] 2010-08-11 07:53:14,688 DebuggableThreadPoolExecutor.java (line 94) Error in executor futuretask
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: /outbrain/cassandra/data/outbrain_kvdb/KvAds-tmp-249-Data.db (Permission denied)
>         at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>         at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:86)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /outbrain/cassandra/data/outbrain_kvdb/KvAds-tmp-249-Data.db (Permission denied)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> ... more
> Jonathan Ellis:
> That's a tough call -- you can also come up with scenarios where you'd
> rather have it read-only than completely dead.
> Benjamin Black:
> Useful config option, perhaps?
