Posted to dev@cassandra.apache.org by Boris Yen <yu...@gmail.com> on 2013/09/02 06:11:55 UTC

Scrub on secondary indexes

Hi,

We are running Cassandra 1.0.12. From time to time, we see log messages like
"*java.io.IOError: java.io.IOException: dataSize of 71530420 starting at
587 would be larger than file {cf name} ...*" in system.log.

If the cf named in the message is not a secondary index, running "scrub"
seems to stop the message from appearing in system.log again. However, when
the cf is a secondary index, there seems to be no way to make this error go
away unless I manually remove the data files from Cassandra.

I tried to patch Cassandra to make it possible to run scrub on secondary
indexes. However, I see the statement "assert !cfs.isIndex();" in
doScrub(). This makes me think that scrub is not intended to be run on
secondary indexes. My question is: why is there such a limitation on
secondary indexes? A secondary index should be implemented like a normal
column family, so running scrub on it should be legitimate.

Any suggestions on how to get rid of the error message?

Thanks and Regards,
Boris