Posted to commits@cassandra.apache.org by "Sylvain Lebresne (Updated) (JIRA)" <ji...@apache.org> on 2012/02/14 12:46:59 UTC

[jira] [Updated] (CASSANDRA-3903) Intermittent unexpected errors: possibly race condition around CQL parser?

     [ https://issues.apache.org/jira/browse/CASSANDRA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-3903:
----------------------------------------

    Attachment: 0002-Fix-fixCFMaxId.patch
                0001-Fix-CFS.all-thread-safety.patch

For
{noformat}
java.lang.ArrayIndexOutOfBoundsException: 7
        at org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:1520)
        at org.apache.cassandra.thrift.ThriftValidation.validateCfDef(ThriftValidation.java:634)
{noformat}
that's because CFS.all() is not thread-safe: a new keyspace can be added between the allocation of the array and the addition of the column family stores. Attaching a patch that uses an ArrayList instead of a plain array so that it can grow when that happens. (It still uses the same "estimate" for the initial size of the ArrayList, since 99% of the time this will be the right size, but the point is that it no longer crashes if there is a concurrent modification.)
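
To make the race concrete, here is a minimal sketch of the pattern described above. It is not the actual ColumnFamilyStore code or the attached patch: the class AllStoresSketch, its methods, and the concurrentChange hook (which stands in for another thread adding a keyspace between sizing and filling) are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the pattern described above, not the real ColumnFamilyStore code.
final class AllStoresSketch {

    // Counts the stores currently registered; this is the "estimate" used for sizing.
    static int countStores(Map<String, List<String>> keyspaces) {
        int n = 0;
        for (List<String> stores : keyspaces.values())
            n += stores.size();
        return n;
    }

    // Array variant: sized from a count taken before the copy loop runs.
    // concurrentChange simulates another thread modifying the schema in between.
    static String[] allWithArray(Map<String, List<String>> keyspaces, Runnable concurrentChange) {
        String[] result = new String[countStores(keyspaces)];
        concurrentChange.run();
        int i = 0;
        for (List<String> stores : keyspaces.values())
            for (String store : stores)
                result[i++] = store; // overflows if the schema grew after sizing
        return result;
    }

    // List variant: same initial estimate, but the list grows if the estimate is stale.
    static List<String> allWithList(Map<String, List<String>> keyspaces, Runnable concurrentChange) {
        List<String> result = new ArrayList<>(countStores(keyspaces));
        concurrentChange.run();
        for (List<String> stores : keyspaces.values())
            result.addAll(stores);
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> keyspaces = new ConcurrentHashMap<>();
        keyspaces.put("ks1", Arrays.asList("cf1", "cf2"));
        Runnable addKeyspace = () -> keyspaces.put("ks2", Arrays.asList("cf3"));

        // The list variant tolerates the keyspace that appears after sizing.
        System.out.println(allWithList(keyspaces, addKeyspace));

        try {
            keyspaces.remove("ks2"); // reset so the estimate is stale again
            allWithArray(keyspaces, addKeyspace);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("array variant failed: " + e);
        }
    }
}
```

The hook makes the interleaving deterministic; in a real server the same window is only hit occasionally, which would be consistent with the intermittent failures reported.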

For
{noformat}
java.lang.IllegalArgumentException: value already present: 1558
        at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
        at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
        at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
        at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
        at org.apache.cassandra.config.Schema.load(Schema.java:392)
{noformat}
I believe that's because Schema.fixCFMaxId() may reset cfIdGen to a smaller value, since it doesn't check the current value of cfIdGen before setting it; that would let an id that is already in use be generated again. Patch attached for that too.
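
As a rough illustration of that hypothesis, the sketch below shows how an unconditional reset of an id generator can hand out a duplicate id, and how a reset that only ever moves forward avoids it. This is not the real Schema class or the attached patch: CfIdGenSketch, nextId, fixMaxIdUnchecked and fixMaxIdChecked are hypothetical names, and the id values are chosen only to echo the stack trace above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical id generator, not the real org.apache.cassandra.config.Schema.
final class CfIdGenSketch {
    private final AtomicInteger cfIdGen;

    CfIdGenSketch(int firstId) {
        cfIdGen = new AtomicInteger(firstId);
    }

    // Hands out the next id and advances the generator.
    int nextId() {
        return cfIdGen.getAndIncrement();
    }

    // Unchecked reset: ignores the current value, so it can move the generator backwards.
    void fixMaxIdUnchecked(int maxLoadedId) {
        cfIdGen.set(maxLoadedId + 1);
    }

    // Checked reset: atomically keeps whichever is larger, the current value or the candidate.
    void fixMaxIdChecked(int maxLoadedId) {
        cfIdGen.accumulateAndGet(maxLoadedId + 1, Math::max);
    }

    public static void main(String[] args) {
        CfIdGenSketch unchecked = new CfIdGenSketch(1558);
        int first = unchecked.nextId();        // id handed to a new column family
        unchecked.fixMaxIdUnchecked(1557);     // stale view: highest loaded id is still 1557
        int second = unchecked.nextId();       // same id handed out again
        System.out.println("unchecked: " + first + ", " + second);

        CfIdGenSketch checked = new CfIdGenSketch(1558);
        int a = checked.nextId();
        checked.fixMaxIdChecked(1557);         // no effect: generator is already past 1558
        int b = checked.nextId();
        System.out.println("checked: " + a + ", " + b);
    }
}
```

A duplicate id like the one from the unchecked variant is the kind of value a HashBiMap rejects on put(), which matches the "value already present" message in the trace.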

Not sure what's wrong with the ArrayIndexOutOfBoundsException without a stack trace though.

I'm also not sure at all that this will fix the 'no viable alternative at input' error, as I don't think either of those errors should trigger it (if only because both happen after the parsing). I'm not sure how we could have a race in the parser at all, since in theory the parsing of each request should be fully single-threaded.

                
> Intermittent unexpected errors: possibly race condition around CQL parser?
> --------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3903
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3903
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1.0
>         Environment: Mac OS X 10.7 with Sun/Oracle Java 1.6.0_29
> Debian GNU/Linux 6.0.3 (squeeze) with Sun/Oracle Java 1.6.0_26
> several recent commits on cassandra-1.1 branch. at least:
> 0183dc0b36e684082832de43a21b3dc0a9716d48, 3eefbac133c838db46faa6a91ba1f114192557ae, 9a842c7b317e6f1e6e156ccb531e34bb769c979f
> Running cassandra under ccm with one node
>            Reporter: paul cannon
>         Attachments: 0001-Fix-CFS.all-thread-safety.patch, 0002-Fix-fixCFMaxId.patch
>
>
> When running multiple simultaneous instances of the test_cql.py piece of the python-cql test suite, I can reliably reproduce intermittent and unpredictable errors in the tests.
> The failures often occur at the point of keyspace creation during test setup, with a CQL statement of the form:
> {code}
>         CREATE KEYSPACE 'asnvzpot' WITH strategy_class = SimpleStrategy
>             AND strategy_options:replication_factor = 1
>     
> {code}
> An InvalidRequestException is returned to the cql driver, which re-raises it as a cql.ProgrammingError. The message:
> {code}
> ProgrammingError: Bad Request: line 2:24 no viable alternative at input 'asnvzpot'
> {code}
> In a few cases, Cassandra threw an ArrayIndexOutOfBoundsException and this traceback, closing the thrift connection:
> {code}
> ERROR [Thrift:244] 2012-02-10 15:51:46,815 CustomTThreadPoolServer.java (line 205) Error occurred during processing of message.
> java.lang.ArrayIndexOutOfBoundsException: 7
>         at org.apache.cassandra.db.ColumnFamilyStore.all(ColumnFamilyStore.java:1520)
>         at org.apache.cassandra.thrift.ThriftValidation.validateCfDef(ThriftValidation.java:634)
>         at org.apache.cassandra.cql.QueryProcessor.processStatement(QueryProcessor.java:744)
>         at org.apache.cassandra.cql.QueryProcessor.process(QueryProcessor.java:898)
>         at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1245)
>         at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3458)
>         at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3446)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:680)
> {code}
> Sometimes I see an ArrayIndexOutOfBoundsException with no traceback:
> {code}
> ERROR [Thrift:858] 2012-02-13 12:04:01,537 CustomTThreadPoolServer.java (line 205) Error occurred during processing of message.
> java.lang.ArrayIndexOutOfBoundsException
> {code}
> Sometimes I get this:
> {code}
> ERROR [MigrationStage:1] 2012-02-13 12:04:46,077 AbstractCassandraDaemon.java (line 134) Fatal exception in thread Thread[MigrationStage:1,5,main]
> java.lang.IllegalArgumentException: value already present: 1558
>         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
>         at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
>         at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
>         at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
>         at org.apache.cassandra.config.Schema.load(Schema.java:392)
>         at org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:284)
>         at org.apache.cassandra.db.migration.MigrationHelper.addColumnFamily(MigrationHelper.java:209)
>         at org.apache.cassandra.db.migration.AddColumnFamily.applyImpl(AddColumnFamily.java:49)
>         at org.apache.cassandra.db.migration.Migration.apply(Migration.java:66)
>         at org.apache.cassandra.cql.QueryProcessor$1.call(QueryProcessor.java:334)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}
> Again, around 99% of the instances of this {{CREATE KEYSPACE}} statement work fine, so it's a little hard to git bisect out, but I guess I'll see what I can do.
