Posted to commits@cassandra.apache.org by "Paulo Motta (Jira)" <ji...@apache.org> on 2021/03/21 23:42:00 UTC

[jira] [Updated] (CASSANDRA-15758) ERROR when a disconnected Cassandra node comes back and receives a drop/add column request

     [ https://issues.apache.org/jira/browse/CASSANDRA-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta updated CASSANDRA-15758:
------------------------------------
    Resolution: Invalid
        Status: Resolved  (was: Triage Needed)

> ERROR when a disconnected Cassandra node comes back and receives a drop/add column request
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15758
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: YCozy
>            Priority: Normal
>
> We got the following error while dropping a column from a table:
> {code:java}
> ERROR [MigrationStage:1] 2020-04-24 00:07:54,995 SchemaKeyspace.java:1021 - No partition columns found for table ks_name.tbl_name in system_schema.columns.  This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, restart cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the following query to cleanup: "DELETE FROM system_schema.tables WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name'; DELETE FROM system_schema.columns WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name';" If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
> ERROR [MigrationStage:1] 2020-04-25 15:21:55,716 CassandraDaemon.java:228 - Exception in thread Thread[MigrationStage:1,5,main]
> org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: Columns not found in schema table for ks_name.tbl_name
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1100) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesOnly(SchemaKeyspace.java:951) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1401) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[main/:na]
>         at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51) ~[main/:na]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_242]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [main/:na]
>         at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
> {code}
> We analyzed the logs and came up with the following theory of what happened:
>  # We have a cluster of three nodes (C1, C2, C3).
>  # Right after we start all the nodes, C3 is partitioned away from the others. As a result, neither C1 nor C2 knows that C3 exists.
>  # The user contacts C1 to create a keyspace "ks_name" and a table "tbl_name". C1 and C2 serve the requests. Since they don't know about C3, they believe the schema is consistent across the cluster, and both the keyspace and the table are created without any warning.
>  # The user tries to drop a column from the table (see the CQL sketch after this list). C3 now reconnects and receives the drop-column schema change from C1 (the coordinator). However, C3 knows about neither "ks_name" nor "tbl_name", so it throws the error above.
>  # The same error occurs if the user tries to add a column instead of dropping one.
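> For reference, a minimal CQL reproduction sketch of steps 3-4 might look like the following (the keyspace and table names come from this report; the replication settings and column names are assumptions for illustration):
> {code:sql}
> -- Executed against C1 while C3 is partitioned away (assumed RF and columns):
> CREATE KEYSPACE ks_name
>   WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
>
> CREATE TABLE ks_name.tbl_name (
>   id int PRIMARY KEY,
>   col1 text
> );
>
> -- Executed after C3 reconnects; the schema change is pushed to C3,
> -- which has never seen ks_name/tbl_name and fails as shown above:
> ALTER TABLE ks_name.tbl_name DROP col1;
> {code}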
> Since network partitions are inevitable in deployed clusters, we think Cassandra should handle this scenario more gracefully.
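> As an aside on detection: one way to observe this kind of schema divergence is to compare schema versions across nodes before issuing DDL. A minimal check via the standard system tables might look like the following; any disagreement between the local and peer versions indicates the cluster has not converged:
> {code:sql}
> -- Schema version reported by the node you are connected to:
> SELECT schema_version FROM system.local;
>
> -- Schema versions this node has recorded for its peers:
> SELECT peer, schema_version FROM system.peers;
> {code}
> (nodetool describecluster reports the same information in aggregated form.)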


