You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Thor Carpenter <tc...@mediosystems.com> on 2010/12/28 20:59:56 UTC
0.7RC1 local_quorum -> TimedOutException
Hi All,
I am a bit stumped. When reading at a CL of local_quorum I get TimedOutExceptions however when reading at any other CL, including ALL, everything works as expected. Any write CL, including local_quorum, also works as expected.
I am running a 4 node 0.7 RC1 cluster in datacenter "DC1". "DC2" is planned for the future but currently has no nodes. I am using Hector 0.7.0-20.
Here is my testing schema:
create keyspace GlobalCacheKeyspaceDC1RF4 with
replication_factor = 4 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:4, DC2:0}];
use GlobalCacheKeyspaceDC1RF4;
create column family GlobalCacheCF with
comment = 'General purpose cache' and
column_type = 'Standard' and
comparator = 'BytesType' and
keys_cached = 200000 and
rows_cached = 2000 and
gc_grace = 0 and
read_repair_chance = 0.0;
Here is my cassandra-topology.properties:
10.5.64.15=DC1:R1
10.5.64.28=DC1:R1
10.5.64.56=DC1:R1
10.5.64.39=DC1:R1
# default for unknown nodes
default=DC2:R1
Here is the exception I get when reading at a CL of local_quorum:
1:44:14.176 [main] ERROR m.p.c.connection.HConnectionManager - Could not fullfill request on this host CassandraClient<dlc05-sea:9160-50>
11:44:14.182 [main] ERROR m.p.c.connection.HConnectionManager - Exception:
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:598) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:587) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:101) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getColumn(KeyspaceServiceImpl.java:604) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:57) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:52) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65) [hector-0.7.0-20.jar:na]
at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery.execute(ThriftColumnQuery.java:51) [hector-0.7.0-20.jar:na]
There are no warnings or errors in cassandra.log.
Any guidance would be appreciated.
Thanks,
-Thor
Re: [SOLVED] Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Jonathan Ellis <jb...@gmail.com>.
Hmm, interesting. You're right that it's a bug; those two are
supposed to be equivalent.
On Thu, Dec 30, 2010 at 4:02 PM, Thor Carpenter
<tc...@mediosystems.com> wrote:
> So, I found a solution, but I think there is a (trivial) bug.
>
> The broken keyspace:
>
> create keyspace KeyspaceDC1 with
> replication_factor = 1 and
> placement_strategy =
> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options = [{DC1:1, DC2:0}];
>
> The fixed keyspace:
>
> create keyspace KeyspaceDC1 with
> replication_factor = 1 and
> placement_strategy =
> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options = [{DC1:1}];
>
> I'm not sure if the issue is that there are no nodes in DC2 or that I'm
> saying DC2 shouldn't get any replicas, or a combination of the two. If I
> get time I'll try to poke around the code base.
>
> -Thor
>
> On Dec 29, 2010, at 12:38 PM, Thor Carpenter wrote:
>
> In an attempt to replicate this error on a simpler system I have come across
> a different error but seemingly related problem. Now I am getting the
> following exception in system.log.
>
> DEBUG [pool-1-thread-3] 2010-12-29 12:10:38,897 CassandraServer.java (line
> 362) insert
> ERROR [pool-1-thread-3] 2010-12-29 12:10:38,906 Cassandra.java (line 2960)
> Internal error processing insert
> java.lang.AssertionError
> at
> org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:392)
> at
> org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:417)
> at
> org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:95)
> at
> org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
> at
> org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1411)
> at
> org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1394)
> at
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:109)
> at
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:442)
> at
> org.apache.cassandra.thrift.CassandraServer.insert(CassandraServer.java:379)
> at
> org.apache.cassandra.thrift.Cassandra$Processor$insert.process(Cassandra.java:2952)
> at
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
> at
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
>
> Looking at TokenMetadata.java:392, the assert that is triggered is:
> assert ring.size() > 0;
>
> Which led me to run nodetool ring which outputs a seemingly correct config:
> -sh-3.2$ nodetool -h localhost -p 8080 ring
> Address Status State Load Owns Token
> 10.5.64.26 Up Normal 47.16 KB 100.00% 1
>
>
> To reproduce:
>
> * I have upgraded to 0.7RC3 so install the latest riptano 0.7rc3 rpm on a
> single node in "DC1".
>
> * In cassandra.yaml set initial_token = 1
>
> * cassandra-topology.properties:
>
> 10.5.64.26=DC1:R1
> default=DC2:R1
>
> * Schema loaded via cassandra-cli:
>
> create keyspace KeyspaceDC1 with
> replication_factor = 1 and
> placement_strategy =
> 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options = [{DC1:1, DC2:0}];
>
> use KeyspaceDC1;
>
> create column family TestCF with
> column_type = 'Standard' and
> comparator = 'BytesType' and
> keys_cached = 200000 and
> rows_cached = 2000 and
> gc_grace = 0 and
> read_repair_chance = 0.0;
>
> * In cassandra-cli execute the following:
>
> [default@unknown] use KeyspaceDC1;
> Authenticated to keyspace: KeyspaceDC1
> [default@KeyspaceDC1] set TestCF['some key']['some col'] = 'some value';
> Internal error processing insert
>
> * Check system.log where you should find the above assertion error.
>
>
> Can anybody find a flaw in this setup? Do others agree something looks
> buggy?
>
> Thanks,
> -Thor
>
> On Dec 28, 2010, at 4:33 PM, Thor Carpenter wrote:
>
> Third try for good luck. Maybe this time I'll get it all in one email. :-)
>
> At time 15:55 I performed a write with CL = ONE, a subsequent read with CL =
> ONE and received correct results. Similar results occur with any read CL
> other than LOCAL_QUORUM.
>
> At time 15:58 I performed a write with CL = ONE, a subsequent read with CL =
> LOCAL_QUORUM and received a TimedOutException.
>
> System.log and cassandra.log are from server dlc05-sea to corresponds with
> the hector error message "Closing client
> CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs
> that indicates the problem but maybe others with more discerning eyes can.
>
> Please let me know if I can provide any additional information.
>
> Thanks,
> -Thor
>
> <logs.zip><ATT00001..txt>
>
>
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
[SOLVED] Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
So, I found a solution, but I think there is a (trivial) bug.
The broken keyspace:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
The fixed keyspace:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1}];
I'm not sure if the issue is that there are no nodes in DC2 or that I'm saying DC2 shouldn't get any replicas, or a combination of the two. If I get time I'll try to poke around the code base.
-Thor
On Dec 29, 2010, at 12:38 PM, Thor Carpenter wrote:
In an attempt to replicate this error on a simpler system I have come across a different error but seemingly related problem. Now I am getting the following exception in system.log.
DEBUG [pool-1-thread-3] 2010-12-29 12:10:38,897 CassandraServer.java (line 362) insert
ERROR [pool-1-thread-3] 2010-12-29 12:10:38,906 Cassandra.java (line 2960) Internal error processing insert
java.lang.AssertionError
at org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:392)
at org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:417)
at org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:95)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1411)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1394)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:109)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:442)
at org.apache.cassandra.thrift.CassandraServer.insert(CassandraServer.java:379)
at org.apache.cassandra.thrift.Cassandra$Processor$insert.process(Cassandra.java:2952)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Looking at TokenMetadata.java:392, the assert that is triggered is:
assert ring.size() > 0;
Which led me to run nodetool ring which outputs a seemingly correct config:
-sh-3.2$ nodetool -h localhost -p 8080 ring
Address Status State Load Owns Token
10.5.64.26 Up Normal 47.16 KB 100.00% 1
To reproduce:
* I have upgraded to 0.7RC3 so install the latest riptano 0.7rc3 rpm on a single node in "DC1".
* In cassandra.yaml set initial_token = 1
* cassandra-topology.properties:
10.5.64.26=DC1:R1
default=DC2:R1
* Schema loaded via cassandra-cli:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
use KeyspaceDC1;
create column family TestCF with
column_type = 'Standard' and
comparator = 'BytesType' and
keys_cached = 200000 and
rows_cached = 2000 and
gc_grace = 0 and
read_repair_chance = 0.0;
* In cassandra-cli execute the following:
[default@unknown] use KeyspaceDC1;
Authenticated to keyspace: KeyspaceDC1
[default@KeyspaceDC1] set TestCF['some key']['some col'] = 'some value';
Internal error processing insert
* Check system.log where you should find the above assertion error.
Can anybody find a flaw in this setup? Do others agree something looks buggy?
Thanks,
-Thor
On Dec 28, 2010, at 4:33 PM, Thor Carpenter wrote:
Third try for good luck. Maybe this time I'll get it all in one email. :-)
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Please let me know if I can provide any additional information.
Thanks,
-Thor
<logs.zip><ATT00001..txt>
[SOLVED] Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
So, I found a solution, but I think there is a (trivial) bug.
The broken keyspace:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
The fixed keyspace:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1}];
I'm not sure if the issue is that there are no nodes in DC2 or that I'm saying DC2 shouldn't get any replicas, or a combination of the two. If I get time I'll try to poke around the code base.
-Thor
On Dec 29, 2010, at 12:38 PM, Thor Carpenter wrote:
In an attempt to replicate this error on a simpler system I have come across a different error but seemingly related problem. Now I am getting the following exception in system.log.
DEBUG [pool-1-thread-3] 2010-12-29 12:10:38,897 CassandraServer.java (line 362) insert
ERROR [pool-1-thread-3] 2010-12-29 12:10:38,906 Cassandra.java (line 2960) Internal error processing insert
java.lang.AssertionError
at org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:392)
at org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:417)
at org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:95)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1411)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1394)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:109)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:442)
at org.apache.cassandra.thrift.CassandraServer.insert(CassandraServer.java:379)
at org.apache.cassandra.thrift.Cassandra$Processor$insert.process(Cassandra.java:2952)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Looking at TokenMetadata.java:392, the assert that is triggered is:
assert ring.size() > 0;
Which led me to run nodetool ring which outputs a seemingly correct config:
-sh-3.2$ nodetool -h localhost -p 8080 ring
Address Status State Load Owns Token
10.5.64.26 Up Normal 47.16 KB 100.00% 1
To reproduce:
* I have upgraded to 0.7RC3 so install the latest riptano 0.7rc3 rpm on a single node in "DC1".
* In cassandra.yaml set initial_token = 1
* cassandra-topology.properties:
10.5.64.26=DC1:R1
default=DC2:R1
* Schema loaded via cassandra-cli:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
use KeyspaceDC1;
create column family TestCF with
column_type = 'Standard' and
comparator = 'BytesType' and
keys_cached = 200000 and
rows_cached = 2000 and
gc_grace = 0 and
read_repair_chance = 0.0;
* In cassandra-cli execute the following:
[default@unknown] use KeyspaceDC1;
Authenticated to keyspace: KeyspaceDC1
[default@KeyspaceDC1] set TestCF['some key']['some col'] = 'some value';
Internal error processing insert
* Check system.log where you should find the above assertion error.
Can anybody find a flaw in this setup? Do others agree something looks buggy?
Thanks,
-Thor
On Dec 28, 2010, at 4:33 PM, Thor Carpenter wrote:
Third try for good luck. Maybe this time I'll get it all in one email. :-)
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Please let me know if I can provide any additional information.
Thanks,
-Thor
<logs.zip><ATT00001..txt>
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
In an attempt to replicate this error on a simpler system I have come across a different error but seemingly related problem. Now I am getting the following exception in system.log.
DEBUG [pool-1-thread-3] 2010-12-29 12:10:38,897 CassandraServer.java (line 362) insert
ERROR [pool-1-thread-3] 2010-12-29 12:10:38,906 Cassandra.java (line 2960) Internal error processing insert
java.lang.AssertionError
at org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:392)
at org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:417)
at org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:95)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1411)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1394)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:109)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:442)
at org.apache.cassandra.thrift.CassandraServer.insert(CassandraServer.java:379)
at org.apache.cassandra.thrift.Cassandra$Processor$insert.process(Cassandra.java:2952)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Looking at TokenMetadata.java:392, the assert that is triggered is:
assert ring.size() > 0;
Which led me to run nodetool ring which outputs a seemingly correct config:
-sh-3.2$ nodetool -h localhost -p 8080 ring
Address Status State Load Owns Token
10.5.64.26 Up Normal 47.16 KB 100.00% 1
To reproduce:
* I have upgraded to 0.7RC3 so install the latest riptano 0.7rc3 rpm on a single node in "DC1".
* In cassandra.yaml set initial_token = 1
* cassandra-topology.properties:
10.5.64.26=DC1:R1
default=DC2:R1
* Schema loaded via cassandra-cli:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
use KeyspaceDC1;
create column family TestCF with
column_type = 'Standard' and
comparator = 'BytesType' and
keys_cached = 200000 and
rows_cached = 2000 and
gc_grace = 0 and
read_repair_chance = 0.0;
* In cassandra-cli execute the following:
[default@unknown] use KeyspaceDC1;
Authenticated to keyspace: KeyspaceDC1
[default@KeyspaceDC1] set TestCF['some key']['some col'] = 'some value';
Internal error processing insert
* Check system.log where you should find the above assertion error.
Can anybody find a flaw in this setup? Do others agree something looks buggy?
Thanks,
-Thor
On Dec 28, 2010, at 4:33 PM, Thor Carpenter wrote:
Third try for good luck. Maybe this time I'll get it all in one email. :-)
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Please let me know if I can provide any additional information.
Thanks,
-Thor
<logs.zip><ATT00001..txt>
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
In an attempt to replicate this error on a simpler system I have come across a different error but seemingly related problem. Now I am getting the following exception in system.log.
DEBUG [pool-1-thread-3] 2010-12-29 12:10:38,897 CassandraServer.java (line 362) insert
ERROR [pool-1-thread-3] 2010-12-29 12:10:38,906 Cassandra.java (line 2960) Internal error processing insert
java.lang.AssertionError
at org.apache.cassandra.locator.TokenMetadata.firstTokenIndex(TokenMetadata.java:392)
at org.apache.cassandra.locator.TokenMetadata.ringIterator(TokenMetadata.java:417)
at org.apache.cassandra.locator.NetworkTopologyStrategy.calculateNaturalEndpoints(NetworkTopologyStrategy.java:95)
at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:99)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1411)
at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:1394)
at org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:109)
at org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:442)
at org.apache.cassandra.thrift.CassandraServer.insert(CassandraServer.java:379)
at org.apache.cassandra.thrift.Cassandra$Processor$insert.process(Cassandra.java:2952)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Looking at TokenMetadata.java:392, the assert that is triggered is:
assert ring.size() > 0;
Which led me to run nodetool ring which outputs a seemingly correct config:
-sh-3.2$ nodetool -h localhost -p 8080 ring
Address Status State Load Owns Token
10.5.64.26 Up Normal 47.16 KB 100.00% 1
To reproduce:
* I have upgraded to 0.7RC3 so install the latest riptano 0.7rc3 rpm on a single node in "DC1".
* In cassandra.yaml set initial_token = 1
* cassandra-topology.properties:
10.5.64.26=DC1:R1
default=DC2:R1
* Schema loaded via cassandra-cli:
create keyspace KeyspaceDC1 with
replication_factor = 1 and
placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
strategy_options = [{DC1:1, DC2:0}];
use KeyspaceDC1;
create column family TestCF with
column_type = 'Standard' and
comparator = 'BytesType' and
keys_cached = 200000 and
rows_cached = 2000 and
gc_grace = 0 and
read_repair_chance = 0.0;
* In cassandra-cli execute the following:
[default@unknown] use KeyspaceDC1;
Authenticated to keyspace: KeyspaceDC1
[default@KeyspaceDC1] set TestCF['some key']['some col'] = 'some value';
Internal error processing insert
* Check system.log where you should find the above assertion error.
Can anybody find a flaw in this setup? Do others agree something looks buggy?
Thanks,
-Thor
On Dec 28, 2010, at 4:33 PM, Thor Carpenter wrote:
Third try for good luck. Maybe this time I'll get it all in one email. :-)
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Please let me know if I can provide any additional information.
Thanks,
-Thor
<logs.zip><ATT00001..txt>
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
Third try for good luck. Maybe this time I'll get it all in one email. :-)
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Please let me know if I can provide any additional information.
Thanks,
-Thor
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
OK, attached are the logs.
At time 15:55 I performed a write with CL = ONE, a subsequent read with CL = ONE and received correct results. Similar results occur with any read CL other than LOCAL_QUORUM.
At time 15:58 I performed a write with CL = ONE, a subsequent read with CL = LOCAL_QUORUM and received a TimedOutException.
System.log and cassandra.log are from server dlc05-sea to corresponds with the hector error message "Closing client CassandraClient<dlc05-sea:9160-50>". I haven't found anything in the logs that indicates the problem but maybe others with more discerning eyes can.
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Thor Carpenter <tc...@mediosystems.com>.
Hmm, sorry about that. It seems my attachments were dropped the first time. Trying again...
Please let me know if I can provide any additional information.
Thanks,
-Thor
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Jonathan Ellis <jb...@gmail.com>.
Turn on debug logging to see what nodes the request gets sent to, and
which ones replied.
[dev to BCC]
On Tue, Dec 28, 2010 at 1:59 PM, Thor Carpenter
<tc...@mediosystems.com> wrote:
> Hi All,
>
> I am a bit stumped. When reading at a CL of local_quorum I get TimedOutExceptions however when reading at any other CL, including ALL, everything works as expected. Any write CL, including local_quorum, also works as expected.
>
> I am running a 4 node 0.7 RC1 cluster in datacenter "DC1". "DC2" is planned for the future but currently has no nodes. I am using Hector 0.7.0-20.
>
> Here is my testing schema:
>
> create keyspace GlobalCacheKeyspaceDC1RF4 with
> replication_factor = 4 and
> placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options = [{DC1:4, DC2:0}];
>
> use GlobalCacheKeyspaceDC1RF4;
>
> create column family GlobalCacheCF with
> comment = 'General purpose cache' and
> column_type = 'Standard' and
> comparator = 'BytesType' and
> keys_cached = 200000 and
> rows_cached = 2000 and
> gc_grace = 0 and
> read_repair_chance = 0.0;
>
>
> Here is my cassandra-topology.properties:
>
> 10.5.64.15=DC1:R1
> 10.5.64.28=DC1:R1
> 10.5.64.56=DC1:R1
> 10.5.64.39=DC1:R1
>
> # default for unknown nodes
> default=DC2:R1
>
>
> Here is the exception I get when reading at a CL of local_quorum:
>
> 1:44:14.176 [main] ERROR m.p.c.connection.HConnectionManager - Could not fullfill request on this host CassandraClient<dlc05-sea:9160-50>
> 11:44:14.182 [main] ERROR m.p.c.connection.HConnectionManager - Exception:
> me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
> at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:598) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:587) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:101) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getColumn(KeyspaceServiceImpl.java:604) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:57) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:52) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery.execute(ThriftColumnQuery.java:51) [hector-0.7.0-20.jar:na]
>
>
> There are no warnings or errors in cassandra.log.
>
> Any guidance would be appreciated.
>
> Thanks,
> -Thor
>
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
Re: 0.7RC1 local_quorum -> TimedOutException
Posted by Jonathan Ellis <jb...@gmail.com>.
Turn on debug logging to see what nodes the request gets sent to, and
which ones replied.
[dev to BCC]
On Tue, Dec 28, 2010 at 1:59 PM, Thor Carpenter
<tc...@mediosystems.com> wrote:
> Hi All,
>
> I am a bit stumped. When reading at a CL of local_quorum I get TimedOutExceptions however when reading at any other CL, including ALL, everything works as expected. Any write CL, including local_quorum, also works as expected.
>
> I am running a 4 node 0.7 RC1 cluster in datacenter "DC1". "DC2" is planned for the future but currently has no nodes. I am using Hector 0.7.0-20.
>
> Here is my testing schema:
>
> create keyspace GlobalCacheKeyspaceDC1RF4 with
> replication_factor = 4 and
> placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy' and
> strategy_options = [{DC1:4, DC2:0}];
>
> use GlobalCacheKeyspaceDC1RF4;
>
> create column family GlobalCacheCF with
> comment = 'General purpose cache' and
> column_type = 'Standard' and
> comparator = 'BytesType' and
> keys_cached = 200000 and
> rows_cached = 2000 and
> gc_grace = 0 and
> read_repair_chance = 0.0;
>
>
> Here is my cassandra-topology.properties:
>
> 10.5.64.15=DC1:R1
> 10.5.64.28=DC1:R1
> 10.5.64.56=DC1:R1
> 10.5.64.39=DC1:R1
>
> # default for unknown nodes
> default=DC2:R1
>
>
> Here is the exception I get when reading at a CL of local_quorum:
>
> 1:44:14.176 [main] ERROR m.p.c.connection.HConnectionManager - Could not fullfill request on this host CassandraClient<dlc05-sea:9160-50>
> 11:44:14.182 [main] ERROR m.p.c.connection.HConnectionManager - Exception:
> me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
> at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:27) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:598) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl$15.execute(KeyspaceServiceImpl.java:587) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:89) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:101) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:129) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getColumn(KeyspaceServiceImpl.java:604) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:57) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery$1.doInKeyspace(ThriftColumnQuery.java:52) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:65) [hector-0.7.0-20.jar:na]
> at me.prettyprint.cassandra.model.thrift.ThriftColumnQuery.execute(ThriftColumnQuery.java:51) [hector-0.7.0-20.jar:na]
>
>
> There are no warnings or errors in cassandra.log.
>
> Any guidance would be appreciated.
>
> Thanks,
> -Thor
>
>
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com