Posted to commits@cassandra.apache.org by "Joshua McKenzie (JIRA)" <ji...@apache.org> on 2016/05/03 18:55:12 UTC
[jira] [Updated] (CASSANDRA-10687) When adding new node to cluster getting Cassandra timeout during write query
[ https://issues.apache.org/jira/browse/CASSANDRA-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joshua McKenzie updated CASSANDRA-10687:
----------------------------------------
Resolution: Cannot Reproduce
Status: Resolved (was: Awaiting Feedback)
[~eyalso]: Closing this as cannot reproduce. Please feel free to re-open if you're still seeing it on a 2.1/2.2 release of C*.
> When adding new node to cluster getting Cassandra timeout during write query
> ----------------------------------------------------------------------------
>
> Key: CASSANDRA-10687
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10687
> Project: Cassandra
> Issue Type: Bug
> Components: Configuration, Coordination, Streaming and Messaging
> Environment: Cassandra 2.0.9 using vnodes, on Debian 7.9, on two data centers (AUS & TAM)
> Reporter: Eyal Sorek
>
> When adding one new node to an 8-node cluster (and again after the 9th node finished joining in the AUS data center, and again when adding the 10th node in the TAM data center, with the same behaviour), we get many of the errors below.
> First, why do we see this while the node is joining:
> LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Since when does LOCAL_ONE require 2 replicas?
> Second, why is there so much overhead across the whole cluster when a node is joining?
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Sample stack trace
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:73)
> …m.datastax.driver.core.DriverThrowables.propagateCause (DriverThrowables.java:37)
> ….driver.core.DefaultResultSetFuture.getUninterruptibly (DefaultResultSetFuture.java:214)
> com.datastax.driver.core.AbstractSession.execute (AbstractSession.java:52)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:29)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadOnlyDao.tracking(CassandraPagesReadOnlyDao.scala:19)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao.insertCompressed(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.com$wixpress$html$data$distributor$core$DaoPageDistributor$$distributePage(DaoPageDistributor.scala:36)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply$mcV$sp(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.tracking(DaoPageDistributor.scala:17)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.process(DaoPageDistributor.scala:25)
> com.wixpress.html.data.distributor.core.greyhound.DistributionRequestHandler.handleMessage(DistributionRequestHandler.scala:19)
> com.wixpress.greyhound.KafkaUserHandlers.handleMessage(UserHandlers.scala:11)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$handleMessage(EventsConsumer.scala:51)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply$mcV$sp(EventsConsumer.scala:43)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> scala.util.Try$.apply(Try.scala:192)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$dispatch(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:26)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:25)
> scala.collection.Iterator$class.foreach(Iterator.scala:742)
> scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> com.wixpress.greyhound.EventsConsumer.consumeEvents(EventsConsumer.scala:25)
> com.wixpress.greyhound.EventsConsumer.run(EventsConsumer.scala:20)
> java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
> java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:100)
> com.datastax.driver.core.Responses$Error.asException (Responses.java:98)
> com.datastax.driver.core.DefaultResultSetFuture.onSet (DefaultResultSetFuture.java:149)
> com.datastax.driver.core.RequestHandler.setFinalResult (RequestHandler.java:183)
> com.datastax.driver.core.RequestHandler.access$2300 (RequestHandler.java:44)
> …ore.RequestHandler$SpeculativeExecution.setFinalResult (RequestHandler.java:748)
> ….driver.core.RequestHandler$SpeculativeExecution.onSet (RequestHandler.java:587)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:1013)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:936)
> ….netty.channel.SimpleChannelInboundHandler.channelRead (SimpleChannelInboundHandler.java:105)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.timeout.IdleStateHandler.channelRead (IdleStateHandler.java:254)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
> io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
> io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
> io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
> java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> com.datastax.driver.core.Responses$Error$1.decode (Responses.java:57)
> com.datastax.driver.core.Responses$Error$1.decode (Responses.java:37)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:213)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:204)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:89)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
> io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
> io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
> io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
> java.lang.Thread.run (Thread.java:745)
> # nodetool status
> xss = -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+CMSClassUnloadingEnabled -Xms8192M -Xmx8192M -Xmn2048M -Xss256k
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: AUS
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns Host ID Rack
> UN 172.16.213.62 85.52 GB 256 11.7% 27f2fd1d-5f3c-4691-a1f6-e28c1343e212 R1
> UN 172.16.213.63 83.11 GB 256 12.2% 4869f14b-e858-46c7-967c-60bd8260a149 R1
> UN 172.16.213.64 80.91 GB 256 11.7% d4ad2495-cb24-4964-94d2-9e3f557054a4 R1
> UN 172.16.213.66 84.11 GB 256 10.3% 2a16c0dc-c36a-4196-89df-2de4f6b6cae5 R1
> UN 172.16.144.75 95.2 GB 256 11.4% f87d6518-6c8e-49d9-a013-018bbedb8414 R1
> Datacenter: TAM
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns Host ID Rack
> UJ 10.14.0.155 4.38 GB 256 ? c88bebae-737b-4ade-8f79-64f655036eee R1
> UN 10.14.0.106 81.57 GB 256 10.0% 3b539927-b53a-4f50-9acd-d92fefbd84b9 R1
> UN 10.14.0.107 80.23 GB 256 10.4% b70f674d-892f-42ff-a261-5356bee79e99 R1
> UN 10.14.0.108 83.64 GB 256 11.2% 6e24b17a-0b48-46b4-8edb-b0a9206314a3 R1
> UN 10.14.0.109 91.02 GB 256 11.2% 11f02dbd-257f-4623-81f4-b94db7365775 R1
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)