Posted to commits@cassandra.apache.org by "Joshua McKenzie (JIRA)" <ji...@apache.org> on 2016/05/03 18:55:12 UTC

[jira] [Updated] (CASSANDRA-10687) When adding new node to cluster getting Cassandra timeout during write query

     [ https://issues.apache.org/jira/browse/CASSANDRA-10687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joshua McKenzie updated CASSANDRA-10687:
----------------------------------------
    Resolution: Cannot Reproduce
        Status: Resolved  (was: Awaiting Feedback)

[~eyalso]: Closing this as cannot reproduce. Please feel free to re-open if you're still seeing it on a 2.1/2.2 release of C*.

> When adding new node to cluster getting Cassandra timeout during write query
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10687
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10687
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Configuration, Coordination, Streaming and Messaging
>         Environment: Cassandra 2.0.9 using vnodes, on Debian 7.9,  on two data centers (AUS & TAM)
>            Reporter: Eyal Sorek
>
> When adding one new node to an 8-node cluster, we get many of the errors shown below. The same behaviour occurred again when adding the 9th node in the AUS data center and when adding the 10th node in the TAM data center.
> First, why do we see this while the node is joining:
> LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Since when does LOCAL_ONE require 2 replicas?
> Second, why do we feel so much overhead across the whole cluster while a node is joining?
> com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> Sample stack trace
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:73)
> …m.datastax.driver.core.DriverThrowables.propagateCause (DriverThrowables.java:37)
> ….driver.core.DefaultResultSetFuture.getUninterruptibly (DefaultResultSetFuture.java:214)
>        com.datastax.driver.core.AbstractSession.execute (AbstractSession.java:52)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:29)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao$$anonfun$insertCompressed$1.apply(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadOnlyDao.tracking(CassandraPagesReadOnlyDao.scala:19)
> com.wixpress.publichtml.renderer.data.access.dao.page.CassandraPagesReadWriteDao.insertCompressed(CassandraPagesReadWriteDao.scala:25)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.com$wixpress$html$data$distributor$core$DaoPageDistributor$$distributePage(DaoPageDistributor.scala:36)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply$mcV$sp(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.html.data.distributor.core.DaoPageDistributor$$anonfun$process$1.apply(DaoPageDistributor.scala:26)
> com.wixpress.framework.monitoring.metering.SyncMetering$class.tracking(Metering.scala:58)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.tracking(DaoPageDistributor.scala:17)
> com.wixpress.html.data.distributor.core.DaoPageDistributor.process(DaoPageDistributor.scala:25)
> com.wixpress.html.data.distributor.core.greyhound.DistributionRequestHandler.handleMessage(DistributionRequestHandler.scala:19)
> com.wixpress.greyhound.KafkaUserHandlers.handleMessage(UserHandlers.scala:11)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$handleMessage(EventsConsumer.scala:51)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply$mcV$sp(EventsConsumer.scala:43)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$com$wixpress$greyhound$EventsConsumer$$dispatch$1.apply(EventsConsumer.scala:40)
> scala.util.Try$.apply(Try.scala:192)
> com.wixpress.greyhound.EventsConsumer.com$wixpress$greyhound$EventsConsumer$$dispatch(EventsConsumer.scala:40)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:26)
> com.wixpress.greyhound.EventsConsumer$$anonfun$consumeEvents$1.apply(EventsConsumer.scala:25)
> scala.collection.Iterator$class.foreach(Iterator.scala:742)
> scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
> com.wixpress.greyhound.EventsConsumer.consumeEvents(EventsConsumer.scala:25)
> com.wixpress.greyhound.EventsConsumer.run(EventsConsumer.scala:20)
>       java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
>      java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
>                                    java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
> …stax.driver.core.exceptions.WriteTimeoutException.copy (WriteTimeoutException.java:100)
>    com.datastax.driver.core.Responses$Error.asException (Responses.java:98)
>   com.datastax.driver.core.DefaultResultSetFuture.onSet (DefaultResultSetFuture.java:149)
>  com.datastax.driver.core.RequestHandler.setFinalResult (RequestHandler.java:183)
>     com.datastax.driver.core.RequestHandler.access$2300 (RequestHandler.java:44)
> …ore.RequestHandler$SpeculativeExecution.setFinalResult (RequestHandler.java:748)
> ….driver.core.RequestHandler$SpeculativeExecution.onSet (RequestHandler.java:587)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:1013)
> …atastax.driver.core.Connection$Dispatcher.channelRead0 (Connection.java:936)
> ….netty.channel.SimpleChannelInboundHandler.channelRead (SimpleChannelInboundHandler.java:105)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
>   io.netty.handler.timeout.IdleStateHandler.channelRead (IdleStateHandler.java:254)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
>    io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
>   io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
>                   io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
>                                    java.lang.Thread.run (Thread.java:745)
> caused by com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency LOCAL_ONE (2 replica were required but only 1 acknowledged the write)
>       com.datastax.driver.core.Responses$Error$1.decode (Responses.java:57)
>       com.datastax.driver.core.Responses$Error$1.decode (Responses.java:37)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:213)
> com.datastax.driver.core.Message$ProtocolDecoder.decode (Message.java:204)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:89)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> …etty.handler.codec.MessageToMessageDecoder.channelRead (MessageToMessageDecoder.java:103)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.handler.codec.ByteToMessageDecoder.channelRead (ByteToMessageDecoder.java:242)
> …hannel.AbstractChannelHandlerContext.invokeChannelRead (AbstractChannelHandlerContext.java:339)
> ….channel.AbstractChannelHandlerContext.fireChannelRead (AbstractChannelHandlerContext.java:324)
> io.netty.channel.DefaultChannelPipeline.fireChannelRead (DefaultChannelPipeline.java:847)
> ….channel.nio.AbstractNioByteChannel$NioByteUnsafe.read (AbstractNioByteChannel.java:131)
>    io.netty.channel.nio.NioEventLoop.processSelectedKey (NioEventLoop.java:511)
> ….channel.nio.NioEventLoop.processSelectedKeysOptimized (NioEventLoop.java:468)
>   io.netty.channel.nio.NioEventLoop.processSelectedKeys (NioEventLoop.java:382)
>                   io.netty.channel.nio.NioEventLoop.run (NioEventLoop.java:354)
> ….netty.util.concurrent.SingleThreadEventExecutor$2.run (SingleThreadEventExecutor.java:111)
>                                    java.lang.Thread.run (Thread.java:745)
> # nodetool status
> xss =  -ea -javaagent:/usr/share/cassandra/lib/jamm-0.2.5.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -XX:+CMSClassUnloadingEnabled -Xms8192M -Xmx8192M -Xmn2048M -Xss256k
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: AUS
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns   Host ID                               Rack
> UN  172.16.213.62  85.52 GB   256     11.7%  27f2fd1d-5f3c-4691-a1f6-e28c1343e212  R1
> UN  172.16.213.63  83.11 GB   256     12.2%  4869f14b-e858-46c7-967c-60bd8260a149  R1
> UN  172.16.213.64  80.91 GB   256     11.7%  d4ad2495-cb24-4964-94d2-9e3f557054a4  R1
> UN  172.16.213.66  84.11 GB   256     10.3%  2a16c0dc-c36a-4196-89df-2de4f6b6cae5  R1
> UN  172.16.144.75  95.2 GB    256     11.4%  f87d6518-6c8e-49d9-a013-018bbedb8414  R1
> Datacenter: TAM
> ===============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns   Host ID                               Rack
> UJ  10.14.0.155    4.38 GB    256     ?      c88bebae-737b-4ade-8f79-64f655036eee  R1
> UN  10.14.0.106    81.57 GB   256     10.0%  3b539927-b53a-4f50-9acd-d92fefbd84b9  R1
> UN  10.14.0.107    80.23 GB   256     10.4%  b70f674d-892f-42ff-a261-5356bee79e99  R1
> UN  10.14.0.108    83.64 GB   256     11.2%  6e24b17a-0b48-46b4-8edb-b0a9206314a3  R1
> UN  10.14.0.109    91.02 GB   256     11.2%  11f02dbd-257f-4623-81f4-b94db7365775  R1
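
One possible reading of the "2 replica were required" figure in the error above, offered as an assumption rather than a diagnosis of this ticket: while a node is joining, the coordinator may count the pending (joining) replica on top of what the consistency level itself asks for. The sketch below only illustrates that arithmetic. The function names and the model are hypothetical and are not taken from Cassandra's or the driver's code.

```python
def required_acks(block_for: int, pending_replicas: int) -> int:
    """Acknowledgements the coordinator waits for, under the assumed model:
    the consistency level's own requirement plus any pending replicas."""
    return block_for + pending_replicas


def timeout_message(consistency: str, required: int, received: int) -> str:
    """Format the message the way it appears in the report above."""
    return (
        f"Cassandra timeout during write query at consistency {consistency} "
        f"({required} replica were required but only {received} acknowledged the write)"
    )


# Assumption: LOCAL_ONE alone needs 1 replica, and one joining node
# holds a pending range for the written key.
required = required_acks(block_for=1, pending_replicas=1)
print(timeout_message("LOCAL_ONE", required, received=1))
```

Under this model the requirement would drop back to 1 once the joining node (state UJ in the nodetool output) finishes bootstrapping and no pending replicas remain.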



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)