You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Santhosh Kumar Ramalingam (Jira)" <ji...@apache.org> on 2019/11/01 23:22:00 UTC

[jira] [Commented] (CASSANDRA-15358) Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue

    [ https://issues.apache.org/jira/browse/CASSANDRA-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965153#comment-16965153 ] 

Santhosh Kumar Ramalingam commented on CASSANDRA-15358:
-------------------------------------------------------

[~benedict], below is the setup

Hardware - Azure vms L series (nvme's)

OS - CentOS 7.6

JDK - Java8,11,12,13

GC collector - G!GC,ZGC

Test conditions and results:

TLP-Stress : Keyvalue 

/usr/local/bin/tlp-stress run KeyValue -n 10b -t 400 --cl LOCAL_ONE --compaction lcs --ho 1.3.0.6 --port 9042 -r 0.0 --field.keyvalue.value='book(3000,5000)'

Result : Ran absolutely fantastic no issues.

 

App test: Spark job (real production application)

no of columns : 100

batch size : 100

local_quorum

RF 3

 

Result: we run into the above exception. The nodes are not failing anymore but then the latency and throughput is impacted. we are only able to do 30k write ops with latency up to 80ms.

> Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15358
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/benchmark
>            Reporter: Santhosh Kumar Ramalingam
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>              Labels: 4.0, alpha
>
> Hitting a bug with cassandra 4 alpha version. The same bug is repeated with difefrent version of Java(8,11 &12) [~benedict]
>  
> Stack trace:
> {code:java}
> INFO [main] 2019-10-11 16:07:12,024 Server.java:164 - Starting listening for CQL clients on /1.3.0.6:9042 (unencrypted)...
> WARN [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:343 - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling
> WARN [Messaging-EventLoop-3-2] 2019-10-11 16:07:22,038 NoSpamLogger.java:94 - 10.3x.4x.5x:7000->1.3.0.5:7000-LARGE_MESSAGES-[no-channel] dropping message of type PING_REQ whose timeout expired before reaching the network
> WARN [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:343 - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling
> INFO [Messaging-EventLoop-3-6] 2019-10-11 16:07:32,759 NoSpamLogger.java:91 - 10.3x.4x.5x:7000->1.3.0.2:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /1.3.0.2:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:667)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:644)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:524)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:414)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> WARN [Messaging-EventLoop-3-3] 2019-10-11 16:11:32,639 NoSpamLogger.java:94 - 1.3.4.6:7000->1.3.4.5:7000-URGENT_MESSAGES-[no-channel] dropping message of type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO [Messaging-EventLoop-3-18] 2019-10-11 16:11:33,077 NoSpamLogger.java:91 - 1.3.4.5:7000->1.3.4.4:7000-URGENT_MESSAGES-[no-channel] failed to connect
>  
> ERROR [Messaging-EventLoop-3-11] 2019-10-10 01:34:34,407 InboundMessageHandler.java:657 - 1.3.4.5:7000->1.3.4.8:7000-LARGE_MESSAGES-0b7d09cd unexpected exception caught while processing inbound messages; terminating connection
> java.lang.IllegalArgumentException: initialBuffer is not a direct buffer.
> at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:87)
> at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:59)
> at org.apache.cassandra.net.BufferPoolAllocator$Wrapped.<init>(BufferPoolAllocator.java:95)
> at org.apache.cassandra.net.BufferPoolAllocator.newDirectBuffer(BufferPoolAllocator.java:56)
> at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
> at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
> at io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53)
> at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
> at io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75)
> at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:777)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:424)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:835)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org