Posted to commits@cassandra.apache.org by "Valentin Lorentz (Jira)" <ji...@apache.org> on 2019/10/31 14:05:00 UTC
[jira] [Comment Edited] (CASSANDRA-15358) Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue

    [ https://issues.apache.org/jira/browse/CASSANDRA-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964055#comment-16964055 ] 

Valentin Lorentz edited comment on CASSANDRA-15358 at 10/31/19 2:04 PM:
------------------------------------------------------------------------

Hello,

 

I had a similar issue earlier today. It was triggered by a client running code similar to this example: [https://docs.datastax.com/en/developer/python-driver/3.20/query_paging/#handling-paged-results-with-callbacks] with these changes:
 * typo fix ({{handle_err}} -> {{handle_error}})
 * {{"SELECT * FROM users"}} replaced with {{SimpleStatement("SELECT * FROM revision", fetch_size=100)}}
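
For readers who do not want to follow the link, the client pattern can be sketched roughly as below. This is a minimal sketch and not my exact client code: it assumes the Python driver's {{ResponseFuture}} interface ({{add_callbacks}}, {{has_more_pages}}, {{start_fetching_next_page}}), and the names {{process_row}} and {{scan_revisions}} are hypothetical, introduced only for illustration.

```python
from threading import Event


class PagedResultHandler:
    """Consume a paged query result page by page through driver callbacks."""

    def __init__(self, future, process_row):
        self.error = None
        self.finished_event = Event()
        self.future = future
        self.process_row = process_row  # hypothetical per-row callback
        # Register the callbacks last, since they may fire immediately.
        self.future.add_callbacks(callback=self.handle_page,
                                  errback=self.handle_error)

    def handle_page(self, rows):
        for row in rows:
            self.process_row(row)
        if self.future.has_more_pages:
            # Ask for the next page; handle_page is called again when it arrives.
            self.future.start_fetching_next_page()
        else:
            self.finished_event.set()

    def handle_error(self, exc):
        self.error = exc
        self.finished_event.set()


def scan_revisions(session, process_row):
    """Hypothetical helper: read the whole table, 100 rows per page."""
    # Imported lazily so the handler above can be exercised without the driver.
    from cassandra.query import SimpleStatement

    statement = SimpleStatement("SELECT * FROM revision", fetch_size=100)
    handler = PagedResultHandler(session.execute_async(statement), process_row)
    handler.finished_event.wait()
    if handler.error:
        raise handler.error
```

With {{fetch_size=100}} and ~1 billion keys, this requests on the order of ten million pages one after another, each fetched only after the previous one has been processed.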

 

The table I was querying has ~1 billion keys and is defined as follows:
{code:sql}
CREATE TYPE IF NOT EXISTS person (
    fullname    blob,
    name        blob,
    email       blob
);

CREATE TYPE IF NOT EXISTS microtimestamp (
    seconds             bigint,
    microseconds        int
);

CREATE TYPE IF NOT EXISTS microtimestamp_with_timezone (
    timestamp           frozen<microtimestamp>,
    offset              smallint,
    negative_utc        boolean
);

CREATE TABLE IF NOT EXISTS revision (
    id                              blob PRIMARY KEY,
    date                            microtimestamp_with_timezone,
    committer_date                  microtimestamp_with_timezone,
    type                            ascii,
    directory                       blob,
    message                         blob,
    author                          person,
    committer                       person,
    parents                         frozen<list<blob>>,
    synthetic                       boolean,
    metadata                        text
);
{code}
The cluster was initially created with Cassandra 3.11.4 and was migrated to 4.0-alpha1 a few weeks ago. It has four nodes and no replication.

cassandra.yaml is the one shipped with 4.0-alpha1, except for some path and IP address changes and the following settings:
{code:java}
prepared_statements_cache_size_mb: 10
key_cache_size_in_mb: 10
concurrent_reads: 32
concurrent_writes: 64
concurrent_counter_writes: 32
file_cache_size_in_mb: 512
disk_access_mode: mmap
trickle_fsync: true
enable_user_defined_functions: true{code}
After a restart, I could no longer reproduce the issue, so I cannot tell whether it was caused by my configuration.


> Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15358
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15358
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/benchmark
>            Reporter: Santhosh Kumar Ramalingam
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>              Labels: 4.0, alpha
>
> Hitting a bug with the Cassandra 4 alpha version. The same bug occurs with different versions of Java (8, 11 and 12). [~benedict]
>  
> Stack trace:
> {code:java}
> INFO [main] 2019-10-11 16:07:12,024 Server.java:164 - Starting listening for CQL clients on /1.3.0.6:9042 (unencrypted)...
> WARN [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:343 - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling
> WARN [Messaging-EventLoop-3-2] 2019-10-11 16:07:22,038 NoSpamLogger.java:94 - 10.3x.4x.5x:7000->1.3.0.5:7000-LARGE_MESSAGES-[no-channel] dropping message of type PING_REQ whose timeout expired before reaching the network
> WARN [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:343 - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:369 - Setup task failed with error, rescheduling
> INFO [Messaging-EventLoop-3-6] 2019-10-11 16:07:32,759 NoSpamLogger.java:91 - 10.3x.4x.5x:7000->1.3.0.2:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: /1.3.0.2:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:667)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:644)
> at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:524)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:414)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> WARN [Messaging-EventLoop-3-3] 2019-10-11 16:11:32,639 NoSpamLogger.java:94 - 1.3.4.6:7000->1.3.4.5:7000-URGENT_MESSAGES-[no-channel] dropping message of type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO [Messaging-EventLoop-3-18] 2019-10-11 16:11:33,077 NoSpamLogger.java:91 - 1.3.4.5:7000->1.3.4.4:7000-URGENT_MESSAGES-[no-channel] failed to connect
>  
> ERROR [Messaging-EventLoop-3-11] 2019-10-10 01:34:34,407 InboundMessageHandler.java:657 - 1.3.4.5:7000->1.3.4.8:7000-LARGE_MESSAGES-0b7d09cd unexpected exception caught while processing inbound messages; terminating connection
> java.lang.IllegalArgumentException: initialBuffer is not a direct buffer.
> at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:87)
> at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:59)
> at org.apache.cassandra.net.BufferPoolAllocator$Wrapped.<init>(BufferPoolAllocator.java:95)
> at org.apache.cassandra.net.BufferPoolAllocator.newDirectBuffer(BufferPoolAllocator.java:56)
> at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
> at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
> at io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53)
> at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
> at io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75)
> at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:777)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:424)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:835)
> {code}


