Posted to jira@kafka.apache.org by "David (JIRA)" <ji...@apache.org> on 2019/05/14 02:49:00 UTC

[jira] [Commented] (KAFKA-8103) Kafka SIGSEGV on kafka-network-thread

    [ https://issues.apache.org/jira/browse/KAFKA-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839036#comment-16839036 ] 

David commented on KAFKA-8103:
------------------------------

[~ijuma]

Over the last month we have seen a few more crashes on this same cluster, each with a different top-level problematic frame. Do you think they could all come from the same underlying issue?

 
{code:java}
# Problematic frame:
# J 16372 C2 org.apache.kafka.common.network.Selector.pollSelectionKeys(Ljava/util/Set;ZJ)V (543 bytes) @ 0x00007fc0233ebe0c [0x00007fc0233eb8c0+0x54c]
Register to memory mapping:

RAX=0x0000000000000001 is an unknown value
RBX=0x00007fbe5fb72880 is pointing into the stack for thread: 0x00007fc0315a8800
RCX=0x0000000000000040 is an unknown value
RDX=0x0000000000001762 is an unknown value
RSP=0x00007fbe5fb728a0 is pointing into the stack for thread: 0x00007fc0315a8800
RBP=0x00000000eafa9b17 is an unknown value
RSI=0x000000054f001f38 is an oop
sun.nio.ch.EPollArrayWrapper
- klass: 'sun/nio/ch/EPollArrayWrapper'
RDI=0x00000000a9e003e7 is an unknown value
R8 =0x0000000000000000 is an unknown value
R9 =0x0000000000010000 is an unknown value
R10=0x0000000000000000 is an unknown value
R11=0x00000000aa418a8b is an unknown value
R12=0x0000000000000000 is an unknown value
R13=0x00000005520c5458 is an oop
sun.nio.ch.SelectionKeyImpl
- klass: 'sun/nio/ch/SelectionKeyImpl'
R14=0x000000055063faa8 is an oop
java.lang.Object
- klass: 'java/lang/Object'
R15=0x00007fc0315a8800 is a thread
{code}
{code:java}
# J 1826 C2 java.nio.Buffer.limit(I)Ljava/nio/Buffer; (62 bytes) @ 0x00007fa0216f52c0 [0x00007fa0216f52a0+0x20]
Register to memory mapping:

RAX=0x000000073f09e6a0 is an oop
java.nio.HeapByteBuffer
- klass: 'java/nio/HeapByteBuffer'
RBX=0x000000073f09e6a0 is an oop
java.nio.HeapByteBuffer
- klass: 'java/nio/HeapByteBuffer'
RCX=0x000000000000a6cf is an unknown value
RDX=0x000000000000a6cf is an unknown value
RSP=0x00007f9d6f137748 is pointing into the stack for thread: 0x00007fa031485800
RBP=0x000000054a3a69e0 is an oop
org.apache.kafka.common.network.PlaintextTransportLayer
- klass: 'org/apache/kafka/common/network/PlaintextTransportLayer'
RSI=0x000000073f09e6a0 is an oop
java.nio.HeapByteBuffer
- klass: 'java/nio/HeapByteBuffer'
RDI=0x00007f9fb13f84f3 is an unknown value
R8 =0x000000074041c2c8 is an oop
java.nio.HeapByteBuffer
- klass: 'java/nio/HeapByteBuffer'
R9 =0x00000000e808385f is an unknown value
R10=0x000000000000a6cf is an unknown value
R11=0x000000073f09e6a0 is an oop
java.nio.HeapByteBuffer
- klass: 'java/nio/HeapByteBuffer'
R12=0x0000000000000000 is an unknown value
R13=0x00007f9fada00000 is an unknown value
R14=0x00000000e80852ac is an unknown value
R15=0x00007fa031485800 is a thread
{code}
{code:java}
# Problematic frame:
# J 10102 C2 sun.nio.ch.FileChannelImpl.size()J (239 bytes) @ 0x00007fdc9aa2aa40 [0x00007fdc9aa2aa20+0x20]
Register to memory mapping:

RAX=0x0000000000000026 is an unknown value
RBX=0x0000000000000163 is an unknown value
RCX=0x0000000000000163 is an unknown value
RDX=0x0000000549a29d38 is an oop
sun.nio.ch.Util$1
- klass: 'sun/nio/ch/Util$1'
RSP=0x00007fdb18cc8848 is pointing into the stack for thread: 0x00007fdca943e000
RBP=0x0000000000000062 is an unknown value
RSI=0x0000000594f7e9b8 is an oop
sun.nio.ch.FileChannelImpl
- klass: 'sun/nio/ch/FileChannelImpl'
RDI=0x00007fdc5a042988 is an unknown value
R8 =0x0000000000000000 is an unknown value
R9 =0x00000000f805747d is an unknown value
R10=0x00000007190b80c0 is an oop
org.apache.kafka.common.record.FileRecords
- klass: 'org/apache/kafka/common/record/FileRecords'
R11=0x00000000b29efd37 is an unknown value
R12=0x0000000000000000 is an unknown value
R13=0x00000007190c0800 is an oop
java.util.ArrayDeque
- klass: 'java/util/ArrayDeque'
R14=0x0000000000000061 is an unknown value
R15=0x00007fdca943e000 is a thread
{code}
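
All three frames above are JIT-compiled Java frames (`J ... C2`), so it may help to tabulate the problematic frame and the oop-holding registers across every hs_err file from the cluster. The following is only a minimal sketch of how that could be done, not something we have run against the real logs: the function name `summarize` and both regular expressions are hypothetical, and they assume the dump layout shown in the excerpts above (a `# J <id> C2 <method>(...)` frame line, and the oop class name on the first non-blank line after an `is an oop` register line).

```python
import re

# Assumed layout: "# J <id> C1|C2 <method>(<sig>)... [<base>+<offset>]"
FRAME_RE = re.compile(
    r"^#\s*J\s+\d+\s+(?P<tier>C1|C2)\s+(?P<method>[^\s(]+)\("
    r".*\[0x[0-9a-f]+\+(?P<offset>0x[0-9a-f]+)\]"
)
# Assumed layout: "RAX=0x... is an oop" or "R8 =0x... is an unknown value"
REG_RE = re.compile(
    r"^(?P<reg>R[A-Z0-9]{1,2})\s*=\s*(?P<value>0x[0-9a-fA-F]+) is (?P<what>.+)$"
)


def summarize(text):
    """Return ((tier, method, offset), {register: oop class}) for one dump."""
    frame = None
    oops = {}
    pending = None  # register whose oop class is expected on the next non-blank line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = FRAME_RE.match(line)
        if m and frame is None:
            frame = (m["tier"], m["method"], m["offset"])
            continue
        m = REG_RE.match(line)
        if m:
            pending = m["reg"] if m["what"] == "an oop" else None
            continue
        if pending:
            oops[pending] = line
            pending = None
    return frame, oops
```

Calling `summarize(open(path).read())` on each hs_err file and grouping the results by method would show whether the crashes cluster in one code path or, as in the three excerpts here, are spread across unrelated C2-compiled methods.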

> Kafka SIGSEGV on kafka-network-thread
> -------------------------------------
>
>                 Key: KAFKA-8103
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8103
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: OS: Amazon Linux
> Kernel: 4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
> Java: openjdk version "1.8.0_191"
> OpenJDK Runtime Environment (build 1.8.0_191-b12)
> OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
> AWS Instance Type: c5.4xlarge
>            Reporter: Sean Humbarger
>            Priority: Major
>         Attachments: hs_err_pid4345.log
>
>
> We have a 4-node cluster (6 topics, 6 consumer groups) that processes 65,000 messages per second, and we are seeing SIGSEGV crashes at least once a day (see attachment). Each broker has six disks attached to hold the Kafka logs. When a crash occurs, we simply restart Kafka and everything seems fine. We don't see anything out of the ordinary in /var/log/messages or dmesg when the crashes occur. So far we have been unable to predict when during the day a crash will occur or which node it will occur on.
>  
> The problematic frame is as follows:
> {code:java}
> # Problematic frame:
> # J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x00007ff779f9fca0 [0x00007ff779f9fc80+0x20]
> {code}


