Posted to dev@kafka.apache.org by "Apurva Mehta (JIRA)" <ji...@apache.org> on 2017/06/01 06:57:04 UTC
[jira] [Comment Edited] (KAFKA-5357) StackOverFlow error in transaction coordinator
[ https://issues.apache.org/jira/browse/KAFKA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032547#comment-16032547 ]
Apurva Mehta edited comment on KAFKA-5357 at 6/1/17 6:56 AM:
-------------------------------------------------------------
The stack overflow error occurs because we end up in a tight recursive loop when handling transaction marker write completion. Here is what happens:
# We send the marker.
# Upon success, we try to write the updated transaction metadata to the transaction log to move it from the PrepareXX state to the CompletedXX state.
# If this append fails, we make a recursive call to `appendTransactionToLog` with the same metadata update, ad infinitum.
# When brokers bounce and not enough replicas are available, this can continue in a tight loop for tens of seconds, resulting in a stack overflow error.
One fix is to back off and retry rather than looping tightly, which is what the client does.
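To illustrate the difference, here is a minimal Java sketch. It is not the coordinator code: `FlakyLog`, `retryRecursively`, and `retryWithBackoff` are hypothetical names made up for this example, and the fixed backoff interval is an assumption. The first method retries from inside the completion callback, so every failed append leaves more frames on the stack. The second hands the retry to a scheduler, so the stack unwinds between attempts.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class BackoffRetrySketch {

    /** Hypothetical stand-in for the transaction log: the first `failures` appends fail. */
    static final class FlakyLog {
        private final AtomicInteger remainingFailures;
        private final AtomicInteger attempts = new AtomicInteger();

        FlakyLog(int failures) {
            this.remainingFailures = new AtomicInteger(failures);
        }

        /** Invokes the callback synchronously, on the caller's stack. */
        void append(String metadata, Consumer<Boolean> callback) {
            attempts.incrementAndGet();
            boolean succeeded = remainingFailures.getAndDecrement() <= 0;
            callback.accept(succeeded);
        }

        int attempts() {
            return attempts.get();
        }
    }

    /** Tight recursive retry: returns the deepest nesting level reached. */
    static int retryRecursively(FlakyLog log, String metadata) {
        AtomicInteger maxDepth = new AtomicInteger();
        retryRecursively(log, metadata, 0, maxDepth);
        return maxDepth.get();
    }

    private static void retryRecursively(FlakyLog log, String metadata,
                                         int depth, AtomicInteger maxDepth) {
        maxDepth.accumulateAndGet(depth, Math::max);
        log.append(metadata, ok -> {
            // Retrying here nests a new append inside the current callback.
            if (!ok) retryRecursively(log, metadata, depth + 1, maxDepth);
        });
    }

    /** Retry with a fixed backoff: returns the total number of append attempts. */
    static int retryWithBackoff(FlakyLog log, String metadata, long backoffMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch done = new CountDownLatch(1);
        try {
            attempt(log, metadata, backoffMs, scheduler, done);
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("append did not complete in time");
            }
            return log.attempts();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
    }

    private static void attempt(FlakyLog log, String metadata, long backoffMs,
                                ScheduledExecutorService scheduler, CountDownLatch done) {
        log.append(metadata, ok -> {
            if (ok) {
                done.countDown();
            } else {
                // The retry runs later as a fresh task, so this callback returns first.
                scheduler.schedule(() -> attempt(log, metadata, backoffMs, scheduler, done),
                        backoffMs, TimeUnit.MILLISECONDS);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("recursive max depth: " + retryRecursively(new FlakyLog(50), "txn"));
        System.out.println("backoff attempts: " + retryWithBackoff(new FlakyLog(3), "txn", 5L));
    }
}
```

In the recursive variant the nesting depth grows by one per failed append, so a long enough run of failures exhausts the stack. In the scheduled variant the depth stays constant no matter how many retries are needed, and the backoff also avoids hammering the log while replicas are unavailable.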
was (Author: apurva):
The stack overflow error occurs because we end up in a tight recursive loop when handling transaction marker write completion. Here is what happens:
# We send the marker.
# Upon success, we try to write the updated transaction metadata to the transaction log to move it from the PrepareXX state to the CompletedXX state.
# If this append fails, we make a recursive call to `appendTransactionToLog`, ad infinitum.
# When brokers bounce and not enough replicas are available, this can continue in a tight loop for tens of seconds, resulting in a stack overflow error.
One fix is to back off and retry rather than looping tightly, which is what the client does.
> StackOverFlow error in transaction coordinator
> ----------------------------------------------
>
> Key: KAFKA-5357
> URL: https://issues.apache.org/jira/browse/KAFKA-5357
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.11.0.0
> Reporter: Apurva Mehta
> Priority: Blocker
> Labels: exactly-once
> Fix For: 0.11.0.0
>
> Attachments: KAFKA-5357.tar.gz
>
>
> I observed the following in the broker logs:
> {noformat}
> [2017-06-01 04:10:36,664] ERROR [Replica Manager on Broker 1]: Error processing append operation on partition __transaction_state-37 (kafka.server.ReplicaManager)
> [2017-06-01 04:10:36,667] ERROR [TxnMarkerSenderThread-1]: Error due to (kafka.common.InterBrokerSendThread)
> java.lang.StackOverflowError
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.PrintWriter.<init>(PrintWriter.java:116)
> at java.io.PrintWriter.<init>(PrintWriter.java:100)
> at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
> at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
> at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
> at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
> at org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:369)
> at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> at org.apache.log4j.Category.callAppenders(Category.java:206)
> at org.apache.log4j.Category.forcedLog(Category.java:391)
> at org.apache.log4j.Category.error(Category.java:322)
> at kafka.utils.Logging$class.error(Logging.scala:105)
> at kafka.server.ReplicaManager.error(ReplicaManager.scala:122)
> at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:557)
> at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:505)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
> at scala.collection.immutable.Map$Map1.foreach(Map.scala:116)
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
> at scala.collection.AbstractTraversable.map(Traversable.scala:104)
> at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:505)
> at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:346)
> at kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply$mcV$sp(TransactionStateManager.scala:589)
> at kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570)
> at kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1.apply(TransactionStateManager.scala:570)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
> at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:219)
> at kafka.coordinator.transaction.TransactionStateManager.appendTransactionToLog(TransactionStateManager.scala:564)
> at kafka.coordinator.transaction.TransactionMarkerChannelManager.kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1(TransactionMarkerChannelManager.scala:225)
> at kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225)
> at kafka.coordinator.transaction.TransactionMarkerChannelManager$$anonfun$kafka$coordinator$transaction$TransactionMarkerChannelManager$$retryAppendCallback$1$4.apply(TransactionMarkerChannelManager.scala:225)
> at kafka.coordinator.transaction.TransactionStateManager.kafka$coordinator$transaction$TransactionStateManager$$updateCacheCallback$1(TransactionStateManager.scala:561)
> at kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1$$anonfun$apply$mcV$sp$4.apply(TransactionStateManager.scala:595)
> at kafka.coordinator.transaction.TransactionStateManager$$anonfun$appendTransactionToLog$1$$anonfun$apply$mcV$sp$4.apply(TransactionStateManager.scala:595)
> at kafka.server.ReplicaManager.appendRecords(ReplicaManager.scala:373)
> {noformat}