Posted to commits@samza.apache.org by "Rayman (Jira)" <ji...@apache.org> on 2020/08/10 23:14:00 UTC
[jira] [Commented] (SAMZA-2577) Threads appending to StreamAppender
block/deadlock in high tput scenarios, leading to processing stalls
[ https://issues.apache.org/jira/browse/SAMZA-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175097#comment-17175097 ]
Rayman commented on SAMZA-2577:
-------------------------------
Log4j2 fix: [https://github.com/apache/samza/pull/1411/files]
> Threads appending to StreamAppender block/deadlock in high tput scenarios, leading to processing stalls
> -------------------------------------------------------------------------------------------------------
>
> Key: SAMZA-2577
> URL: https://issues.apache.org/jira/browse/SAMZA-2577
> Project: Samza
> Issue Type: Bug
> Reporter: Rayman
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Problem:
> In the StreamAppender for both log4j1 and log4j2, a blocking queue is used to coordinate between the append()-ing threads and a single thread send()-ing to Kafka.
> This is a bounded, blocking, lock-synchronized queue.
> To avoid deadlock scenarios (see SAMZA-1537), the append()-ing threads have a timeout of 2 seconds, after which the log message is discarded and the queue is drained.
> This means that during message bursts, threads calling append() may block for up to 2 seconds and may repeatedly hit this timeout, leading to processing stalls and lowered throughput.
> *Solutions for Log4j2*
> Solution 1. Enable async loggers in log4j2, which supports and provides them ([https://logging.apache.org/log4j/2.x/manual/async.html]).
> With async loggers, the blocking queue in StreamAppender is not required because the logger itself is asynchronous, so append() threads can call systemProducer.send() directly.
> However, if async loggers are not used, this queue-based mechanism is still required to give the append()-ing threads an "async" illusion.
> Solution 2. Continue using the blocking bounded lock-based queue, but make the queue size and timeout configurable. Users can then tune this to account for message bursts.
> Solution 3. Move to a lock-less queue, e.g., ConcurrentLinkedQueue (unbounded), implement a bounded lock-less queue, or use an [open-source implementation|https://stackoverflow.com/questions/20890554/lock-free-circular-array].
> Append()-ing threads will no longer need to block or time out. However, since a lock-less queue is non-blocking and relies on CAS operations, the caller may busy-wait unless it polls at a fixed rate or sleeps for a fixed time between attempts.
> *For log4j2, we will adopt Solution 1.*
> *Solutions for Log4j1*
> Solution 1. Deprecate – log4j1 is not supported.
> Solution 2. Similar to Solution 2 above.
> Solution 3. Similar to Solution 3 above.
> *For log4j1, we will adopt Solution 1 – won't fix.*
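
For readers unfamiliar with the pattern described under "Problem" above, here is a minimal sketch of a bounded blocking queue with a timed append. It is not the actual StreamAppender code: the class name BoundedAppendSketch, its methods, and the reading of "drained" as "queued messages are dropped" are assumptions made for illustration. The queue size and timeout are constructor arguments, roughly as Solution 2 for log4j2 suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is NOT the Samza StreamAppender.
final class BoundedAppendSketch {
    private final BlockingQueue<String> queue;
    private final long timeoutMs;

    // Queue size and timeout are parameters here (assumed, in the spirit of Solution 2).
    BoundedAppendSketch(int queueSize, long timeoutMs) {
        this.queue = new LinkedBlockingQueue<>(queueSize);
        this.timeoutMs = timeoutMs;
    }

    // Returns true if the message was enqueued. If the queue stays full for the
    // whole timeout, the message is discarded, the queue is drained (assumption:
    // drained messages are dropped), and false is returned.
    boolean append(String message) {
        try {
            if (queue.offer(message, timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
            List<String> dropped = new ArrayList<>();
            queue.drainTo(dropped);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    // Number of messages currently waiting for the sender thread.
    int pending() {
        return queue.size();
    }

    public static void main(String[] args) {
        // No sender thread is running, so the queue fills up after two messages.
        BoundedAppendSketch appender = new BoundedAppendSketch(2, 50);
        System.out.println(appender.append("a")); // true
        System.out.println(appender.append("b")); // true
        System.out.println(appender.append("c")); // false, after blocking ~50 ms
        System.out.println(appender.pending());   // 0, the queue was drained
    }
}
```

With the 2-second timeout from the issue, every append() on a full queue would hold its calling thread for up to 2 seconds in offer(), which is the stall described above.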
--
This message was sent by Atlassian Jira
(v8.3.4#803005)