Posted to issues@activemq.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/09/22 13:56:00 UTC

[jira] [Commented] (ARTEMIS-3264) Core to AMQP conversion error causes client disconnect

    [ https://issues.apache.org/jira/browse/ARTEMIS-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17608286#comment-17608286 ] 

ASF subversion and git services commented on ARTEMIS-3264:
----------------------------------------------------------

Commit cd7555c523297ec77d74feff465cb8984bf8c3c4 in activemq-artemis's branch refs/heads/main from Justin Bertram
[ https://gitbox.apache.org/repos/asf?p=activemq-artemis.git;h=cd7555c523 ]

ARTEMIS-3264 handle core-to-AMQP conversion failures more gracefully

If an AMQP consumer tries to receive a message and the broker is unable
to convert the message from core to AMQP then the consumer is
disconnected and the offending message stays in the queue. When the
consumer reconnects the conversion error will happen again resulting in
a loop that can only be resolved through administrative action (e.g.
deleting the message manually or sending it to a dead letter address).

This commit fixes that problem by detecting the conversion problem and
sending the message to the queue's dead letter address. It also doesn't
disconnect the consumer.

This commit also changes the log messages associated with sending a
message to the dead letter address since this event can now occur
regardless of the delivery attempts.
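
The behaviour described in this commit message can be illustrated with a small, self-contained sketch. It is not the broker's code: the class, the converter, the exception type and the in-memory "dead letter" list are all hypothetical stand-ins. It only shows the general pattern of catching a conversion failure for a single message, routing that message aside, and continuing delivery instead of tearing down the consumer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class ConversionFailureSketch {

    /** Hypothetical stand-in for a conversion failure; not an Artemis class. */
    static class ConversionException extends RuntimeException {
        ConversionException(String message) {
            super(message);
        }
    }

    /** Hypothetical converter: fails for any message whose body starts with "bad". */
    static String convertCoreToAmqp(String coreMessage) {
        if (coreMessage.startsWith("bad")) {
            throw new ConversionException("cannot convert: " + coreMessage);
        }
        return "amqp:" + coreMessage;
    }

    /** Holds what reached the consumer and what was routed aside. */
    static class Result {
        final List<String> delivered = new ArrayList<>();
        final List<String> deadLettered = new ArrayList<>();
    }

    /**
     * Drains the queue. A message that cannot be converted is moved to the
     * dead letter list and delivery continues with the next message, so the
     * consumer is never disconnected and the queue is not blocked.
     */
    static Result deliverAll(Deque<String> queue) {
        Result result = new Result();
        while (!queue.isEmpty()) {
            String message = queue.poll();
            try {
                result.delivered.add(convertCoreToAmqp(message));
            } catch (ConversionException e) {
                // Assumption: routing aside is modelled as a plain list append.
                result.deadLettered.add(message);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Deque<String> queue = new ArrayDeque<>(Arrays.asList("m1", "bad-m2", "m3"));
        Result result = deliverAll(queue);
        // prints: delivered=[amqp:m1, amqp:m3] deadLettered=[bad-m2]
        System.out.println("delivered=" + result.delivered
                + " deadLettered=" + result.deadLettered);
    }
}
```

In this sketch the failing message is removed from the queue before conversion is attempted, which is what prevents the reconnect-and-fail loop described above; the real broker's queue and delivery internals are considerably more involved.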


> Core to AMQP conversion error causes client disconnect
> ------------------------------------------------------
>
>                 Key: ARTEMIS-3264
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3264
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP, Broker
>    Affects Versions: 2.17.0
>         Environment: Embedded Apache Artemis 2.17.0
> Windows Server 2016 Standard (10.0.14393)
> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
>            Reporter: Christian Danner
>            Assignee: Justin Bertram
>            Priority: Critical
>         Attachments: activemq_artemis.log
>
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> We are deploying a mesh of embedded brokers and by default use core bridges to replicate data between different broker instances / topics.
> The clients that actually consume messages are connected using AMQP (QPID, AMQP.Net Lite).
> Recently we encountered a situation where the broker could not deliver a message to a (Java QPID) client because the internal conversion from Core to AMQP failed (see attached log file).
> As a result the client was disconnected and stopped receiving messages altogether. It was stuck in a JMS receive call and apparently was not informed about the disconnect (not sure if this is a QPID/Proton issue). Even after a restart the client was not able to connect to the server anymore; we had to restart the server to be able to connect again!
> We are currently working around this issue by using AMQP (i.e. JMS) as the only client-side protocol, so that Core-to-AMQP conversion never happens in the first place.
> However, I'm wondering if the way the broker deals with such errors is a good idea - it disconnects the client and keeps the message in the queue, so even after reconnect the delivery fails again with the same Exception!
> Looking at the call stack (ending up in QueueImpl:3800) this kind of error is handled in a very generic way - the handler method does not distinguish between different types of Exceptions and knows nothing about the reason why delivery failed, however it still defaults to disconnecting the corresponding client.
> I think that in the situation described above the broker should instead forward the erroneous message to a DLQ and continue with the next message. Currently the message clogs the queue and needs to be deleted / moved manually in order for processing to continue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)