Posted to issues@activemq.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/09/20 11:54:00 UTC

[jira] [Work logged] (ARTEMIS-3264) Core to AMQP conversion error causes client disconnect

     [ https://issues.apache.org/jira/browse/ARTEMIS-3264?focusedWorklogId=810363&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-810363 ]

ASF GitHub Bot logged work on ARTEMIS-3264:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Sep/22 11:53
            Start Date: 20/Sep/22 11:53
    Worklog Time Spent: 10m 
      Work Description: gemmellr commented on PR #4225:
URL: https://github.com/apache/activemq-artemis/pull/4225#issuecomment-1252244000

   'Fix' the logger codes in what sense? Is it that they don't follow the 'number also conveys level' convention? I know @clebertsuconic hates that and doesn't intend to do it going forward (and various things clearly haven't followed it historically), so I'm not sure this change actually makes sense at this point if that's the reason.
   
   Regardless, it doesn't seem nice to swap the codes of distinct messages so that they reuse codes other messages were previously using, which this seems to do a couple of times. It also leaves the original numbers open for later reuse in a couple more instances. If the codes are changed, they should probably use new non-clashing numbers, and we should try to prevent the old ones from being reused; I think I saw mention of deprecating rather than removing methods to prevent that.
   
   It doesn't really seem like a NO-JIRA change, whichever way it is actually changed.




Issue Time Tracking
-------------------

            Worklog Id:     (was: 810363)
    Remaining Estimate: 0h
            Time Spent: 10m

> Core to AMQP conversion error causes client disconnect
> ------------------------------------------------------
>
>                 Key: ARTEMIS-3264
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3264
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: AMQP, Broker
>    Affects Versions: 2.17.0
>         Environment: Embedded Apache Artemis 2.17.0
> Windows Server 2016 Standard (10.0.14393)
> Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
>            Reporter: Christian Danner
>            Assignee: Justin Bertram
>            Priority: Critical
>         Attachments: activemq_artemis.log
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are deploying a mesh of embedded brokers and by default use core bridges to replicate data between different broker instances / topics.
> The clients that actually consume messages are connected using AMQP (QPID, AMQP .Net Lite).
> Recently we encountered a situation where the broker could not deliver a message to a (Java QPID) client because the internal conversion from Core to AMQP failed (see attached log file).
> As a result, the client was disconnected and no longer received any messages. It was stuck in a JMS receive call and obviously was not informed about the disconnect (not sure if this is a QPID/Proton issue). Even after a restart, the client was unable to connect to the server again; we had to restart the server before it could reconnect.
> We are currently working around this issue by using AMQP (i.e. JMS) as the only client-side protocol so that the Core-AMQP conversion does not happen in the first place.
> However, I'm wondering if the way the broker deals with such errors is a good idea: it disconnects the client and keeps the message in the queue, so even after reconnect the delivery fails again with the same Exception!
> Looking at the call stack (ending up in QueueImpl:3800), this kind of error is handled in a very generic way: the handler method does not distinguish between different types of Exceptions and knows nothing about the reason why delivery failed, yet it still defaults to disconnecting the corresponding client.
> I think in the situation described above it would be necessary to forward the erroneous message to a DLQ instead and continue with the next message. Currently the message clogs the queue and needs to be deleted / moved manually in order for processing to continue.
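
The behaviour requested above (dead-letter the message that cannot be converted, then carry on with the next one) can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: ConversionException, DeliveryLoop, and the in-memory collections are stand-ins invented for illustration and are not Artemis classes or the actual QueueImpl logic.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

// Hypothetical exception type marking a failed protocol conversion (not an Artemis class).
final class ConversionException extends RuntimeException {
    ConversionException(String message) {
        super(message);
    }
}

// Hypothetical, simplified delivery loop; plain strings stand in for messages.
final class DeliveryLoop {
    final Deque<String> queue = new ArrayDeque<>();
    final List<String> delivered = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();

    // Drains the queue. A message whose conversion fails is moved to the
    // dead-letter list and delivery continues with the next message, so one
    // bad message neither blocks the queue nor ends the consumer's session.
    void drain(Function<String, String> converter) {
        String message;
        while ((message = queue.poll()) != null) {
            try {
                delivered.add(converter.apply(message));
            } catch (ConversionException e) {
                // Only the conversion failure is handled here; any other
                // exception type still propagates to the caller.
                deadLetters.add(message);
            }
        }
    }
}
```

The point of the sketch is the narrow catch clause: the failure reason is known at that point, so the poisoned message can be set aside without treating the consumer as faulty.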



--
This message was sent by Atlassian Jira
(v8.20.10#820010)