Posted to users@activemq.apache.org by "Peter Robinson (LCL-C)" <Pe...@loblaw.ca> on 2021/05/11 22:57:26 UTC

Apache Artemis 2.17.0 address federation on Pub/Sub: a large message causes the upstream broker to enter a permanent high-CPU spin state

The test harness is on Win 10 / java 1.8.0_121.

There are 2 WAN-connected brokers running: the Publisher (upstream), called 'A', and the Subscriber (downstream), called 'One'.  The Publisher code connects locally to the broker on its own localhost, and the Subscriber code connects locally to the broker on its own localhost.

We intend to let the Brokers manage the WAN networking connectivity issues, and that is why the code always connects to localhost.

There are actually 3 Publishers (nodes), and 2 Subscribers (nodes).  We don't need to complicate the problem statement, so please don't offer alternate queueing topologies.  Please stick with n-Publishers and m-Subscribers, where we only need one of each to demonstrate the problem.

This is all working with 3 Pubs and 2 Subs (current target state), where every message published by any of the Pubs arrives at both the Subs (and on the Pub node too).  The problem is not a topology issue.

For the test harness, I simply made 2 Brokers running on my PC using different acceptor ports, and web binding ports.

Both broker.xml files pre-define the identical address and durable multicast queue.

      <addresses>
         <address name="theTopic">
            <multicast>
               <queue name="theClientID.subscriberName">
                  <durable>true</durable>
               </queue>
            </multicast>
         </address>
      </addresses>

The Subscriber broker 'One' defines the (upstream) Address federation.

      <connectors>
         <connector name="A-connector">tcp://localhost:61616</connector>
      </connectors>

      <federations>
        <federation name="oneFederation">
          <upstream name="fromAtoOneUpstream">
             <static-connectors>
                <connector-ref>A-connector</connector-ref>
             </static-connectors>
             <policy ref="policySet"/>
          </upstream>

          <policy-set name="policySet">
             <policy ref="addressPolicy" />
          </policy-set>

          <address-policy name="addressPolicy" max-hops="1">
             <include address-match="theTopic.#" />
             <exclude address-match="DLQ" />
             <exclude address-match="ExpiryQueue" />
          </address-policy>
        </federation>
      </federations>

The rest of broker.xml is as-created by ``artemis.cmd create *blah*``.

jndi is:
java.naming.factory.initial=org.apache.activemq.artemis.jndi.ActiveMQInitialContextFactory
connectionFactory.ConnectionFactoryA=tcp://localhost:61616
connectionFactory.ConnectionFactoryOne=tcp://localhost:61617
topic.topic/theTopic=theTopic
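
For completeness, the harness resolves everything through that file.  A minimal lookup sketch (the names ConnectionFactoryA, ConnectionFactoryOne and topic/theTopic come from the jndi.properties entries above; the rest is assumed boilerplate, not my exact code):

import javax.jms.ConnectionFactory;
import javax.jms.Topic;
import javax.naming.InitialContext;

public class JndiLookupSketch {
    public static void main(String[] args) throws Exception {
        // Reads jndi.properties from the classpath (the file shown above).
        InitialContext ctx = new InitialContext();
        ConnectionFactory cfA   = (ConnectionFactory) ctx.lookup("ConnectionFactoryA");   // broker 'A' on :61616
        ConnectionFactory cfOne = (ConnectionFactory) ctx.lookup("ConnectionFactoryOne"); // broker 'One' on :61617
        Topic theTopic = (Topic) ctx.lookup("topic/theTopic");
        System.out.println("Resolved " + theTopic + " via " + cfA + " and " + cfOne);
    }
}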

The first time, I start the Publisher 'A' first, then the Subscriber 'One'.  Thereafter, start order doesn't seem to matter.

A message published to A.theTopic remains in A.theClientID.subscriberName, and arrives in One.theClientID.subscriberName.  A subscriber on each of 'A' and 'One' reads the identical message off the queue.  OK, not strictly identical - the JMSReceived value is different, and presumably the internal message numbers differ between the brokers.
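
The subscriber side of the harness is essentially the following rough sketch (not my exact code); the client ID 'theClientID' and subscription name 'subscriberName' are chosen so the durable subscription maps onto the pre-created theClientID.subscriberName queue:

import javax.jms.*;
import javax.naming.InitialContext;

public class SubscriberSketch {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        // Look up ConnectionFactoryA to read on broker 'A', ConnectionFactoryOne to read on broker 'One'.
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("ConnectionFactoryOne");
        Topic topic = (Topic) ctx.lookup("topic/theTopic");

        try (Connection conn = cf.createConnection()) {
            conn.setClientID("theClientID");
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            // Durable subscriber backed by the theClientID.subscriberName multicast queue.
            MessageConsumer consumer = session.createDurableSubscriber(topic, "subscriberName");
            conn.start();

            Message m;
            while ((m = consumer.receive(5000)) != null) {
                System.out.println("received " + ((TextMessage) m).getText().length() + " chars");
            }
        }
    }
}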

Test harness: I use random-sized TextMessages between 1k and 10k, each containing a standard-format first line followed by random text to fill out the size.  If I publish 1,000 messages on A.theTopic, then A.theClientID.subscriberName has 1,000 messages and One.theClientID.subscriberName has 1,000 messages.  Correct.
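
The publish side is roughly this (again an assumed sketch rather than my exact harness); bumping one of the messages up to around 135k is what triggers the failure described below:

import java.util.Random;
import javax.jms.*;
import javax.naming.InitialContext;

public class PublisherSketch {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("ConnectionFactoryA");
        Topic topic = (Topic) ctx.lookup("topic/theTopic");
        Random rnd = new Random();

        try (Connection conn = cf.createConnection()) {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(topic);

            for (int i = 1; i <= 1000; i++) {
                int size = 1024 + rnd.nextInt(9 * 1024);   // random size between 1k and 10k
                StringBuilder body = new StringBuilder(String.format("MSG %05d size=%d%n", i, size));
                while (body.length() < size) {
                    body.append((char) ('a' + rnd.nextInt(26)));   // random filler text
                }
                producer.send(session.createTextMessage(body.toString()));
            }
        }
    }
}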

All works just fine with 3 Publishers (A, B, C for example) each publishing 1,000 messages; the Subscribers (One, Two for example) each get 3,000 messages.  Correct.

If I send 10 messages, but the 5th message is 135k in size, then One.theClientID.subscriberName only receives 4 messages.

Broker 'A' spins up to 99% CPU and enters a tight loop where it appears to continuously read message #5, decrement the count to 4 remaining, then (for some reason) add it back and increment to 5 remaining.  Rinse and repeat.

The browser console shows the queue depth of A.theClientID.subscriberName as either 10 (before I let the local Subscriber loose) or 0 (after), but the federated queue 'A.federated.oneFederation.fromAtoOneUpstream.theTopic.multicast' shows 6.  And it remains at 6.

The browser console shows the queue depth of One.theClientID.subscriberName as 4.  And it remains at 4.

Stopping and restarting broker 'A' has it spinning up to 99% CPU again.  Stopping both brokers and restarting also has broker 'A' spinning up to 99% CPU.

I now have to destroy broker 'A' and re-create it.

The broker.xml acceptor shows (the default) tcp://0.0.0.0:61616?tcpSendBufferSize=1048576;tcpReceiveBufferSize=1048576;amqpMinLargeMessageSize=102400.   Or :61617 for 'One'.

I have not specified URL parameters on the JMS jndi connector such as connectionFactory.ConnectionFactoryA=tcp://localhost:61616?minLargeMessageSize=250000.

I have not enabled compression in broker.xml via <compress-large-messages>true</compress-large-messages>, or via JMS jndi such as connectionFactory.ConnectionFactoryA=tcp://localhost:61616?minLargeMessageSize=250000&compressLargeMessages=true.

Editing etc/logging.properties and changing everything to do with artemis from INFO to DEBUG produces huge files; that is where I saw the decrementing and incrementing logging.
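
For reference, the levels I flipped look roughly like this in the stock etc/logging.properties (logger names from the 2.17.0 default as I recall them; they may differ slightly):

# etc/logging.properties (excerpt) - artemis loggers changed from INFO to DEBUG
logger.org.apache.activemq.artemis.core.server.level=DEBUG
logger.org.apache.activemq.artemis.journal.level=DEBUG
logger.org.apache.activemq.artemis.utils.level=DEBUG
logger.org.apache.activemq.artemis.jms.level=DEBUG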

Anyone have any thoughts, or able to confirm by reproducing?  I can additionally post the really simple Java test harness code.

Thank you,