Posted to users@tomcat.apache.org by Dave B <db...@bossnine.com> on 2022/12/02 14:11:29 UTC

Farm deploy random failures

I'm having intermittent failures when I deploy to a cluster. I see the 
WAR file sent to the slave nodes, but it then becomes zero size. It 
happens on different nodes and not every time.

Upon failure, Master node .out shows

SEVERE [Catalina-utility-1] org.apache.catalina.ha.tcp.SimpleTcpCluster.send Unable to send message through cluster sender.
     org.apache.catalina.tribes.ChannelException: Send failed, attempt:[1] max:[1]; Faulty members:tcp://{172, xx, xx, xx}:5222;
         at org.apache.catalina.tribes.transport.nio.ParallelNioSender.doLoop(ParallelNioSender.java:217)
         at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:78)
         at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:51)
         at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:65)
         at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:83)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
         at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:62)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:89)
         at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:93)


Slave node .out shows



WARNING [Tribes-Task-Receiver[localhost-Channel]-7] org.apache.catalina.tribes.group.GroupChannel.messageReceived Error receiving message:
     java.lang.NullPointerException
         at org.apache.catalina.ha.deploy.FileMessageFactory.writeMessage(FileMessageFactory.java:247)
         at org.apache.catalina.ha.deploy.FarmWarDeployer.messageReceived(FarmWarDeployer.java:226)
         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:821)
         at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:803)
         at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:345)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
         at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:118)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
         at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.messageReceived(ThroughputInterceptor.java:94)
         at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:96)
         at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:288)
         at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:272)
         at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:229)
         at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:103)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
         at java.lang.Thread.run(Thread.java:750)


Here is the cluster section of the master node's server.xml:


         <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                  channelSendOptions="6">
           <Manager className="org.apache.catalina.ha.session.BackupManager"
                    expireSessionsOnShutdown="false"
                    notifyListenersOnReplication="true"
                    sessionAttributeValueClassNameFilter=".+"
                    mapSendOptions="6"/>
           <Channel className="org.apache.catalina.tribes.group.GroupChannel">
             <Membership className="org.apache.catalina.tribes.membership.McastService"
                         address="xxx.xxx.xxx.xxx"
                         port="xxxx"
                         frequency="500"
                         dropTime="5000"
                         localLoopbackDisabled="false"/>
             <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                       address="auto"
                       port="5221"
                       selectorTimeout="100"
                       maxThreads="20"
                       timeout="5000"
                       autoBind="1000"/>
             <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
               <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
                          timeout="5000"/>
             </Sender>
             <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"
                          connectTimeout="5000"/>
             <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor"/>
             <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
           </Channel>
           <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
                  filter=".*\.gif|.*\.js|.*\.jpeg|.*\.jpg|.*\.png|.*\.htm|.*\.html|.*\.css|.*\.txt"/>
           <Deployer className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                     tempDir="/apps/tomcat/env22_node1/temp/"
                     deployDir="/apps/tomcat/env22_node1/webapps/"
                     watchDir="/apps/deployments/tomcat/env22/"
                     watchEnabled="true"/>
           <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
         </Cluster>


Is this enough info for someone to suggest a fix?

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: Farm deploy random failures

Posted by Mark Thomas <ma...@apache.org>.
Exact Tomcat version?

Is this on physical machines or on VMs?

Are there associated warning messages in the logs before the failure 
message about retries?

I've looked through the relevant cluster code and I don't see anything 
obvious that could cause this in terms of a Tomcat bug. Increasing 
maxRetryAttempts and/or timeout may help.
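
As a rough sketch of that suggestion, assuming the Sender/Transport 
element from the configuration posted in the original message, the 
change might look something like the fragment below. The values 3 and 
10000 are illustrative assumptions only, not tested recommendations; 
the "attempt:[1] max:[1]" in the master node log suggests that only a 
single send attempt is currently being made.

```xml
<Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
  <!-- Illustrative values only: allow up to 3 send attempts instead of 1,
       and raise the sender timeout from the 5000 ms in the posted config. -->
  <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
             maxRetryAttempts="3"
             timeout="10000"/>
</Sender>
```

If something like this is tried, it would presumably need to be applied 
on every node that sends WAR files, followed by a restart, and the logs 
watched to see whether the "Send failed" message still appears.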

Mark


