Posted to dev@tomcat.apache.org by bu...@apache.org on 2010/03/18 12:09:27 UTC
DO NOT REPLY [Bug 48934] New: Cluster's regression. When replication
fails once, replication can be never done again.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48934
Summary: Cluster's regression. When replication fails once,
replication can be never done again.
Product: Tomcat 6
Version: 6.0.26
Platform: All
OS/Version: All
Status: NEW
Severity: regression
Priority: P2
Component: Cluster
AssignedTo: dev@tomcat.apache.org
ReportedBy: fujino.keiichi@oss.ntt.co.jp
I found a regression in the Cluster component of Tomcat 6.0.26.
The steps to reproduce it are as follows.
=====
The cluster is composed of tomcat1 and tomcat2.
(The Transport className is
org.apache.catalina.tribes.transport.nio.PooledParallelSender.
I think PooledMultiSender probably behaves the same way.)
Tomcat2 is stopped during session replication.
As a result, session replication fails and a ChannelException is thrown.
Tomcat2 is restarted.
Session replication is attempted again.
As a result, the following exception is thrown.
org.apache.catalina.tribes.ChannelException: Sender not connected.; No faulty
members identified.
=====
The cause is
http://svn.apache.org/viewvc?view=revision&revision=908741
With this fix, the sender is disconnected whenever replication fails.
The disconnect method in PooledParallelSender is as follows.
===
public synchronized void disconnect() {
    this.connected = false;
    super.disconnect();
}
===
this.connected is set to false, and super.disconnect() is called.
In super.disconnect(), the queue is closed.
As far as I can tell,
once connected is set to false, it never becomes true again,
and
once the queue is closed, it is never reopened.
Only ReplicationTransmitter#start can set connected back to true.
The same applies to reopening the queue.
As a result,
once replication fails, replication can never be done again.
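The sequence described above can be illustrated with a small toy model. This is only a sketch: ToyPooledSender and ToyPooledSenderDemo are hypothetical names, not Tomcat classes, and connect() merely stands in for the start path (ReplicationTransmitter#start in this report). The peerUp flag is an assumption used to simulate tomcat2 being down.

```java
// Toy model only. ToyPooledSender is hypothetical and is NOT Tomcat's PooledParallelSender.
class ToyPooledSender {
    private boolean connected = false;
    private boolean queueOpen = false;

    // Stands in for the one-time start path; nothing else ever calls it.
    synchronized void connect() {
        connected = true;
        queueOpen = true;
    }

    // Mirrors the disconnect() quoted in the report: flag cleared, queue closed.
    synchronized void disconnect() {
        connected = false;
        queueOpen = false;
    }

    // peerUp simulates whether the remote node (tomcat2) is reachable.
    synchronized void sendMessage(String msg, boolean peerUp) {
        if (!connected || !queueOpen) {
            throw new IllegalStateException("Sender not connected.");
        }
        if (!peerUp) {
            // Behaviour attributed to r908741 in this report:
            // a failed send disconnects the whole sender.
            disconnect();
            throw new IllegalStateException("Send failed.");
        }
    }
}

class ToyPooledSenderDemo {
    // Returns "ok" on success, otherwise the exception message.
    static String trySend(ToyPooledSender sender, boolean peerUp) {
        try {
            sender.sendMessage("session-delta", peerUp);
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        ToyPooledSender sender = new ToyPooledSender();
        sender.connect();
        System.out.println(trySend(sender, true));   // peer up
        System.out.println(trySend(sender, false));  // peer stopped mid-replication
        System.out.println(trySend(sender, true));   // peer restarted
    }
}
```

In this model the third send fails with "Sender not connected." even though the peer is back, because nothing on the send path ever sets the flag to true or reopens the queue.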
I do not know why r908741 was applied.
However, once a ChannelException is thrown, every Sender becomes
unusable.
This is not good.
Can r908741 be reverted?
If that is not possible, what was the reason for r908741?
Best regards.
--
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org
DO NOT REPLY [Bug 48934] Cluster's regression. When replication
fails once, replication can be never done again.
Posted by bu...@apache.org.
https://issues.apache.org/bugzilla/show_bug.cgi?id=48934
--- Comment #1 from Filip Hanik <fh...@apache.org> 2010-03-18 13:56:35 UTC ---
Created an attachment (id=25146)
--> (https://issues.apache.org/bugzilla/attachment.cgi?id=25146)
Bug fix
Dear Fujino, as always you are right. The intended fix was to close sockets
that were potentially left in a CLOSE_WAIT state when something went wrong. But
instead of closing the actual sender that holds the TCP sockets, I accidentally
closed the entire sender system.
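The distinction drawn in this comment, closing only the sender that holds the failed socket rather than the whole sender system, can be sketched with a toy pool. This is an illustration under assumptions, not the attached patch (id=25146): ToyConnection and ToySenderPool are hypothetical names, and peerUp simulates whether the remote node is reachable.

```java
// Toy model only. These classes are hypothetical and do not reproduce the attached patch.
class ToyConnection {
    private boolean open = true;

    void write(String msg, boolean peerUp) {
        if (!open) {
            throw new IllegalStateException("Connection closed.");
        }
        if (!peerUp) {
            throw new IllegalStateException("Send failed.");
        }
    }

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

class ToySenderPool {
    private final java.util.ArrayDeque<ToyConnection> idle = new java.util.ArrayDeque<>();
    private final boolean connected = true; // the pool itself is never disconnected on a send failure
    int created = 0;                        // number of connections opened so far

    private ToyConnection newConnection() {
        created++;
        return new ToyConnection();
    }

    synchronized void send(String msg, boolean peerUp) {
        if (!connected) {
            throw new IllegalStateException("Sender not connected.");
        }
        ToyConnection c = idle.isEmpty() ? newConnection() : idle.poll();
        try {
            c.write(msg, peerUp);
            idle.push(c); // healthy connection goes back to the pool
        } catch (IllegalStateException e) {
            c.close();    // close only the failed connection; it is not returned to the pool
            throw e;
        }
    }
}

class ToySenderPoolDemo {
    // Returns "ok" on success, otherwise the exception message.
    static String trySend(ToySenderPool pool, boolean peerUp) {
        try {
            pool.send("session-delta", peerUp);
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        ToySenderPool pool = new ToySenderPool();
        System.out.println(trySend(pool, true));   // peer up
        System.out.println(trySend(pool, false));  // peer stopped
        System.out.println(trySend(pool, true));   // peer restarted: a fresh connection is opened
        System.out.println(pool.created);
    }
}
```

Here a failed send discards one connection, so no socket is left half-open, while the pool stays usable and opens a fresh connection on the next send.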