Posted to server-user@james.apache.org by Marcello Marangio <m....@tno.it> on 2007/02/26 16:21:35 UTC

POP3 problem in a HA environment

Hi

There could be a problem with James 2.2.0 in a High Availability
environment.

We set up 2 james instances to work at the same time on the same database
(using a failover connection string).

A load balancer checks every few seconds whether the POP3 server is
alive, but it does not close the POP3 session properly.
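
A probe that ends the session cleanly would read the server greeting, send QUIT, and wait for the reply before closing the socket. The following is only a minimal Java sketch of such a check, not the load balancer's actual mechanism; the class name Pop3HealthCheck, the method isAlive, and the timeout values are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical health probe; names and parameters are illustrative only. */
public class Pop3HealthCheck {

    /** Returns true if the server greets with +OK and answers QUIT with +OK. */
    public static boolean isAlive(String host, int port, int timeoutMillis) {
        // try-with-resources closes the socket normally on every path
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            socket.setSoTimeout(timeoutMillis); // do not block forever on reads

            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            OutputStream out = socket.getOutputStream();

            // A POP3 server sends a one-line greeting as soon as the client connects
            String greeting = in.readLine();
            if (greeting == null || !greeting.startsWith("+OK")) {
                return false;
            }

            // End the session explicitly instead of just dropping the connection
            out.write("QUIT\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            String reply = in.readLine();
            return reply != null && reply.startsWith("+OK");
        } catch (IOException e) {
            return false; // refused, timed out, or reset: treat as not alive
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 110;
        System.out.println(isAlive(host, port, 5000) ? "UP" : "DOWN");
    }
}
```

Whether a given load balancer can be configured to run a check like this depends on the product, so this is meant as a reference for what a well-behaved probe does on the wire.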

Every time the check runs, the pop3server log shows the error message
"Connection reset".

 

This is the stacktrace:

 

26/02/07 14:43:27 ERROR pop3server: Exception during connection from <ip_address> (<ip_address>) : Connection reset
java.net.SocketException: Connection reset
        at java.net.SocketInputStream.read(SocketInputStream.java:168)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:220)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:277)
        at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:408)
        at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:450)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:182)
        at java.io.InputStreamReader.read(InputStreamReader.java:167)
        at java.io.BufferedReader.fill(BufferedReader.java:136)
        at java.io.BufferedReader.read(BufferedReader.java:157)
        at org.apache.james.util.CRLFTerminatedReader.readLine(CRLFTerminatedReader.java:98)
        at org.apache.james.pop3server.POP3Handler.readCommandLine(POP3Handler.java:415)
        at org.apache.james.pop3server.POP3Handler.handleConnection(POP3Handler.java:266)
        at org.apache.james.util.connection.ServerConnection$ClientConnectionRunner.run(ServerConnection.java:417)
        at org.apache.james.util.thread.ExecutableRunnable.execute(ExecutableRunnable.java:55)
        at org.apache.james.util.thread.WorkerThread.run(WorkerThread.java:90)

 

 

We thought it was harmless, but after a while (more or less an hour) the
system crashes.

At the moment the problem happens even if there is only one james instance
up and running, i.e. it's not a concurrency problem.

 

Any help will be REALLY appreciated.

Thanks

Marcello


Re: R: POP3 problem in a HA environment

Posted by Michael Weissenbacher <mw...@dermichi.com>.
Hi,
> I am afraid that a simple Java program that performs very fast socket
> open/reset operations could crash James, which would be a serious bug.
You haven't told us about your environment. On Java production servers
under Linux with many open sockets I usually have to raise the nofile
limit in /etc/security/limits.conf like this:
*               hard    nofile  100000

This is because every socket counts as an open file and the default 
(1024) is pretty low for such situations.
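
To see how close a JVM is to that limit, the descriptor counts can be read from inside the process. The following is a minimal Java sketch, assuming a Unix-like system and a JVM that exposes com.sun.management.UnixOperatingSystemMXBean; the class name FdUsage is hypothetical. It reports on the JVM it runs in, so to inspect James itself the same values would have to be read inside the James JVM or over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper; reports descriptor usage of the current JVM only. */
public class FdUsage {

    /**
     * Returns {open, max} file descriptor counts for this JVM,
     * or null when the platform bean does not expose them (e.g. on Windows).
     */
    public static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts == null) {
            System.out.println("Descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

An open count that keeps climbing towards the maximum while the load balancer check is running would point to leaked sockets rather than a limit that is merely too low.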

hth,
Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: server-user-unsubscribe@james.apache.org
For additional commands, e-mail: server-user-help@james.apache.org


R: POP3 problem in a HA environment

Posted by Marcello Marangio <m....@tno.it>.
Hi
A bit of news.
We increased the load balancer's check interval from 30 secs to 180 secs
and the system has now been up and running for 48 hours.
It seems that the load balancer's "is alive" check is a simple socket
open/reset operation on port 110 (hence the "Connection reset"
exception on the POP3 server), so it seems that the system crashes after
a number of socket resets in close succession. I guess that if the interval
is long enough the system actually reclaims the resources correctly, as
Danny said.

I am afraid that a simple Java program that performs very fast socket
open/reset operations could crash James, which would be a serious bug.

If no one has run a test like that, I'll try it on James 2.2.0 and on
James 2.3 and I'll keep the mailing list informed.
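
A test of that kind could look roughly like the following Java sketch. It is an illustration only and has not been run against James; the class name ResetStress, the method hammer, and the default values are hypothetical. It relies on SO_LINGER with a zero timeout, which makes close() abort the connection (typically with a TCP RST) instead of closing it gracefully.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical stress client; names and defaults are illustrative only. */
public class ResetStress {

    /**
     * Opens 'count' connections one after another and aborts each of them.
     * Returns the number of connections that were established.
     */
    public static int hammer(String host, int port, int count, long pauseMillis)
            throws InterruptedException {
        int connected = 0;
        for (int i = 0; i < count; i++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 2000);
                connected++;
                // Linger enabled with timeout 0: the implicit close() at the end
                // of this block discards the connection abruptly, no QUIT is sent
                socket.setSoLinger(true, 0);
            } catch (IOException e) {
                System.err.println("attempt " + i + " failed: " + e.getMessage());
            }
            if (pauseMillis > 0) {
                Thread.sleep(pauseMillis);
            }
        }
        return connected;
    }

    public static void main(String[] args) throws InterruptedException {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 110;
        int count = args.length > 2 ? Integer.parseInt(args[2]) : 1000;
        int ok = hammer(host, port, count, 10);
        System.out.println(ok + " of " + count + " connections established");
    }
}
```

Running it with different pause values against a test instance would show whether the interval between resets matters, which is what the 30 secs versus 180 secs observation suggests.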
Cheers
Marcello

> From: danny.angus@gmail.com [mailto:danny.angus@gmail.com]
> Sent: Monday, 26 February 2007 16:32
> 
> On 2/26/07, Marcello Marangio <m....@tno.it> wrote:
> 
> > We thought it was harmless, but after a while (more or less an hour) the
> > system crashes.
> 
> I'm surprised by this, it *should* be the server giving up on a
> hanging connection and should be reclaiming resources not leaking
> them.
> 
> I don't know if I can reproduce it, but you *should* raise a defect in
> JIRA about this:
> 
> http://issues.apache.org/jira/browse/JAMES
> 
> d.
> 
> >
> > At the moment the problem happens even if there is only one james instance
> > up and running, i.e. it's not a concurrency problem.
> >
> >
> >
> > Any help will be REALLY appreciated.
> >
> > Thanks
> >
> > Marcello
> >




Re: POP3 problem in a HA environment

Posted by Danny Angus <da...@apache.org>.
On 2/26/07, Marcello Marangio <m....@tno.it> wrote:

> We thought it was harmless, but after a while (more or less an hour) the
> system crashes.

I'm surprised by this, it *should* be the server giving up on a
hanging connection and should be reclaiming resources not leaking
them.

I don't know if I can reproduce it, but you *should* raise a defect in
JIRA about this:

http://issues.apache.org/jira/browse/JAMES

d.

>
> At the moment the problem happens even if there is only one james instance
> up and running, i.e. it's not a concurrency problem.
>
>
>
> Any help will be REALLY appreciated.
>
> Thanks
>
> Marcello
>
>
