Posted to server-user@james.apache.org by Diego Castillo <di...@inexbee.com> on 2003/02/12 15:51:00 UTC

RE : RE : Really slow spooling again... :(

Hi Noel,

Thanks for the autoreconnect=true tip. I will perform further load tests
tonight.

Your test of starting James with 200K messages in the spool is very
interesting. I have run similar tests myself (10K messages, 100MB spool
table) without much success. The initial query that James sends to the
database took too long (>60s) and James never consumed the queue. I
experienced this with both MySQL and Oracle.

I thought about a possible workaround which I have never had the time to
test. Maybe you can tell me if it makes any sense. It consists of
appending a LIMIT xxx clause to the listMessagesSQL query in
sqlResources.xml. That way, MySQL does not send back a huge ResultSet
that causes the JDBC connection to expire (how do you configure the 60s
timeout?). Would this work?
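
A minimal sketch of the LIMIT idea in Java. The class and method names
are hypothetical, and the query used in the usage line is an invented
placeholder rather than the real listMessagesSQL text. The clause is
MySQL-style syntax, so this only illustrates the MySQL case.

```java
/** Hypothetical helper: appends a MySQL-style LIMIT clause to a query string. */
final class LimitClause {
    private LimitClause() {}

    static String withLimit(String sql, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive: " + maxRows);
        }
        String trimmed = sql.trim();
        // Drop a trailing semicolon so the clause lands inside the statement.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed + " LIMIT " + maxRows;
    }
}
```

For example, LimitClause.withLimit("SELECT message_name FROM spool WHERE
repository_name = ?", 1000) appends " LIMIT 1000" to the placeholder
query. The helper does not check whether the query already has a LIMIT
clause.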

Regards,


Diego

-----Original Message-----
From: Noel J. Bergman [mailto:noel@devtech.com]
Sent: Tuesday, February 11, 2003 17:35
To: James Users List
Subject: RE: RE : Really slow spooling again... :(

Diego,

At this time, there are no plans to replace mordred in James v2.1.  Those
are some of the things that will be part of James v3.

I spent quite a bit of time with mordred yesterday.  One thing that I found
was that it really helps to add autoreconnect=true to the MySQL URL.  I had
a spool with over 200,000 items in it.  Running James with no SMTP
connections, so only the spool is using JDBC, I could not run for long
before getting exceptions.  With the auto-reconnect support, I was able to
process the entire spool at a rate of roughly 1667 messages per minute
without changing mordred at all.
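
As a rough illustration of adding such a parameter to a JDBC URL, here
is a small Java sketch. The class and method names are hypothetical and
the URLs in the usage note are placeholders. The parameter name is
passed in by the caller: the message above writes it as autoreconnect,
while the Connector/J documentation spells it autoReconnect.

```java
/** Hypothetical helper: adds one key=value parameter to a JDBC URL string. */
final class JdbcUrlParams {
    private JdbcUrlParams() {}

    static String withParam(String url, String key, String value) {
        // A URL that already ends in a separator needs no extra one.
        if (url.endsWith("?") || url.endsWith("&")) {
            return url + key + "=" + value;
        }
        // Use '?' for the first parameter and '&' for any later one.
        char separator = url.indexOf('?') >= 0 ? '&' : '?';
        return url + separator + key + "=" + value;
    }
}
```

Calling JdbcUrlParams.withParam("jdbc:mysql://127.0.0.1/mail",
"autoReconnect", "true") gives
"jdbc:mysql://127.0.0.1/mail?autoReconnect=true". The helper does no
URL encoding, so it only suits simple keys and values like this one.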

I would like to see spool processing much faster than that, but where I am
at in my testing is this: if I don't rate limit postal, many of the messages
accumulate in the queue.  If I rate limit postal to 600 messages per
minute, I can run unimpeded for hours with 5% free CPU.  Since normally I
have 0% free CPU during a load test, I am figuring that for this system, 600
MPM is about the limit.  That is with the spooler performing work.  With the
spooler discarding messages at the top of root, I can go about 1600 MPM, and
with the spooler disabled, I seem to recall about 3000 MPM.  So there appears
to be considerable room for improvement in the spooler.

I was planning to make some tweaks to mordred, and perhaps there are some
changes we could make to the table structure, but right now the biggest
improvement in reliability for me over the stock config.xml was to add the
autoreconnect=true parameter to the MySQL URL.

Ultimately, I suspect that we'll have to revisit the spooling mechanism, and
possibly do some rate limiting on the incoming connection based upon the
spooler.

	--- Noel

-----Original Message-----
From: Diego Castillo [mailto:diego.castillo@inexbee.com]
Sent: Tuesday, February 11, 2003 4:30
To: 'James Users List'
Cc: Diego Castillo, INEXBEE
Subject: RE : Really slow spooling again... :(


Hi Noel,

I have seen your mail on Mordred and some issues with the JDBC spool,
which you seem to have corrected in the latest build.

I am also experiencing problems under heavy load (more than 5 mails per
second). MySQL gets tired and starts responding slowly. This slows James
down, so the spool queue grows and MySQL gets even slower. At some point
it takes a really long time (>60s) to take a mail out of the spool, and
Mordred fires a DB connection timeout. Then it hits what looks to me
like a deadlock and James hangs (see stack trace below). I need to
restart James and OPTIMIZE TABLE spool, following Danny's advice.

Is this the issue you have corrected in 2.1.1a7? I am using 2.1. I am
looking forward to searchable datasources via JNDI, which Serge seems to
have added to HEAD. Will this be part of the 2.1.1 final release?

Regards,


Diego


*** Stack trace ***

database-connections.maildb: java.lang.Throwable
   at org.apache.james.util.mordred.PoolConnEntry.lock(PoolConnEntry.java:84)
   at org.apache.james.util.mordred.JdbcDataSource.getConnection(JdbcDataSource.java:127)
   at org.apache.james.mailrepository.JDBCSpoolRepository.loadPendingMessages(JDBCSpoolRepository.java:228)
   at org.apache.james.mailrepository.JDBCSpoolRepository.getNextPendingMessage(JDBCSpoolRepository.java:203)
   at org.apache.james.mailrepository.JDBCSpoolRepository.accept(JDBCSpoolRepository.java:137)
   at org.apache.james.transport.mailets.RemoteDelivery.run(RemoteDelivery.java:577)
   at java.lang.Thread.run(Thread.java:536)

-----Original Message-----
From: Noel J. Bergman [mailto:noel@devtech.com]
Sent: Tuesday, February 11, 2003 01:21
To: James Users List
Subject: RE: Really slow spooling again... :(

Kenny,

Try adding some additional threads to RemoteDelivery, but unless you also
configured RemoteDelivery's output spool for JDBC, do *not* do it unless you
are using v2.1.1a7 or later, due to a synchronization error that I found and
fixed.

Yes, the record at openrbl.org doesn't look good for you.  Are you a colo,
or do you need a host?

	--- Noel

-----Original Message-----
From: Kenny Smith [mailto:jakarta-james@journalscape.com]
Sent: Monday, February 10, 2003 18:31
To: James Users List
Subject: Re: Really slow spooling again... :(


Hi Noel et al,

It appears that connections from my server are getting denied by hotmail
and yahoo (my ISP just got labeled as a spammer... I'm trying to find a
new ISP), so I'm getting a lot of timeouts and a lot of retries. I can
only assume that it's just going through the spool really slowly because
it takes a long time to timeout. :/

Me need new ISP. *sigh*

Kenny

Noel J. Bergman wrote:

> Kenny,
>
> I have run tests most nights lately, each about 1 million messages.  They
> run through the spooler until they encounter a matcher that discards them
> all.  The tests average about 1400 messages per minute on the 400 MHz
> Celeron test bed.
>
> The test basically exercises the heck out of the SMTP handler and the
> spooler.  The spooler is configured in mysql as:
>
>
>
>
>
> The data-source is the standard entry for MySQL, except that the
> database is "test" on that machine.
>
> What database (type and version) are you using?  One thing that I found a
> while ago was that on very rare occasions, I could go into MySQL, do a
> manual SQL query, and notice an excessive amount of time (let's say 1
> second instead of 0.006).  Restarting MySQL (and James because of mordred)
> cleared that up.
>
> The fact that they are in transport is insufficient.  What you need to
> do is see where in the processor they are.  Look for entries of the kind:
>
>  Checking  with
>  Servicing  by
>
> Those should tell us specifically where in the transport the message has
> gotten, and the timestamps will tell us how long it is taking to move
> through the spool.
>
> 	--- Noel
>
> -----Original Message-----
> From: Kenny Smith [mailto:jakarta-james@journalscape.com]
> Sent: Monday, February 10, 2003 13:10
> To: james-user@jakarta.apache.org
> Subject: Really slow spooling again... :(
>
> Hi all,
>
> I'm having all of my mail back up in the spool tables (using JDBC) with a
> message_state of 'transport'. I've turned up my logging to DEBUG, but
> I'm not seeing anything in the logs that looks relevant to slow
> performance. I sent 230 emails 2 hours ago and there are still 190 in the
> spool/transport waiting to get sent.
>
> I saw the recent conversation about indexes on the spool table causing
> slowdowns after a while, so I stopped James, dumped the table, removed
> the indices, put the table back in and started James up. Nothing
> appears to have changed.
>
> I'm running James 2.1 with jdk1.4 on Solaris.
>
> Any help is appreciated.
>
> Kenny Smith
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: james-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: james-user-help@jakarta.apache.org
>
>



RE: Improving JDBC Spool responsiveness

Posted by Danny Angus <da...@apache.org>.
I like this, particularly if it helps.

> -----Original Message-----
> From: Noel J. Bergman [mailto:noel@devtech.com]
> Sent: 14 February 2003 07:25
> To: James Developers List
> Subject: RE: Improving JDBC Spool responsiveness
> 
> 
> Serge,
> 
> Yes, using Statement.setMaxRows(int max) is probably the right 
> thing to do.
> In the MySQL Connector/J drivers it costs an extra message to the server,
> I'm not sure if PostgreSQL handles it at the server or the client, and who
> knows what other drivers do, but setMaxRows is standard in the API, and
> LIMIT isn't supported in the query language by at least Oracle and SQL
> Server.
> 
> Thanks.  Good point.
> 
> What would you think of changing the spool config in config.xml as:
> 
>   <config>
>     <sqlFile>file://conf/sqlResources.xml</sqlFile>
>     <filestore>file://var/dbmail</filestore>
>     <limit> N </limit>
>   </config>
> 
> "Limit" could be renamed cachesize, maxrows, prefetch, or
> whateverwewanttocallit, but the value would be kept and used in
> loadPendingMessages():
> 
>   listMessages = conn.prepareStatement(
>       sqlQueries.getSqlString("listMessagesSQL", true));
>   listMessages.setString(1, repositoryName);
>   listMessages.setMaxRows(limit);
>   rsListMessages = listMessages.executeQuery();
>
>   while (rsListMessages.next() && pendingMessages.size() < limit && ...) {...}
> 
> Thoughts?
> 
> 	--- Noel
> 
> -----Original Message-----
> From: Serge Knystautas [mailto:sergek@lokitech.com]
> Sent: Thursday, February 13, 2003 3:10
> To: James Developers List
> Subject: Re: Improving JDBC Spool responsiveness
> 
> 
> Noel J. Bergman wrote:
> >>It consists in appending to the listMessagesSQL query from
> >>sqlResources.xml a LIMIT xxx. This way, MySQL does not send
> >>back a huge ResultSet which makes the JDBC connection to expire
> >
> >
> > Good thought regarding LIMIT.  There is code in JDBCSpoolRepository for
> > limiting the number of messages loaded into an internal working set, but
> > the SQL query still has to generate the large result set.  There are two
> > places where we use listMessagesSQL:
> >
> >    JDBCMailRepository.list()
> >    JDBCSpoolRepository.loadPendingMessages()
> >
> > The former is only used by the POP3 handler when listing the contents of
> > the user's mailbox.  The latter is used internally to load the working
> > set.  I was thinking that perhaps it doesn't make sense to limit the list
> > given to the POP3 handler, although it would simply require the user to
> > clear out their messages before they could retrieve more of them.  Or I
> > could clone the listMessagesSQL to separate the queries, which is probably
> > a good idea for other reasons.
> 
> Another approach is to use Statement.setMaxRows(int max).  Many JDBC
> drivers can then figure out how to execute this so you achieve the goal
> of not returning all the records.
> 
> --
> Serge Knystautas
> President
> Lokitech >> software . strategy . design >> http://www.lokitech.com/
> p. 1.301.656.5501
> e. sergek@lokitech.com
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: james-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: james-dev-help@jakarta.apache.org
> 
> 

RE: Improving JDBC Spool responsiveness

Posted by "Noel J. Bergman" <no...@devtech.com>.
Serge,

Yes, using Statement.setMaxRows(int max) is probably the right thing to do.
In the MySQL Connector/J drivers it costs an extra message to the server,
I'm not sure if PostgreSQL handles it at the server or the client, and who
knows what other drivers do, but setMaxRows is standard in the API, and
LIMIT isn't supported in the query language by at least Oracle and SQL
Server.

Thanks.  Good point.

What would you think of changing the spool config in config.xml as:

  <config>
    <sqlFile>file://conf/sqlResources.xml</sqlFile>
    <filestore>file://var/dbmail</filestore>
    <limit> N </limit>
  </config>

"Limit" could be renamed cachesize, maxrows, prefetch, or
whateverwewanttocallit, but the value would be kept and used in
loadPendingMessages():

  listMessages = conn.prepareStatement(
      sqlQueries.getSqlString("listMessagesSQL", true));
  listMessages.setString(1, repositoryName);
  listMessages.setMaxRows(limit);
  rsListMessages = listMessages.executeQuery();

  while (rsListMessages.next() && pendingMessages.size() < limit && ...) {...}
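
The fragment above can be filled out into a self-contained method. This
is only a sketch under stated assumptions, not the James code: the class
and method names are hypothetical, the query is assumed to take the
repository name as its single parameter and to return the message key
in column 1, and try-with-resources needs a much newer Java than the
jdk1.4 mentioned earlier in this thread. The JDBC calls themselves
(prepareStatement, setString, setMaxRows, executeQuery) are standard.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a bounded spool load using Statement.setMaxRows. */
final class BoundedSpoolLoad {
    private BoundedSpoolLoad() {}

    static List<String> loadPending(Connection conn, String sql,
                                    String repositoryName, int limit)
            throws SQLException {
        // setMaxRows(0) means "no limit" in JDBC, so reject non-positive values.
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive: " + limit);
        }
        List<String> pending = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, repositoryName);
            // Ask the driver to cap the result set at 'limit' rows.
            stmt.setMaxRows(limit);
            try (ResultSet rs = stmt.executeQuery()) {
                // Check the size first so no row is fetched and then dropped,
                // even if a driver ignores the setMaxRows hint.
                while (pending.size() < limit && rs.next()) {
                    pending.add(rs.getString(1)); // assumed: key in column 1
                }
            }
        }
        return pending;
    }
}
```

The size check in the loop is kept alongside setMaxRows as a second
guard, since how the cap is enforced is left to each driver.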

Thoughts?

	--- Noel

-----Original Message-----
From: Serge Knystautas [mailto:sergek@lokitech.com]
Sent: Thursday, February 13, 2003 3:10
To: James Developers List
Subject: Re: Improving JDBC Spool responsiveness


Noel J. Bergman wrote:
>>It consists in appending to the listMessagesSQL query from
>>sqlResources.xml a LIMIT xxx. This way, MySQL does not send
>>back a huge ResultSet which makes the JDBC connection to expire
>
>
> Good thought regarding LIMIT.  There is code in JDBCSpoolRepository for
> limiting the number of messages loaded into an internal working set, but
> the SQL query still has to generate the large result set.  There are two
> places where we use listMessagesSQL:
>
>    JDBCMailRepository.list()
>    JDBCSpoolRepository.loadPendingMessages()
>
> The former is only used by the POP3 handler when listing the contents of
> the user's mailbox.  The latter is used internally to load the working
> set.  I was thinking that perhaps it doesn't make sense to limit the list
> given to the POP3 handler, although it would simply require the user to
> clear out their messages before they could retrieve more of them.  Or I
> could clone the listMessagesSQL to separate the queries, which is probably
> a good idea for other reasons.

Another approach is to use Statement.setMaxRows(int max).  Many JDBC
drivers can then figure out how to execute this so you achieve the goal
of not returning all the records.

--
Serge Knystautas
President
Lokitech >> software . strategy . design >> http://www.lokitech.com/
p. 1.301.656.5501
e. sergek@lokitech.com





Re: Improving JDBC Spool responsiveness

Posted by Serge Knystautas <se...@lokitech.com>.
Noel J. Bergman wrote:
>>It consists in appending to the listMessagesSQL query from
>>sqlResources.xml a LIMIT xxx. This way, MySQL does not send
>>back a huge ResultSet which makes the JDBC connection to expire
> 
> 
> Good thought regarding LIMIT.  There is code in JDBCSpoolRepository for
> limiting the number of messages loaded into an internal working set, but the
> SQL query still has to generate the large result set.  There are two places
> where we use listMessagesSQL:
> 
>    JDBCMailRepository.list()
>    JDBCSpoolRepository.loadPendingMessages()
> 
> The former is only used by the POP3 handler when listing the contents of the
> user's mailbox.  The latter is used internally to load the working set.  I
> was thinking that perhaps it doesn't make sense to limit the list given to
> the POP3 handler, although it would simply require the user to clear out
> their messages before they could retrieve more of them.  Or I could clone
> the listMessagesSQL to separate the queries, which is probably a good idea
> for other reasons.

Another approach is to use Statement.setMaxRows(int max).  Many JDBC 
drivers can then figure out how to execute this so you achieve the goal 
of not returning all the records.

-- 
Serge Knystautas
President
Lokitech >> software . strategy . design >> http://www.lokitech.com/
p. 1.301.656.5501
e. sergek@lokitech.com





Improving JDBC Spool responsiveness

Posted by "Noel J. Bergman" <no...@devtech.com>.
Diego,

> I thought about a possible roundabout which I have never had
> the time to test.

> It consists in appending to the listMessagesSQL query from
> sqlResources.xml a LIMIT xxx. This way, MySQL does not send
> back a huge ResultSet which makes the JDBC connection to expire

Good thought regarding LIMIT.  There is code in JDBCSpoolRepository for
limiting the number of messages loaded into an internal working set, but the
SQL query still has to generate the large result set.  There are two places
where we use listMessagesSQL:

   JDBCMailRepository.list()
   JDBCSpoolRepository.loadPendingMessages()

The former is only used by the POP3 handler when listing the contents of the
user's mailbox.  The latter is used internally to load the working set.  I
was thinking that perhaps it doesn't make sense to limit the list given to
the POP3 handler, although it would simply require the user to clear out
their messages before they could retrieve more of them.  Or I could clone
the listMessagesSQL to separate the queries, which is probably a good idea
for other reasons.

This is worth looking at for James v2.1.2.

	--- Noel

