Posted to server-user@james.apache.org by "rafael.munoz" <ra...@gmail.com> on 2009/03/26 12:14:06 UTC

Bottleneck on James spool storing code

Hello

I have been doing some stress and performance tests on James 2.3.1 and I
think I have found a bottleneck in the James spool storing code (I am using
the filesystem spool).

I have configured James to behave as a simple SMTP server and do almost
nothing more than receive mails and store them in the spool ("<mailet
match="All" class="Null"/>"). When stressing James (sending 10 messages per
second) I see the mail handling time grow higher and higher, reaching
5, 6 and even 10 seconds instead of the 100 ms average that I get when
James is not under stress. I am measuring these times on the SMTP client, so
they run from the start of the SMTP session to the end of it (in
James this corresponds to the time until the mail is stored in the spool, if
I am not mistaken) and do not include the mail processing time.

After some digging I found that James was spending all of its time in the
File_Persistent_Stream_Repository.put method. All the input threads spend
most of their time (almost 99%) waiting to enter that method, because it is
synchronized. I do not understand why this method is synchronized: it
does not access any shared resources or anything. Any explanation?
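
The waiting described above can be reproduced in isolation. The following is
a minimal sketch, not James code: SynchronizedPutDemo is a hypothetical
stand-in for a repository with a synchronized put(), and the 5 ms sleep is an
assumed placeholder for the disk write. It records how many threads are ever
inside put() at the same time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a repository whose put() is synchronized. */
class SynchronizedPutDemo {
    private final AtomicInteger inside = new AtomicInteger();
    private final AtomicInteger maxInside = new AtomicInteger();

    // Only one thread at a time can hold this object's monitor,
    // so every other caller blocks here until the current one leaves.
    public synchronized void put(String key) {
        int now = inside.incrementAndGet();
        maxInside.accumulateAndGet(now, Math::max);
        try {
            Thread.sleep(5); // assumed cost of the simulated write
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        inside.decrementAndGet();
    }

    /** Starts the given number of callers at once; returns the peak count seen inside put(). */
    static int peakConcurrency(int threads) {
        SynchronizedPutDemo repo = new SynchronizedPutDemo();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final String key = "mail-" + i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers together
                    repo.put(key);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return repo.maxInside.get();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        int peak = peakConcurrency(20);
        long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
        System.out.println("peak threads inside put(): " + peak
                + ", elapsed ms: " + elapsedMs);
    }
}
```

Because the method is synchronized on the repository instance, the peak is
always 1 and the total time grows with the number of callers, which matches
the picture of input threads queueing in front of put().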

Just for testing, I removed the synchronized keyword from that method and
several others (e.g. File_Persistent_Object_Repository.put). I got a very
small performance boost, but not much, because in that case the
FileOutputStream object created inside the
File_Persistent_Stream_Repository.put method starts taking a lot of time.

So, summarizing:
1. Why are the File_Persistent_*_Repository methods synchronized?
2. Does anyone know why the FileOutputStream object creation takes longer and
longer when James is under stress? The underlying OS is not reporting any
problems with the filesystem or the file descriptors.

Any ideas will be more than welcome. This bottleneck is really hurting our
performance. We can afford some poor performance after the spool but
not before it.

Details:
- Solaris 8
- JDK 6.0u12 (500M RAM heap)
- Sun-Fire-V240 (1 CPU)
- Input throughput: 10 messages per second
- James config:
----- SmtpServer.connectionTimeout = 30000 
----- SmtpServer.connectionLimit = 150
----- thread-group.max-threads = 200
- Input threads busy with 10 messages per second ---> around 60 threads

regards and thanks in advance (and sorry for the long long post ;) ),
Rafael Munoz
-- 
View this message in context: http://www.nabble.com/Bottleneck-on-James-spool-storing-code-tp22719950p22719950.html
Sent from the James - Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: server-user-unsubscribe@james.apache.org
For additional commands, e-mail: server-user-help@james.apache.org


Re: Bottleneck on James spool storing code

Posted by "rafael.munoz" <ra...@gmail.com>.
Hi

So none of the developers has any idea why these
File_Persistent_Stream_Repository methods are synchronized?

regards,
rafa




Re: Bottleneck on James spool storing code

Posted by Josip Almasi <jo...@vrspace.org>.
rafael.munoz wrote:
> Hello
> 
> 
> Josip Almasi wrote:
>> rafael.munoz wrote:
>>> Hello
>>>
>>> I have been doing some stress and performance tests on James 2.3.1 and I
>>> think I have found a bottleneck in the James spool storing code (I am
>>> using the filesystem spool).
>>>
>>> I have configured James to behave as a simple SMTP server and do almost
>>> nothing more than receive mails and store them in the spool ("<mailet
>>> match="All" class="Null"/>").
>> Eh, Null?
>>
> 
> I was only measuring James input, so I was just destroying every incoming
> message after retrieving it from the spool. 'Null' refers to the
> NullMailet (http://james.apache.org/mailet/standard/mailet-report.html#Null)

Well, I know, but then maybe those complexity things I said don't apply
here; they generally apply to directories with a large number of files.

> 
> 
> Josip Almasi wrote:
>> ...
>>> So, summarizing:
>> ...
>>> 2. Does anyone know why the FileOutputStream object creation takes longer
>>> and longer when James is under stress? The underlying OS is not reporting
>>> any problems with the filesystem or the file descriptors.
>> Most filesystems store directory entries as lists. Lists are read each
>> time from the beginning, so to access the Nth entry you'll access N-1
>> entries, IOW N*(N-1)/2 reads in total for N entries.
>> In fact, AFAIK the only FS that doesn't do that is ReiserFS, and I doubt
>> you can get that on Solaris, so better switch to database storage.
>> Databases use balanced trees, meaning IIRC max N*log(N), avg log(N)
>> complexity.
> 
> Umm... interesting, I didn't know that. I will check the number of entries
> in the spool directory when I start to get huge FileOutputStream
> creation times (which, as you imply, are almost surely linked to huge file
> creation times on the filesystem). As for the database suggestion, I am
> afraid it is not an option in our application :(.
> 
> Thanks for your answer!

Welcome.
Furthermore, if you use the FS, JAMES has to keep a list (or map) of messages
in memory. For lists and treemaps, the above complexity applies.
Plus, memory usage... spammers kindly provided me with an even better
stress test in production :> Check this thread:
http://marc.info/?l=james-user&m=121491652506688&w=2

Regards...




Re: Bottleneck on James spool storing code

Posted by "rafael.munoz" <ra...@gmail.com>.
Hello


Josip Almasi wrote:
> 
> rafael.munoz wrote:
>> Hello
>> 
>> I have been doing some stress and performance tests on James 2.3.1 and I
>> think I have found a bottleneck in the James spool storing code (I am
>> using the filesystem spool).
>>
>> I have configured James to behave as a simple SMTP server and do almost
>> nothing more than receive mails and store them in the spool ("<mailet
>> match="All" class="Null"/>").
> 
> Eh, Null?
> 

I was only measuring James input, so I was just destroying every incoming
message after retrieving it from the spool. 'Null' refers to the
NullMailet (http://james.apache.org/mailet/standard/mailet-report.html#Null)


Josip Almasi wrote:
> 
> ...
>> So, summarizing:
> ...
>> 2. Does anyone know why the FileOutputStream object creation takes longer
>> and longer when James is under stress? The underlying OS is not reporting
>> any problems with the filesystem or the file descriptors.
>
> Most filesystems store directory entries as lists. Lists are read each
> time from the beginning, so to access the Nth entry you'll access N-1
> entries, IOW N*(N-1)/2 reads in total for N entries.
> In fact, AFAIK the only FS that doesn't do that is ReiserFS, and I doubt
> you can get that on Solaris, so better switch to database storage.
> Databases use balanced trees, meaning IIRC max N*log(N), avg log(N)
> complexity.
> 
> 

Umm... interesting, I didn't know that. I will check the number of entries
in the spool directory when I start to get huge FileOutputStream
creation times (which, as you imply, are almost surely linked to huge file
creation times on the filesystem). As for the database suggestion, I am
afraid it is not an option in our application :(.
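
One way to run that check outside James is a small harness that fills a
directory and reports the average FileOutputStream creation time per batch.
This is only a sketch under stated assumptions: the class SpoolCreateTiming,
its method names, the "msg-N.tmp" file names and the batch sizes are all
hypothetical, and it uses a temporary directory rather than the real spool.
It also uses try-with-resources, so it needs a newer JDK than the one in the
test setup above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical harness: times file creation as a directory fills up. */
class SpoolCreateTiming {

    /** Creates a fresh temporary directory to stand in for the spool. */
    static File newTempDir() {
        try {
            return Files.createTempDirectory("spool-test").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Creates batches * batchSize one-byte files in dir and returns the
     * average creation time of each batch, in nanoseconds per file.
     */
    static long[] timeCreates(File dir, int batches, int batchSize) {
        long[] avgNanos = new long[batches];
        int n = 0;
        for (int b = 0; b < batches; b++) {
            long t0 = System.nanoTime();
            for (int i = 0; i < batchSize; i++) {
                File f = new File(dir, "msg-" + (n++) + ".tmp");
                // Opening the stream is the step reported as slow in the thread.
                try (FileOutputStream out = new FileOutputStream(f)) {
                    out.write(0);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
            avgNanos[b] = (System.nanoTime() - t0) / batchSize;
        }
        return avgNanos;
    }

    /** Number of entries currently in dir (0 if it cannot be listed). */
    static int countEntries(File dir) {
        String[] names = dir.list();
        return names == null ? 0 : names.length;
    }

    public static void main(String[] args) {
        File dir = newTempDir();
        long[] avg = timeCreates(dir, 10, 1000);
        for (int b = 0; b < avg.length; b++) {
            System.out.println("entries so far: " + ((b + 1) * 1000)
                    + ", avg create ns: " + avg[b]);
        }
        System.out.println("final entry count: " + countEntries(dir));
    }
}
```

If the per-batch average climbs as the entry count grows, the directory size
is a likely suspect; if it stays flat, the slowdown probably comes from
somewhere else.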

Thanks for your answer!

regards,
Rafael Munoz



Re: Bottleneck on James spool storing code

Posted by Josip Almasi <jo...@vrspace.org>.
rafael.munoz wrote:
> Hello
> 
> I have been doing some stress and performance tests on James 2.3.1 and I
> think I have found a bottleneck in the James spool storing code (I am using
> the filesystem spool).
>
> I have configured James to behave as a simple SMTP server and do almost
> nothing more than receive mails and store them in the spool ("<mailet
> match="All" class="Null"/>").

Eh, Null?

...
> So, summarizing:
...
> 2. Does anyone know why the FileOutputStream object creation takes longer and
> longer when James is under stress? The underlying OS is not reporting any
> problems with the filesystem or the file descriptors.

Most filesystems store directory entries as lists. Lists are read each
time from the beginning, so to access the Nth entry you'll access N-1 entries,
IOW N*(N-1)/2 reads in total for N entries.
In fact, AFAIK the only FS that doesn't do that is ReiserFS, and I doubt you
can get that on Solaris, so better switch to database storage.
Databases use balanced trees, meaning IIRC max N*log(N), avg log(N)
complexity.
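
The N*(N-1)/2 figure can be checked with a toy model. The sketch below is an
illustration only: LinearDirectoryCost is a hypothetical class that models a
directory as an unsorted in-memory list and scans every existing entry before
appending a new name. It makes no claim about how any particular filesystem
is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a directory kept as an unsorted list of names. */
class LinearDirectoryCost {

    /** Closed form from the explanation above: N*(N-1)/2. */
    static long closedForm(long n) {
        return n * (n - 1) / 2;
    }

    /**
     * Adds n distinct names one by one. Before each add, the whole list is
     * scanned from the beginning (as a duplicate check would do).
     * Returns the total number of entry comparisons performed.
     */
    static long simulate(int n) {
        List<String> entries = new ArrayList<>();
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            String name = "msg-" + i;
            for (String existing : entries) {
                comparisons++;
                if (existing.equals(name)) {
                    break; // not reached here, because all names are distinct
                }
            }
            entries.add(name);
        }
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 1_000, 10_000}) {
            System.out.println(n + " files: " + simulate(n)
                    + " comparisons, closed form " + closedForm(n));
        }
    }
}
```

Creating the k-th file scans the k-1 entries already present, so the total
for N files is 0 + 1 + ... + (N-1) = N*(N-1)/2. For 1,000 files that is
499,500 comparisons, and ten times as many files cost roughly a hundred
times as much.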

Regards...

