Posted to users@spamassassin.apache.org by Kenneth Kim <kk...@yahoo.com> on 2006/03/28 03:38:23 UTC

socket SA is not fast enough, help

I've found that SpamAssassin will not return a score until I close
socket writing. Once I've closed the writing, in order to get a score
for the next message, I have to reopen the connection in PHP to send
another message to SA. I hope I'm wrong about this, but currently I'm
bottlenecking because I have to reopen the connection. Is there
any way for me to get a score w/o having to close socket writing?
Possibly a command I can send at the end of/after each message?


Re: spamd REPORT

Posted by David B Funk <db...@engineering.uiowa.edu>.
On Thu, 30 Mar 2006, mouss wrote:

> > Use the 'REPORT' or 'REPORT_IFSPAM' spamd command instead of 'SYMBOLS'
> > or 'PROCESS' to get the full score report but not the full message.
> >
>
> This requires parsing the message.
>
>
> I would like to get something like:
>
> ALL_TRUSTED=-1.44,MISSING_SUBJECT=1.345
>
> instead of (REPORT):
>
> blahblah
> ...
> ...
> Content analysis details:   (-0.1 points, 5.0 required)
>
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
> -1.4 ALL_TRUSTED            Passed through trusted hosts only via SMTP
>  1.3 MISSING_SUBJECT        Missing Subject: header
>
> or (SYMBOLS):
>
> ALL_TRUSTED,MISSING_SUBJECT
>
> PROCESS seems the answer. It's just unfortunate that it still echoes the
> whole message. Can I just close the socket once I get my headers?

Picky, picky. Some people want -everything- done for them.

If you read the SA Conf document ("Mail::SpamAssassin::Conf(3)")
you will find that there is a section that talks about all the
options for formatting the REPORT output. Pay particular attention
to the TAGS section (the stuff that looks like _BLAH_) and the report
template options.

It is left as an exercise for the student to find the appropriate
config file options to get an output that looks like:

 localhost$ spamc -R < /tmp/test2
 5.6/6.0
 Content analysis details:   (5.6 points, 6.0 required, autolearn=no)
 BAYES_44=-0.001,COMBINED_FROM=0.318,FVGT_TRIPWIRE_SJ=0.077,FVGT_TRIPWIRE_XF=0.077,FVGT_TRIPWIRE_XX=0.077,L_RCVD_IN_CBL=2.1,L_RCVD_IN_XBL=2.7,L_T_COMBINED=0.216

And if that "Content analysis details" part bothers you, you can even
get it down to:

 localhost$ spamc -R < /tmp/test2
 5.6/6.0
 BAYES_44=-0.001,COMBINED_FROM=0.318,FVGT_TRIPWIRE_SJ=0.077,FVGT_TRIPWIRE_XF=0.077,FV$

Hint: look at clear_report_template and _TESTSSCORES(,)_

Professor Dave.
(I -am- from a University ;)

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
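
For anyone reading that compact output from a script, here is a minimal
Python sketch of a parser for the two-line form (score/required on the
first line, NAME=score pairs on the second). The function name
parse_compact_report is hypothetical, and the template lines in the
comment are only an assumption about which Mail::SpamAssassin::Conf
options produce this output; nothing in this thread confirms them.

```python
# Assumed local.cf lines for the two-line output (an assumption, check
# Mail::SpamAssassin::Conf before relying on them):
#   clear_report_template
#   report _SCORE_/_REQD_
#   report _TESTSSCORES(,)_


def parse_compact_report(text):
    """Parse 'score/required' plus 'NAME=score,...' into Python values.

    Returns (score, required, tests) where tests maps rule name to score.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    score_str, required_str = lines[0].split("/", 1)

    tests = {}
    pairs_line = lines[1] if len(lines) > 1 else ""
    for pair in pairs_line.split(","):
        name, sep, value = pair.partition("=")
        if not sep:
            # Skip empty or truncated entries that have no "=score" part.
            continue
        tests[name] = float(value)

    return float(score_str), float(required_str), tests


if __name__ == "__main__":
    sample = "5.6/6.0\nBAYES_44=-0.001,COMBINED_FROM=0.318,L_RCVD_IN_CBL=2.1\n"
    print(parse_compact_report(sample))
```

Entries without an "=" (for example a line cut off by the terminal) are
skipped rather than guessed at.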

Re: socket SA is not fast enough, help

Posted by mouss <us...@free.fr>.
Thomas Hochstein wrote:
> mouss wrote:
> 
>>> Use the 'REPORT' or 'REPORT_IFSPAM' spamd command instead of 'SYMBOLS'
>>> or 'PROCESS' to get the full score report but not the full message.
>> This requires parsing the message.
>>
>> I would like to get something like:
>>
>> ALL_TRUSTED=-1.44,MISSING_SUBJECT=1.345
> 
> Why don't you reconfigure the report template to output exactly what
> you'd like to see?
> 
> 


Already done, following Matthew's recommendation. Thanks to all.

(I failed to see the relationship between SA reports and the spamd 
protocol!).

Re: socket SA is not fast enough, help

Posted by Thomas Hochstein <ml...@ancalagon.inka.de>.
mouss wrote:

>> Use the 'REPORT' or 'REPORT_IFSPAM' spamd command instead of 'SYMBOLS'
>> or 'PROCESS' to get the full score report but not the full message.
>
> This requires parsing the message.
>
> I would like to get something like:
>
> ALL_TRUSTED=-1.44,MISSING_SUBJECT=1.345

Why don't you reconfigure the report template to output exactly what
you'd like to see?

Re: socket SA is not fast enough, help

Posted by mouss <us...@free.fr>.
David B Funk wrote:
> On Tue, 28 Mar 2006, mouss wrote:
> 
>> Another thing is that I can't find a way to get the SA headers (as they
>> would be added by spamassassin) without having the full message sent
>> back (SYMBOLS doesn't return the score of each test). Or am I missing
>> something?
> 
> Use the 'REPORT' or 'REPORT_IFSPAM' spamd command instead of 'SYMBOLS'
> or 'PROCESS' to get the full score report but not the full message.
> 

This requires parsing the message.


I would like to get something like:

ALL_TRUSTED=-1.44,MISSING_SUBJECT=1.345

instead of (REPORT):

blahblah
...
...
Content analysis details:   (-0.1 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-1.4 ALL_TRUSTED            Passed through trusted hosts only via SMTP
 1.3 MISSING_SUBJECT        Missing Subject: header

or (SYMBOLS):

ALL_TRUSTED,MISSING_SUBJECT

PROCESS seems the answer. It's just unfortunate that it still echoes the
whole message. Can I just close the socket once I get my headers?
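
To make the "this requires parsing" point concrete, here is a rough
Python sketch that pulls the rows out of a default-looking REPORT table
and joins them as NAME=pts pairs. The name report_to_pairs and the row
regular expression are hypothetical, and the sketch assumes the table
layout shown above. Note that the table only carries rounded points
(-1.4), not the full score (-1.44).

```python
import re

# A table row is assumed to look like:
#   "-1.4 ALL_TRUSTED            Passed through trusted hosts only via SMTP"
# i.e. a number, a rule name, then some description text.
ROW = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Za-z0-9_]+)\s+\S")


def report_to_pairs(report_text):
    """Return 'NAME=pts,NAME=pts' built from the rows of a REPORT table."""
    pairs = []
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:
            pts, name = match.groups()
            pairs.append(f"{name}={pts}")
    return ",".join(pairs)


if __name__ == "__main__":
    sample = (
        "Content analysis details:   (-0.1 points, 5.0 required)\n"
        "\n"
        " pts rule name              description\n"
        "---- ---------------------- "
        "--------------------------------------------------\n"
        "-1.4 ALL_TRUSTED            Passed through trusted hosts only via SMTP\n"
        " 1.3 MISSING_SUBJECT        Missing Subject: header\n"
    )
    print(report_to_pairs(sample))  # ALL_TRUSTED=-1.4,MISSING_SUBJECT=1.3
```

Free-form preamble lines that happen to start with a number would also
match, so this is fragile compared with having spamd emit the pairs
itself.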

Re: socket SA is not fast enough, help

Posted by David B Funk <db...@engineering.uiowa.edu>.
On Tue, 28 Mar 2006, mouss wrote:

> Another thing is that I can't find a way to get the SA headers (as they
> would be added by spamassassin) without having the full message sent
> back (SYMBOLS doesn't return the score of each test). Or am I missing
> something?

Use the 'REPORT' or 'REPORT_IFSPAM' spamd command instead of 'SYMBOLS'
or 'PROCESS' to get the full score report but not the full message.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

Re: socket SA is not fast enough, help

Posted by Adam Lanier <ad...@krusty.madoff.com>.
On Wed, 2006-03-29 at 08:50 -0800, Kenneth Kim wrote:
> My spamassassin is running on a remote server, no way to get around
> this at the moment. I am connecting to spamd on the remote server
> using sockets in php. Unfortunately I have to close the socket to get
> a response/spam score from the server. Is there any other way to
> prompt a response from the server without closing the socket? I
> believe if I could figure this out, things could be sped up quite a
> bit. Is there a way to use spamc to maintain a connection to spamd,
> perhaps a way to send multiple messages? 

I believe what you are asking is whether the spamd socket connection can
be reused for multiple messages.  The answer is no: each processed
message requires the overhead of a new socket to spamd.

To do what you seem to want would require a multi-threaded,
connection-caching server that encapsulates the spamassassin/spamd
objects.

This is analogous to the mimedefang-multiplexor (and probably lots of
other anti-spam software that uses spamassassin), which run a number of
spamassassin objects in separate threads.
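
Since every message needs its own connection, a client loop ends up
looking roughly like the sketch below. It is written in Python rather
than PHP purely for illustration. The names exchange_on and score_each
are hypothetical, and the request bytes are assumed to be already framed
the way spamd expects. The half-close (shutdown of the write side only)
is what tells the peer that the request is complete while leaving the
socket readable for the reply.

```python
import socket


def exchange_on(sock, request):
    """Send one request on a connected socket and read the reply to EOF."""
    sock.sendall(request)
    # Half-close: we are done writing, but can still read the response.
    sock.shutdown(socket.SHUT_WR)

    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            # The peer has shut down its side; the reply is complete.
            break
        chunks.append(chunk)
    return b"".join(chunks)


def score_each(host, port, requests):
    """Open a fresh connection per request, as the thread describes."""
    replies = []
    for request in requests:
        with socket.create_connection((host, port), timeout=30) as sock:
            replies.append(exchange_on(sock, request))
    return replies
```

The per-message connect is the overhead discussed above; this sketch
does nothing to hide it.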

Re: socket SA is not fast enough, help

Posted by Kenneth Kim <kk...@yahoo.com>.
My spamassassin is running on a remote server, no way to get around
this at the moment. I am connecting to spamd on the remote server
using sockets in php. Unfortunately I have to close the socket to get
a response/spam score from the server. Is there any other way to
prompt a response from the server without closing the socket? I
believe if I could figure this out, things could be sped up quite a
bit. Is there a way to use spamc to maintain a connection to spamd,
perhaps a way to send multiple messages? 



--- mouss <us...@free.fr> wrote:

> Matt Kettler wrote:
> > Kenneth Kim wrote:
> >> I've found that SpamAssassin will not return a score until I close
> >> socket writing. Once I've closed the writing, in order to get a score
> >> for the next message, I have to reopen the connection in PHP to send
> >> another message to SA. I hope I'm wrong about this, but currently I'm
> >> bottlenecking because I have to reopen the connection. Is there
> >> any way for me to get a score w/o having to close socket writing?
> >> Possibly a command I can send at the end of/after each message?
> >
> > You should switch to using spamd directly if you want to do this with sockets.
> >
> > You can find the protocol that spamd speaks on its TCP socket in the PROTOCOL
> > docs that come with SA.
> > http://spamassassin.apache.org/full/3.1.x/dist/spamd/PROTOCOL
> >
> > The "spamassassin" command line script is particularly inefficient for this kind
> > of thing, and can handle only one message per call. Spamc has the same
> > one-message-per-call limit.
>
> My understanding is that he is talking about spamd, which doesn't allow
> socket reuse. The protocol doc says:
> 	After each side is done writing, it shuts down its side of the
> 	connection.
>
> I don't know if there are hard design issues, but if the socket could be
> reused, that would be good.
>
> Another thing is that I can't find a way to get the SA headers (as they
> would be added by spamassassin) without having the full message sent
> back (SYMBOLS doesn't return the score of each test). Or am I missing
> something?



Re: socket SA is not fast enough, help

Posted by mouss <us...@free.fr>.
Matt Kettler wrote:
> Kenneth Kim wrote:
>> I've found that SpamAssassin will not return a score until I close
>> socket writing. Once I've closed the writing, in order to get a score
>> for the next message, I have to reopen the connection in PHP to send
>> another message to SA. I hope I'm wrong about this, but currently I'm
>> bottlenecking because I have to reopen the connection. Is there
>> any way for me to get a score w/o having to close socket writing?
>> Possibly a command I can send at the end of/after each message?
> 
> You should switch to using spamd directly if you want to do this with sockets.
> 
> You can find the protocol that spamd speaks on its TCP socket in the PROTOCOL
> docs that come with SA.
> http://spamassassin.apache.org/full/3.1.x/dist/spamd/PROTOCOL
> 
> The "spamassassin" command line script is particularly inefficient for this kind
> of thing, and can handle only one message per call. Spamc has the same
> one-message-per-call limit.
> 
> 

My understanding is that he is talking about spamd, which doesn't allow
socket reuse. The protocol doc says:
	After each side is done writing, it shuts down its side of the
	connection.

I don't know if there are hard design issues, but if the socket could be
reused, that would be good.

Another thing is that I can't find a way to get the SA headers (as they
would be added by spamassassin) without having the full message sent
back (SYMBOLS doesn't return the score of each test). Or am I missing
something?

Re: socket SA is not fast enough, help

Posted by Matt Kettler <mk...@evi-inc.com>.
Kenneth Kim wrote:
> I've found that SpamAssassin will not return a score until I close
> socket writing. Once I've closed the writing, in order to get a score
> for the next message, I have to reopen the connection in PHP to send
> another message to SA. I hope I'm wrong about this, but currently I'm
> bottlenecking because I have to reopen the connection. Is there
> any way for me to get a score w/o having to close socket writing?
> Possibly a command I can send at the end of/after each message?

You should switch to using spamd directly if you want to do this with sockets.

You can find the protocol that spamd speaks on its TCP socket in the PROTOCOL
docs that come with SA.
http://spamassassin.apache.org/full/3.1.x/dist/spamd/PROTOCOL

The "spamassassin" command line script is particularly inefficient for this kind
of thing, and can handle only one message per call. Spamc has the same
one-message-per-call limit.
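
As a rough idea of what talking to spamd directly involves, the Python
sketch below frames a CHECK request and pulls the verdict out of the
reply. The framing (a "CHECK SPAMC/1.2" line, a Content-length header,
a blank line, then the message) and the "Spam: True ; 15.0 / 5.0"
response header are written from memory of the PROTOCOL document linked
above and should be checked against it. The function names are
hypothetical.

```python
import re


def build_check_request(message: bytes) -> bytes:
    """Frame a raw mail message as a CHECK request (assumed framing)."""
    header = b"CHECK SPAMC/1.2\r\nContent-length: %d\r\n\r\n" % len(message)
    return header + message


# Assumed response header form: "Spam: True ; 15.0 / 5.0"
SPAM_HEADER = re.compile(
    rb"^Spam:\s*(True|False|Yes|No)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)",
    re.IGNORECASE | re.MULTILINE,
)


def parse_check_response(response: bytes):
    """Return (is_spam, score, required) from a raw spamd reply."""
    match = SPAM_HEADER.search(response)
    if match is None:
        raise ValueError("no Spam: header found in response")
    verdict, score, required = match.groups()
    is_spam = verdict.lower() in (b"true", b"yes")
    return is_spam, float(score), float(required)


if __name__ == "__main__":
    print(build_check_request(b"Subject: hi\r\n\r\nbody\r\n"))
    sample = b"SPAMD/1.1 0 EX_OK\r\nSpam: True ; 15.0 / 5.0\r\n\r\n"
    print(parse_check_response(sample))  # (True, 15.0, 5.0)
```

CHECK is used here because it returns only the verdict and score
rather than the message itself.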