Posted to users@spamassassin.apache.org by rokdominko <ro...@pnv.si> on 2011/04/10 21:20:45 UTC

Retrieve specific words, phrases or sentences from mail body and subject

Hi!

I hope someone can help me with this issue. Our company is building a
mass-mailing application. We have taken every measure to ensure that our
clients aren't sending spam to their clients, but one part of the
application could still be used to send spam: the message subject and
body. We have been using SpamAssassin on our servers for some time now, so
we decided to pass each message through it before sending and check
whether it is classified as spam. If it is, we do not allow it to be sent.
The problem is that we want to tell our client exactly which words,
phrases or sentences are problematic, so we need SpamAssassin to return a
list of them.

We have written a PHP script that connects to the spamd process on our
server (on port 783). It checks the message without problems, and if the
message is spam it blocks sending. We use the 'REPORT' command to ask
spamd to evaluate the message. This gives us a nice report, but the report
lists the rules that were hit, not the exact words, phrases or sentences
that triggered them, which is what we need.
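
To make the setup concrete, here is a minimal Python sketch of the kind of
REPORT exchange described above (our real script is PHP). It assumes the
spamc/spamd protocol framing, i.e. a "REPORT SPAMC/1.2" request line with
a Content-length header and a "Spam: True ; score / threshold" header in
the reply. The function names and the default host are made up for
illustration.

```python
import re
import socket


def build_report_request(message: bytes) -> bytes:
    """Frame a raw mail message as a spamd REPORT request."""
    header = (
        b"REPORT SPAMC/1.2\r\n"
        + b"Content-length: " + str(len(message)).encode("ascii") + b"\r\n"
        + b"\r\n"
    )
    return header + message


def parse_report_response(response: bytes):
    """Return (is_spam, score, threshold, report_text) from a spamd reply."""
    head, _, body = response.partition(b"\r\n\r\n")
    match = re.search(rb"Spam:\s*(\w+)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)", head)
    if match is None:
        raise ValueError("no Spam: header in spamd response")
    is_spam = match.group(1).lower() in (b"true", b"yes")
    score = float(match.group(2))
    threshold = float(match.group(3))
    return is_spam, score, threshold, body.decode("utf-8", "replace")


def check_with_spamd(message: bytes, host: str = "127.0.0.1", port: int = 783):
    """Send the message to spamd and parse its reply (host is an assumption)."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_report_request(message))
        sock.shutdown(socket.SHUT_WR)  # signal end of request
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_report_response(b"".join(chunks))
```

The report text returned this way names the rules and their scores, which
is exactly the limitation described above: it does not say which part of
the message matched.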

So we need to get the problematic words, phrases or sentences out of
spamd. If that's not possible, maybe someone can suggest an alternative.
We hope someone can help!

Our specifications: SpamAssassin version 3.3.1, running on Perl version
5.10.1, running on Ubuntu server version 10.04.1 with Linux kernel
2.6.32-26-generic-pae.

Have a nice day,

Rok 


Re: Retrieve specific words, phrases or sentences from mail body and subject

Posted by Bowie Bailey <Bo...@BUC.com>.
On 4/10/2011 4:43 PM, John Hardin wrote:
> On Sun, 10 Apr 2011, rokdominko wrote:
>
>> The problem is, we want to tell our client exactly which words,
>> phrases or sentences are problematic, so we need Spamassassin to
>> return the list of these words, phrases or sentences, so that we can
>> tell our client what exactly is wrong with their message.
>>
>> We have written a PHP script, which connects to spamd process on our
>> server (on port 783) and it checks the message with no problems and
>> if it's spam it doesn't allow sending it.
>
> That level of detail isn't available via spamd. You'd have to run
> spamassassin in debug mode with rules tracing and then parse the results.
>
> Take a look at the output from:
>
>    spamassassin -t --debug area=rules < your_message_file
>

Keep in mind that with the '-t' flag, the report text is always worded as
if the mail were spam.  You will need to ignore this and look at the
score instead.
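
As a rough sketch of "focus on the score", the Python below pulls the score
and threshold out of the report and compares them itself. It assumes the
report contains a line of the form "Content analysis details: (N points, M
required)", as the default report template prints; a customised template
would need a different pattern. The function name is made up for
illustration.

```python
import re

# Assumed format of the summary line in the default report template.
DETAILS_RE = re.compile(
    r"Content analysis details:\s*\((-?[\d.]+) points, (-?[\d.]+) required\)"
)


def score_from_report(report: str):
    """Return (score, required, is_spam), or None if no summary line is found."""
    match = DETAILS_RE.search(report)
    if match is None:
        return None
    score = float(match.group(1))
    required = float(match.group(2))
    return score, required, score >= required
```

For example, a report line reading "(3.2 points, 5.0 required)" would come
back as not spam, whatever the surrounding report text says.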

-- 
Bowie

Re: Retrieve specific words, phrases or sentences from mail body and subject

Posted by John Hardin <jh...@impsec.org>.
On Sun, 10 Apr 2011, rokdominko wrote:

> The problem is, we want to tell our client exactly which words, phrases 
> or sentences are problematic, so we need Spamassassin to return the list 
> of these words, phrases or sentences, so that we can tell our client 
> what exactly is wrong with their message.
>
> We have written a PHP script, which connects to spamd process on our server
> (on port 783) and it checks the message with no problems and if it's spam it
> doesn't allow sending it.

That level of detail isn't available via spamd. You'd have to run 
spamassassin in debug mode with rules tracing and then parse the results.

Take a look at the output from:

    spamassassin -t --debug area=rules < your_message_file
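
A minimal Python sketch of the "parse the results" step follows. It assumes
the debug lines look like

    dbg: rules: ran body rule RULE_NAME ======> got hit: "matched text"

and that the debug output arrives on stderr. Check both against your own
3.3.1 output before relying on it. The function names are made up for
illustration, and the command line is simply the one shown above.

```python
import re
import subprocess

# Assumed shape of a rule-hit line in the debug output.
HIT_RE = re.compile(r'dbg: rules: ran (\w+) rule (\S+) =+> got hit: "(.*)"')


def extract_hits(debug_output: str):
    """Return a list of (rule_type, rule_name, matched_text) tuples."""
    hits = []
    for line in debug_output.splitlines():
        match = HIT_RE.search(line)
        if match:
            hits.append(match.groups())
    return hits


def run_and_extract(message: bytes):
    """Run spamassassin on one message and collect the rule hits."""
    result = subprocess.run(
        ["spamassassin", "-t", "--debug", "area=rules"],
        input=message,
        capture_output=True,
    )
    # Debug lines are assumed to be on stderr; stdout holds the message.
    return extract_hits(result.stderr.decode("utf-8", "replace"))
```

Rule names starting with a double underscore are sub-rules that feed into
other rules, so you will probably want to map them to the scored rule, or
filter them out, before showing anything to a client.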

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org    FALaholic #11174     pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
   ...every time I sit down in front of a Windows machine I feel as
   if the computer is just a place for the manufacturers to put their
   advertising.                                 -- fwadling on Y! SCOX
-----------------------------------------------------------------------
  3 days until Thomas Jefferson's 268th Birthday