Posted to users@spamassassin.apache.org by Paul Boven <p....@chello.nl> on 2006/06/27 16:13:25 UTC

Airline reservations get tagged

Hi everyone,

Although our SA setup works very well in general, one issue that has 
come up a few times recently is airline E-tickets/reservations. These 
tend to be ALL CAPS and have quite a few other trigger words. Our 
company seems to do business with more than one travel-agent, so just 
whitelisting isn't quite enough. These mails hit the following rules:

X-Spam-Score: ***** (5.696) BAYES_99,HTML_30_40,HTML_MESSAGE,NO_REAL_NAME,
  SARE_OBFU_TBL_03,UPPERCASE_50_75,autolearn=no

Given the sensitive nature of these emails, I'd rather not post them on 
the list. My question is: do other people get the same FPs? Do any of 
the current rules need to take this into account? A publicly available 
negative scoring rule would probably just be abused by the spammers, so 
what would be the best way to fix this, not just for me but in general?

Regards, Paul Boven.

Re: Airline reservations get tagged

Posted by Hamish <ha...@travellingkiwi.com>.
On Wednesday 28 June 2006 08:48, Ralf Hildebrandt wrote:
> * hamann.w@t-online.de <ha...@t-online.de>:
> > Given that airline messages are important, are related to money, and
> > recipients don't want to get forged ones, it would be a great idea to
> > start a campaign with airlines / travel agents to use some sort of
> > proof of origin (SPF, digital signature, whatnot). Recipients could then
> > apply whitelists
>
> Amen to that!

Does SA do anything with digital signatures to deduct scores? If it's 
worthwhile, I'm game to play.

  Hamish.

Re: Airline reservations get tagged

Posted by Ralf Hildebrandt <Ra...@charite.de>.
* hamann.w@t-online.de <ha...@t-online.de>:

> Given that airline messages are important, are related to money, and
> recipients don't want to get forged ones, it would be a great idea to
> start a campaign with airlines / travel agents to use some sort of
> proof of origin (SPF, digital signature, whatnot). Recipients could then
> apply whitelists

Amen to that!
-- 
Ralf Hildebrandt (i.A. des IT-Zentrums)         Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to spamtrap@charite.de

Re: Airline reservations get tagged

Posted by Loren Wilton <lw...@earthlink.net>.
> Although our SA setup works very well in general, one issue that has
> come up a few times recently is airline E-tickets/reservations.

Travel stuff in general seems to be designed specifically to hit as many
spam rules as possible.  *Everything* from Travelocity and Alaska Air gets
around 20 points here on average.  Ticket confirmations add more from the
all-caps subjects and the like.

> X-Spam-Score: ***** (5.696) BAYES_99,HTML_30_40,HTML_MESSAGE,NO_REAL_NAME,
>   SARE_OBFU_TBL_03,UPPERCASE_50_75,autolearn=no

In your specific case I'd fix the BAYES_99 problem, and that should get you
clean.  There isn't any particular reason I can think of why ticket
confirmations should be getting tagged as spam by Bayes.

        Loren
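
As a rough check of the suggestion above, the arithmetic can be sketched in
Python. The 5.696 total comes from the quoted X-Spam-Score header, and the
5.0 for BAYES_99 is the local override Paul lists elsewhere in this thread.
The 5.0 threshold is an assumption (SpamAssassin's stock required_score);
the thread never states the site's actual setting.

```python
# Total taken from the X-Spam-Score header quoted in the thread.
total_score = 5.696

# Site-specific override reported by Paul: "score BAYES_99 5.0".
bayes_99_score = 5.0

# ASSUMPTION: the site uses SpamAssassin's default threshold of 5.0.
required_score = 5.0

# What the message would score if Bayes no longer hit BAYES_99
# (and, for simplicity, hit no other scoring BAYES_* rule instead).
score_without_bayes = round(total_score - bayes_99_score, 3)

print("score without BAYES_99:", score_without_bayes)
print("still tagged as spam:", score_without_bayes >= required_score)
```

Under those assumptions the remaining rules contribute well under one point,
which is consistent with the advice that fixing the Bayes hit alone should be
enough for this particular message.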


Re: Airline reservations get tagged

Posted by Paul Boven <p....@chello.nl>.
Hi Loren, everyone

Loren Wilton wrote:
>> bayes token 'visa' => 0.997839158297152
>> bayes token 'refund' => 0.997646909307943
>> bayes token 'drinks' => 0.997585038685398
>> bayes token 'NUMBER' => 0.990398319296953
>> bayes token 'nights' => 0.98853871069642
> 
> This suggests you are still on 2.6x.  It is possible that upgrading to 3.x
> or 3.1.x might get spam scores more in alignment with your actual incoming
> mail.

No, we're running SA 3.04 here. This is the output from running 
"spamassassin -D" on an email, not a dump from the (now hashed) Bayes database.

Regards, Paul Boven.

Re: Airline reservations get tagged

Posted by Loren Wilton <lw...@earthlink.net>.
> bayes token 'visa' => 0.997839158297152
> bayes token 'refund' => 0.997646909307943
> bayes token 'drinks' => 0.997585038685398
> bayes token 'NUMBER' => 0.990398319296953
> bayes token 'nights' => 0.98853871069642

This suggests you are still on 2.6x.  It is possible that upgrading to 3.x
or 3.1.x might get spam scores more in alignment with your actual incoming
mail.

        Loren


Re: Airline reservations get tagged

Posted by Paul Boven <p....@chello.nl>.
Hi Ralf,

Ralf Hildebrandt wrote:

>> Although our SA setup works very well in general, one issue that has 
>> come up a few times recently is airline E-tickets/reservations. These 
>> tend to be ALL CAPS and have quite a few other trigger words. Our 
>> company seems to do business with more than one travel-agent, so just 
>> whitelisting isn't quite enough. These mails hit the following rules:
>>
>> X-Spam-Score: ***** (5.696) BAYES_99,HTML_30_40,HTML_MESSAGE,NO_REAL_NAME,
>>  SARE_OBFU_TBL_03,UPPERCASE_50_75,autolearn=no
> 
> You could feed these to the bayes DB as "ham"

You are right, of course. But Bayes is more of a statistical tool, and 
given the total number of mails stored in Bayes already, I fear it will 
take quite a bit of learning to offset the current high scoring.

Our current Bayes setup is:
Company-wide Bayes database
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -0.1
score BAYES_99 5.0
score BAYES_95 4.0

Perhaps I should lower my BAYES_99 and BAYES_95 a bit, though these 
settings are based on past experience where Bayes alone was not able to 
put clearly spammy mails over the threshold.

These E-tickets just look terribly spammy to Bayes because of the 
language used, it seems. Some high-scoring words for this one are:

bayes token 'visa' => 0.997839158297152
bayes token 'refund' => 0.997646909307943
bayes token 'drinks' => 0.997585038685398
bayes token 'NUMBER' => 0.990398319296953
bayes token 'nights' => 0.98853871069642

Regards, Paul Boven.
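
To get a feel for how much ham learning it might take to move a token like
'visa' or 'refund', here is a minimal Python sketch. It uses a simplified
relative-frequency estimate for a single token, not SpamAssassin's actual
Bayes code (which applies further adjustments and then combines many tokens).
The function name and all counts are hypothetical and chosen only for
illustration.

```python
def token_spam_probability(spam_hits, ham_hits, total_spam, total_ham):
    """Simplified per-token estimate: how much more often the token is
    seen in learned spam than in learned ham (not SpamAssassin's exact math)."""
    spam_ratio = spam_hits / total_spam
    ham_ratio = ham_hits / total_ham
    return spam_ratio / (spam_ratio + ham_ratio)


# HYPOTHETICAL corpus: 10000 learned spam and 10000 learned ham messages,
# with the token present in 400 of the spam messages.
total_spam, total_ham = 10000, 10000
spam_hits = 400

# Vary how many learned ham messages contain the token. The corpus totals
# are held fixed for simplicity; a few extra ham messages barely change them.
for ham_hits in (1, 5, 20, 50):
    p = token_spam_probability(spam_hits, ham_hits, total_spam, total_ham)
    print(f"ham messages containing token: {ham_hits:3d} -> p(spam) = {p:.3f}")
```

In this simplified model the estimate depends on how often the token itself
has been seen in ham, not on the overall size of the database, so the amount
of learning needed depends mainly on how rare the token currently is in ham.
Whether that carries over to a real, large company-wide database would need
to be checked against the actual token counts.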

Re: Airline reservations get tagged

Posted by Ralf Hildebrandt <Ra...@charite.de>.
* Paul Boven <p....@chello.nl>:
> Hi everyone,
> 
> Although our SA setup works very well in general, one issue that has 
> come up a few times recently is airline E-tickets/reservations. These 
> tend to be ALL CAPS and have quite a few other trigger words. Our 
> company seems to do business with more than one travel-agent, so just 
> whitelisting isn't quite enough. These mails hit the following rules:
> 
> X-Spam-Score: ***** (5.696) BAYES_99,HTML_30_40,HTML_MESSAGE,NO_REAL_NAME,
>  SARE_OBFU_TBL_03,UPPERCASE_50_75,autolearn=no

You could feed these to the bayes DB as "ham"

-- 
Ralf Hildebrandt (i.A. des IT-Zentrums)         Ralf.Hildebrandt@charite.de
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to spamtrap@charite.de
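
The "feed these to the bayes DB as ham" suggestion above can be sketched with
a small Python wrapper around the sa-learn command that ships with
SpamAssassin. The helper names and the file path in the usage note are
hypothetical. The sketch also assumes sa-learn is on the PATH and is run as
the user that owns the company-wide Bayes database, which the thread does not
spell out.

```python
import subprocess


def sa_learn_ham_command(message_paths):
    """Build the sa-learn command line that learns the given files as ham."""
    return ["sa-learn", "--ham", *message_paths]


def learn_as_ham(message_paths):
    """Run sa-learn on saved messages; raises CalledProcessError on failure."""
    # ASSUMPTION: each path is a file holding one complete raw message,
    # headers included, as it arrived before any local rewriting.
    return subprocess.run(sa_learn_ham_command(message_paths), check=True)
```

For example, learn_as_ham(["/tmp/eticket-1.eml"]) would run
"sa-learn --ham /tmp/eticket-1.eml" on a saved copy of a misclassified
E-ticket.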