Posted to users@spamassassin.apache.org by Jeferson Pessoa Santana <je...@dgx.com.br> on 2006/03/16 14:37:49 UTC

Sender base

Hi guys,

Does anyone know how to use Sender Base to check e-mails with SpamAssassin?

For those who don't know Sender Base: www.senderbase.org.

Thanks,

Jeff

Re: Sender base

Posted by Theo Van Dinter <fe...@apache.org>.

On Thu, Mar 16, 2006 at 10:37:49AM -0300, Jeferson Pessoa Santana wrote:
> Does anyone know how to use Sender Base to check e-mails with SpamAssassin?
> For those who don't know Sender Base: www.senderbase.org.

It's actually pretty easy; however, the owners of SenderBase have
specifically asked us not to include rules that use their service.  So if
it's not stated as such on the site, I believe they do not want the
information used for large-scale filtering.

-- 
Randomly Generated Tagline:
"Many people equate the word 'daemon' with the word 'demon,' implying some
 kind of Satanic connection between UNIX and the underworld." - Evi Nemeth

Re: Sender base

Posted by mouss <us...@free.fr>.

Bret Miller wrote:
> 
> They have rationale. You can read it. They do not, however, as far as I
> know, state exactly what makes them think a message they receive is spam
> or legitimate.
> 
> To quote them:
> 
> "MXRate is not really a blacklist in the traditional sense. Our system
> analyzes data submitted using automated procedures. It calculates a
> probability score based on the overall message sending pattern of any
> particular server and that is used as a basis of an opinion as to
> whether or not the address is a source of spam. What we do is publish a
> recommendation  based on that opinion. Certainly, we can publish a
> recommendation that the message be blocked or highly penalized, however
> we can also recommend that it simply be treated with suspicion, or even
> treated as a known source of legitimate mail."
> 
> "MXRate does this differently. First, we do not accept subjective spam
> reports. Secondly, not only do we track spam, we also track legitimate
> email. This allows us to evaluate the current sending patterns of a mail
> server. In the example above, the ISP would undoubtedly have some
> statistics being maintained by us on the ratio of legitimate messages to
> spam messages, and their recent frequency. So while the spammer might
> have caused a temporary "block" recommendation, this would probably only
> last a few hours after the spamming stops."
> 
> "So, in other words, our intention is not to provide a database of
> addresses of anyone who has ever sent spam, it is to provide a database
> of addresses currently sending spam."
> 

I see. I'll have to take a closer look. Thanks for the info.

> 
> 
>>ahah. so you have evidence that it flags legit mail, but you still use
>>and recommend it???
> 
> 
> Indeed I do. And SpamAssassin itself includes and uses by default a list
> based on SpamCop, which is notoriously unreliable at determining exactly
> what is spam and what is not, because it's based on the reports of
> average users many of whom don't understand the difference between "a
> mailing list I no longer want to receive" and "unsolicited bulk
> e-mail".
> 

I agree here. I do know that some people submit as spam mail they didn't
check (including mail sent to mailing lists). But I've seen worse: I
recently received a "complaint" from a wannabe SpamCop system, a
complaint sent to postmaster (instead of abuse) for a message I sent to
an ML. The message was a bug report (it was probably too short for
one ML subscriber who apparently doesn't understand English) which could
easily be found in the ML archives. Of course, they didn't even check
the Received headers (which clearly showed that the message was
forwarded by the ML system). And more fun: they obfuscated some of the
data, but left enough information to determine the submitter.


Now, SpamCop has automatic expiration, which somewhat keeps it [almost]
usable. Anyway, I'll disable it soon.

> Whether something increases the spam probability on some small
> percentage of legitimate e-mail isn't the determining factor in whether
> it's useful in determining the probability that a message is spam or
> legitimate. The SpamCop list is still useful because it flags
> considerably more spam than ham. You wouldn't want to score it at 5.0,
> because there are false positives. But there's no reason why you can't
> use it.
> 
> These lists are really (as far as I'm concerned) about the same as
> flagging certain words or phrases or message headers as "more likely to
> be in spam".
> 
> 
> So, in your world, if a rule, or blacklist ever hits on legitimate
> e-mail, it shouldn't ever be used again?

That's not what I meant, so let me clarify my opinion:
- if I don't understand how they list, I don't use their list
- if they do nothing to fix FPs, I don't use their list
- and whatever their listing policy is, if the ratio of false positives
isn't minimal, then I'd say no.

Otherwise, randomly removing 1 of every N messages would qualify as a
filter: it removes plenty of spam with no overhead (N=1 removes all spam).

Many spam IPs are listed on Blars and SORBS, but I won't use these. Of
course, SURBL/URIBL do have FPs, but they will fix the issue as soon as
they know. So it's not just a "current ratio" issue. The policy is
important.

RE: Sender base

Posted by Bret Miller <br...@wcg.org>.

> > I use it or I wouldn't have known how to use it. ;)
> > 
> > Like most other DNS blacklists, I wouldn't trust it completely. Whether
> > something is "spam" depends largely on the perception of the person
> > reviewing it. I don't know how you get in one list or another for
> > mxrate,
> 
> If you don't know their listing policy, what's the rationale in using
> their list?

They have rationale. You can read it. They do not, however, as far as I
know, state exactly what makes them think a message they receive is spam
or legitimate.

To quote them:

"MXRate is not really a blacklist in the traditional sense. Our system
analyzes data submitted using automated procedures. It calculates a
probability score based on the overall message sending pattern of any
particular server and that is used as a basis of an opinion as to
whether or not the address is a source of spam. What we do is publish a
recommendation  based on that opinion. Certainly, we can publish a
recommendation that the message be blocked or highly penalized, however
we can also recommend that it simply be treated with suspicion, or even
treated as a known source of legitimate mail."

"MXRate does this differently. First, we do not accept subjective spam
reports. Secondly, not only do we track spam, we also track legitimate
email. This allows us to evaluate the current sending patterns of a mail
server. In the example above, the ISP would undoubtedly have some
statistics being maintained by us on the ratio of legitimate messages to
spam messages, and their recent frequency. So while the spammer might
have caused a temporary "block" recommendation, this would probably only
last a few hours after the spamming stops."

"So, in other words, our intention is not to provide a database of
addresses of anyone who has ever sent spam, it is to provide a database
of addresses currently sending spam."


> ahah. so you have evidence that it flags legit mail, but you still use
> and recommend it???

Indeed I do. And SpamAssassin itself includes and uses by default a list
based on SpamCop, which is notoriously unreliable at determining exactly
what is spam and what is not, because it's based on the reports of
average users many of whom don't understand the difference between "a
mailing list I no longer want to receive" and "unsolicited bulk
e-mail".

Whether something increases the spam probability on some small
percentage of legitimate e-mail isn't the determining factor in whether
it's useful in determining the probability that a message is spam or
legitimate. The SpamCop list is still useful because it flags
considerably more spam than ham. You wouldn't want to score it at 5.0,
because there are false positives. But there's no reason why you can't
use it.

These lists are really (as far as I'm concerned) about the same as
flagging certain words or phrases or message headers as "more likely to
be in spam".

> 
> > So, if you use it, score it
> > where it helps you increase or decrease the score, but realize it will
> > likely flag some real e-mail as "recommend block". It did here. So I
> > treat it as "more likely to be spam" rather than "block this e-mail".
> 
> If this isn't cargo cult, I wonder what to call it...

So, in your world, if a rule, or blacklist ever hits on legitimate
e-mail, it shouldn't ever be used again?

Just curious...

Bret

Re: Sender base

Posted by mouss <us...@free.fr>.

Bret Miller wrote:
> 
> I use it or I wouldn't have known how to use it. ;)
> 
> Like most other DNS blacklists, I wouldn't trust it completely. Whether
> something is "spam" depends largely on the perception of the person
> reviewing it. I don't know how you get in one list or another for
> mxrate,

If you don't know their listing policy, what's the rationale in using
their list?

I have designed the fastest BL in the world (I didn't patent it,
because I think many people have designed the same or a similar system):
no IPC, no fork, no parsing:
	- if the last quad is odd, it's not listed
	- if it's even, it's listed
Try this and you'll see that it blocks a lot of spam, a lot more than
many DNSBLs. And it contains many more listed IPs than any other list
(even more than all lists combined!), since it contains 2^31 IPs.

Note that this list includes 127.0.0.2, so it is testable. It doesn't
include 127.0.0.1, so it's "safe":)
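
For illustration only, the parity "list" described above can be sketched in
a few lines of Python. The function name is made up for this example. The
point of the sketch is that such a list "catches" roughly half of all spam
while also catching roughly half of all legitimate mail, so raw hit counts
say nothing about quality.

```python
import ipaddress


def parity_bl_listed(ip: str) -> bool:
    """Hypothetical helper: an IPv4 address is 'listed' when its last quad is even."""
    # IPv4Address validates the input; the low 8 bits are the last quad.
    last_quad = int(ipaddress.IPv4Address(ip)) & 0xFF
    return last_quad % 2 == 0


if __name__ == "__main__":
    # 127.0.0.2 is the conventional DNSBL test entry; 127.0.0.1 must never be listed.
    for addr in ("127.0.0.2", "127.0.0.1", "192.0.2.10"):
        print(addr, "listed" if parity_bl_listed(addr) else "not listed")
```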

> but I do know that a number of messages get flagged somewhat
> incorrectly at least in my perception of it.

ahah. so you have evidence that it flags legit mail, but you still use
and recommend it???

> So, if you use it, score it
> where it helps you increase or decrease the score, but realize it will
> likely flag some real e-mail as "recommend block". It did here. So I
> treat it as "more likely to be spam" rather than "block this e-mail".

If this isn't cargo cult, I wonder what to call it...

Re: Sender base

Posted by Jeferson Pessoa Santana <je...@dgx.com.br>.

Bret Miller wrote:

>>>Not SenderBase specifically, but MXRate is a similar service. See
>>>www.mxrate.com for information.
>>>
>>Ah, Alligate, the server that "test incoming mail for RFC compliance",
>>but lacks a proper behaviour on RSET.
>>
>>I hope this service is more sane. Anybody with experience with MX? Might
>>be worth a try.
>>
>
>I use it or I wouldn't have known how to use it. ;)
>
>Like most other DNS blacklists, I wouldn't trust it completely. Whether
>something is "spam" depends largely on the perception of the person
>reviewing it. I don't know how you get in one list or another for
>mxrate, but I do know that a number of messages get flagged somewhat
>incorrectly at least in my perception of it. So, if you use it, score it
>where it helps you increase or decrease the score, but realize it will
>likely flag some real e-mail as "recommend block". It did here. So I
>treat it as "more likely to be spam" rather than "block this e-mail".
>
>Bret
>
Thanks, guys. I'm reading the MXRate documentation, and it seems to me
that this service is better than SenderBase. =D

RE: Sender base

Posted by Bret Miller <br...@wcg.org>.

> > Not SenderBase specifically, but MXRate is a similar service. See
> > www.mxrate.com for information.
> 
> Ah, Alligate, the server that "test incoming mail for RFC compliance",
> but lacks a proper behaviour on RSET.
> 
> I hope this service is more sane. Anybody with experience 
> with MX? Might be worth a try.

I use it or I wouldn't have known how to use it. ;)

Like most other DNS blacklists, I wouldn't trust it completely. Whether
something is "spam" depends largely on the perception of the person
reviewing it. I don't know how you get in one list or another for
mxrate, but I do know that a number of messages get flagged somewhat
incorrectly at least in my perception of it. So, if you use it, score it
where it helps you increase or decrease the score, but realize it will
likely flag some real e-mail as "recommend block". It did here. So I
treat it as "more likely to be spam" rather than "block this e-mail".

Bret

Re: Sender base

Posted by Jakob Hirsch <jh...@plonk.de>.

Bret Miller wrote:

> Not SenderBase specifically, but MXRate is a similar service. See
> www.mxrate.com for information.

Ah, Alligate, the server that "test incoming mail for RFC compliance",
but lacks a proper behaviour on RSET.

I hope this service is more sane. Anybody with experience with MX? Might
be worth a try.

RE: Sender base

Posted by Bret Miller <br...@wcg.org>.

> Jeferson Pessoa Santana wrote:
> > Hi guys,
> >
> > Does anyone know how to use Sender Base to check e-mails with 
> > SpamAssassin?
> >
> > For those who don't know Sender Base: www.senderbase.org.
> 
> And to quote senderbase:
> 
> "SenderBase is a reporting tool used by network 
> administrators to investigate traffic flows over the Internet 
> and does not operate a blacklist."

Not SenderBase specifically, but MXRate is a similar service. See
www.mxrate.com for information.

It works like other DNS-based black/white lists:

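# Base lookup; substitute the actual MXRate DNS zone for the placeholder.
# The whole header rule must stay on a single line.
header   __RCVD_IN_MXRATE     eval:check_rbl('mxrate', 'mxrate-server-address-here')
describe __RCVD_IN_MXRATE     Received via a relay in MXRate
tflags   __RCVD_IN_MXRATE     net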

header   RCVD_IN_MXRATE_BL    eval:check_rbl_sub('mxrate', '127.0.0.2')
describe RCVD_IN_MXRATE_BL    MXRate recommends blocking
tflags   RCVD_IN_MXRATE_BL    net
score    RCVD_IN_MXRATE_BL    2.0

header   RCVD_IN_MXRATE_WL    eval:check_rbl_sub('mxrate', '127.0.0.3')
describe RCVD_IN_MXRATE_WL    MXRate recommends allowing
tflags   RCVD_IN_MXRATE_WL    net
score    RCVD_IN_MXRATE_WL    -1.0

header   RCVD_IN_MXRATE_S     eval:check_rbl_sub('mxrate', '127.0.0.4')
describe RCVD_IN_MXRATE_S     MXRate sees as suspicious
tflags   RCVD_IN_MXRATE_S     net
score    RCVD_IN_MXRATE_S     1.0
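
As a rough sketch of what a DNS-based list lookup like the rules above
involves, the following Python builds the query name (reversed IPv4 octets
plus the list's zone) and maps the returned address to a recommendation.
The function names and the zone "dnsbl.example" are placeholders invented
for this example, not the real MXRate zone. The three return codes are
taken from the rules above, and treating "no answer" as "no opinion" is the
usual DNSBL convention, assumed here rather than confirmed for MXRate.

```python
import ipaddress

# Return codes as used in the rules above (meanings from the rule descriptions).
MXRATE_CODES = {
    "127.0.0.2": "block",
    "127.0.0.3": "allow",
    "127.0.0.4": "suspicious",
}


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the name a DNSBL client looks up: reversed octets, then the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def interpret(answer):
    """Map an A-record answer (None when the name does not resolve) to a recommendation."""
    if answer is None:
        return "no opinion"  # assumption: an unlisted address gets no answer
    return MXRATE_CODES.get(answer, "unknown code")


if __name__ == "__main__":
    # 'dnsbl.example' is a placeholder zone for illustration only.
    print(dnsbl_query_name("192.0.2.10", "dnsbl.example"))
    print(interpret("127.0.0.2"))
```

The sketch performs no DNS traffic itself. In SpamAssassin the lookup is
done by check_rbl, and check_rbl_sub only matches the returned address.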

Re: Sender base

Posted by Matt Kettler <mk...@comcast.net>.

Jeferson Pessoa Santana wrote:
> Hi guys,
>
> Does anyone know how to use Sender Base to check e-mails with
> SpamAssassin?
>
> For those who don't know Sender Base: www.senderbase.org.

And to quote senderbase:

"SenderBase is a reporting tool used by network administrators to
investigate traffic flows over the Internet and does not operate a
blacklist."