Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2004/01/19 05:29:11 UTC
[Bug 2948] New: URI tests against DNSBL and RHSBL
http://bugzilla.spamassassin.org/show_bug.cgi?id=2948
Summary: URI tests against DNSBL and RHSBL
Product: Spamassassin
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P5
Component: spamassassin
AssignedTo: spamassassin-dev@incubator.apache.org
ReportedBy: jdl@imaginenet.net
Many additional spam e-mails can be caught if the URIs are extracted from the
message and the FQDN portion is compared against one or more DNSBLs and/or
RHSBLs. Of course, the FQDN would have to be resolved to an IP address before
testing against a DNSBL. Some sort of caching and de-duplication would probably
be needed to prevent excessive queries.
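A minimal sketch of the extraction step described above, in Python for illustration only (SpamAssassin itself is written in Perl). The zone names, the regular expression, and the function names are all assumptions made for this example; no DNS query is sent, the code only builds the names that would be queried.

```python
import re
from urllib.parse import urlsplit

# Hypothetical zone names, used only for illustration.
RHSBL_ZONE = "rhsbl.example.org"
DNSBL_ZONE = "dnsbl.example.org"

# Deliberately simple pattern: http/https URIs up to whitespace or a quote.
URI_RE = re.compile(r"""https?://[^\s<>"']+""", re.IGNORECASE)


def extract_hosts(body):
    """Return the unique host names found in URIs, in first-seen order."""
    seen = {}
    for uri in URI_RE.findall(body):
        try:
            host = urlsplit(uri).hostname  # already lowercased by urlsplit
        except ValueError:
            continue  # malformed URI, e.g. a broken IPv6 literal
        if host:
            seen.setdefault(host.rstrip("."), None)
    return list(seen)


def rhsbl_query(host, zone=RHSBL_ZONE):
    """An RHSBL is keyed by name: prepend the host to the zone."""
    return f"{host}.{zone}"


def dnsbl_query(ipv4, zone=DNSBL_ZONE):
    """A DNSBL is keyed by address: reverse the IPv4 octets, then append the zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone
```

De-duplicating before any lookup means a message that repeats one host a hundred times costs one query, not a hundred.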
Re: [Bug 2948] New: URI tests against DNSBL and RHSBL
Posted by Marc Perkel <ma...@perkel.com>.
Yes - but invisible text is also a dead giveaway of spam. By forcing
spammers to be clever, it makes them do other things that make them
identifiable. So - I would welcome such a ploy.
Besides - my server is fast, and even with some DNS lookups it's still
faster than me reading it.
Also - if the spam contains an unusually high number of what would be
artificial links - that could also be detected. If there were say more
than 6 links then we could check the first three and the last three.
That would catch most of them - the rest would be caught by other rules.
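The "first three and last three" rule could look roughly like the sketch below. The function name and its parameters are made up for this example, and the thresholds simply mirror the numbers in the message.

```python
def pick_links_to_check(links, max_links=6, head=3, tail=3):
    """Return the links to look up: all of them when there are few,
    otherwise only the first `head` and the last `tail`."""
    links = list(links)
    if len(links) <= max_links:
        return links
    return links[:head] + links[-tail:]
```

This caps the lookups per message at six regardless of how many links are attached, at the price of never examining the ones in the middle.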
Dan DeVoe wrote:
>On Tue, 20 Jan 2004, Marc Perkel wrote:
>
>Spammers also want you to read the text of their message, but this doesn't
>stop them from including text just to foil spam filters; invisible,
>visible, at the bottom, at the top, in the middle, even between WORDS
>just to muck with spam filters. If, for each parsed spam, I knew I could
>cause the remote MX to generate arbitrary DNS lookups, not only do I
>potentially have a way to DoS _that_ system, I also have a way to
>potentially DoS any nameserver out there.
>
>I'm not saying that designing a system to detect spammer URLs and score
>against them isn't possible or even a good idea, simply to suggest that it
>will need to be a LOT more intelligent than simply "pull out all the URLs,
>perform DNS queries".
>
>Your reasoning that spammers will never risk confusing the user for the
>sake of wreaking havoc makes far too many assumptions about the mindset
>of the spammer based on the evidence we have in 2004, IMNSHO.
>
>>I think that a reverse lookup on URLs would create a lot of good token
>>data for the bayesian filter and would add to the accuracy of
>>identifying spam.
>
>Well, to each their own I suppose, but I would not consider running
>anything like this unless I'd seen good hard evidence that it could not
>be used nefariously when deployed on a large scale. So far I've seen
>nothing like that.
Re: [Bug 2948] New: URI tests against DNSBL and RHSBL
Posted by Dan DeVoe <dd...@zeus.netset.com>.
On Tue, 20 Jan 2004, Marc Perkel wrote:
> I don't agree with you on this because I think it would be unlikely that
> a spammer would include a lot of other URLs. The reason is that the spammer
> wants you to click on the URL that takes you to their web site - so
> they can sell you something - so other URLs compete with that link and
> would reduce the spammer's sales.
Spammers also want you to read the text of their message, but this doesn't
stop them from including text just to foil spam filters; invisible,
visible, at the bottom, at the top, in the middle, even between WORDS
just to muck with spam filters. If, for each parsed spam, I knew I could
cause the remote MX to generate arbitrary DNS lookups, not only do I
potentially have a way to DoS _that_ system, I also have a way to
potentially DoS any nameserver out there.
I'm not saying that designing a system to detect spammer URLs and score
against them isn't possible or even a good idea, simply to suggest that it
will need to be a LOT more intelligent than simply "pull out all the URLs,
perform DNS queries".
Your reasoning that spammers will never risk confusing the user for the
sake of wreaking havoc makes far too many assumptions about the mindset
of the spammer based on the evidence we have in 2004, IMNSHO.
> I think that a reverse lookup on URLs would create a lot of good token
> data for the bayesian filter and would add to the accuracy of
> identifying spam.
Well, to each their own I suppose, but I would not consider running
anything like this unless I'd seen good hard evidence that it could not
be used nefariously when deployed on a large scale. So far I've seen
nothing like that.
--
.''`. Daniel DeVoe <dd...@netset.com>
: :' : http://www.netset.com/~ddevoe
`. `'`
`- Debian - when you have better things to do than fix a system
Re: [Bug 2948] New: URI tests against DNSBL and RHSBL
Posted by Marc Perkel <ma...@perkel.com>.
I don't agree with you on this because I think it would be unlikely that
a spammer would include a lot of other URLs. The reason is that the spammer
wants you to click on the URL that takes you to their web site - so
they can sell you something - so other URLs compete with that link and
would reduce the spammer's sales.
I think that a reverse lookup on URLs would create a lot of good token
data for the bayesian filter and would add to the accuracy of
identifying spam.
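As a rough illustration of what such token data might look like, the sketch below turns a URI host, its resolved address, and its reverse (PTR) name into strings a Bayesian classifier could count. The token prefixes and the function name are invented for this example and are not SpamAssassin's actual token format; the lookups themselves are assumed to have been done elsewhere.

```python
def uri_dns_tokens(host, ip=None, ptr=None):
    """Build classifier tokens from a URI host and optional DNS results."""
    tokens = [f"urihost:{host.lower()}"]
    if ip:
        tokens.append(f"uriip:{ip}")
        # The /24 prefix groups hosts that sit in the same IPv4 address block.
        tokens.append("urinet:" + ".".join(ip.split(".")[:3]))
    if ptr:
        tokens.append(f"uriptr:{ptr.lower().rstrip('.')}")
    return tokens
```

For example, `uri_dns_tokens("spam.example.com", ip="192.0.2.10")` yields one token for the host, one for the address, and one for the `192.0.2` block.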
Dan DeVoe wrote:
>On Sun, 18 Jan 2004 bugzilla-daemon@bugzilla.spamassassin.org wrote:
>
>I'm sure this has been discussed on the list before, but the problem I
>see with performing DNS lookups on URLs contained in possible spam is that
>a nefarious spammer could simply attach a very large list of URLs which
>would reduce performance or even cause a denial of service as each one was
>looked up in turn. Bad enough problem if only DNSBLs are queried, but
>even worse when each domain is resolved to an IP in ADDITION to the DNSBL
>lookup.
>
>Especially given domains and nameservers under the control of spammers,
>this could be evil evil evil.
Re: [Bug 2948] New: URI tests against DNSBL and RHSBL
Posted by Dan DeVoe <dd...@zeus.netset.com>.
On Sun, 18 Jan 2004 bugzilla-daemon@bugzilla.spamassassin.org wrote:
> http://bugzilla.spamassassin.org/show_bug.cgi?id=2948
>
> Many additional spam e-mails can be caught if the URIs are extracted from the
> message and the FQDN portion is compared against one or more DNSBLs and/or
> RHSBLs. Of course, the FQDN would have to be resolved to an IP address before
> testing against a DNSBL. Some sort of caching and de-duplication would probably
> be needed to prevent excessive queries.
I'm sure this has been discussed on the list before, but the problem I
see with performing DNS lookups on URLs contained in possible spam is that
a nefarious spammer could simply attach a very large list of URLs which
would reduce performance or even cause a denial of service as each one was
looked up in turn. Bad enough problem if only DNSBLs are queried, but
even worse when each domain is resolved to an IP in ADDITION to the DNSBL
lookup.
Especially given domains and nameservers under the control of spammers,
this could be evil evil evil.
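One way to bound the cost described above is a hard cap on new lookups per message combined with a cache shared across messages. The sketch below is only an illustration under stated assumptions: the class name, the default cap of 10, and the injectable resolver are invented here, and it omits query timeouts and cache expiry, both of which a real deployment would need when the nameservers are hostile.

```python
import socket


class LookupBudget:
    """Caps new DNS lookups per message and caches answers across messages."""

    def __init__(self, resolver=socket.gethostbyname, max_lookups=10):
        self.resolver = resolver        # callable: host name -> IPv4 string
        self.max_lookups = max_lookups  # new lookups allowed per message
        self.cache = {}                 # host -> address, or None on failure

    def resolve_all(self, hosts):
        """Resolve the hosts of one message, spending at most max_lookups queries."""
        results = {}
        used = 0
        for host in dict.fromkeys(hosts):  # drop duplicates, keep order
            if host in self.cache:
                results[host] = self.cache[host]
                continue
            if used >= self.max_lookups:
                continue  # budget spent: skip, but still serve cached hosts
            used += 1
            try:
                answer = self.resolver(host)
            except OSError:  # socket.gaierror is a subclass of OSError
                answer = None
            self.cache[host] = answer
            results[host] = answer
        return results
```

With this shape, a message carrying thousands of distinct URLs triggers at most `max_lookups` queries, and failed lookups are cached too, so a repeated bad name is not queried again.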