Posted to users@spamassassin.apache.org by Craig Baird <cr...@xpressweb.com> on 2005/05/05 19:33:51 UTC
URIs being split over multiple lines
Most of my spam that's getting through at this point is stuff that has a URI
with multiple carriage returns in it like this:
<A href="h
ttp://eafbfowksugw.org&ghikk2hnvo32i7d21gun%2Eetn
eanim
bme%2Ecom/">
I know this trick has been discussed. I looked for a bug report, and couldn't
find one on this particular thing. I did find a thread in the archives about
this, and a couple of rules were suggested, but someone mentioned that at
least one of the rules results in a lot of FPs. Is anyone aware of a rule
that will catch these that doesn't trigger a lot of FPs?
Thanks!
Craig
Re: URIs being split over multiple lines
Posted by Martin Hepworth <ma...@solid-state-logic.com>.
Craig
I found SA 3.0.3 correctly spotted this and fed it to surbl.org URI-RBLs
which trapped it.
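
For illustration, here is a minimal sketch of how a URI split this way can be put back together before a blocklist lookup. It is written in Python and is not SpamAssassin's actual parser: rejoin_hrefs and HREF_RE are hypothetical names, and the sketch assumes the line breaks inside the quoted href value are simply dropped, which is roughly what a browser does with them.

```python
import re

# Hypothetical helper for illustration only; not SpamAssassin's URI parser.
# Captures the text between the quotes of an href attribute, line breaks included.
HREF_RE = re.compile(r'href\s*=\s*"([^"]*)"', re.IGNORECASE)

def rejoin_hrefs(html):
    """Return every quoted href value with embedded CR, LF and TAB removed."""
    return [re.sub(r'[\r\n\t]', '', value) for value in HREF_RE.findall(html)]

# The sample from the original post, with the line breaks written as bare CRs.
sample = ('<A href="h\rttp://eafbfowksugw.org&ghikk2hnvo32i7d21gun%2Eetn'
          '\reanim\rbme%2Ecom/">')

joined = rejoin_hrefs(sample)
print(joined)
```

Once the breaks are gone, the value is an ordinary http:// URI again, so its host can be extracted and checked in the usual way.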
--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300
Craig Baird wrote:
> Most of my spam that's getting through at this point is stuff that has a URI
> with multiple carriage returns in it like this:
>
> <A href="h
> ttp://eafbfowksugw.org&ghikk2hnvo32i7d21gun%2Eetn
> eanim
> bme%2Ecom/">
>
> I know this trick has been discussed. I looked for a bug report, and couldn't
> find one on this particular thing. I did find a thread in the archives about
> this, and a couple of rules were suggested, but someone mentioned that at
> least one of the rules results in a lot of FPs. Is anyone aware of a rule
> that will catch these that doesn't trigger a lot of FPs?
>
> Thanks!
>
> Craig
Re: URIs being split over multiple lines
Posted by Loren Wilton <lw...@earthlink.net>.
> Then there's the other problem: rawbody rules seem to act on a
> line-by-line basis,
Yes. By design. :-( (Which I consider broken design.)
> so you can look for /href=h$/ or /^ttp/ but not
> /href=h\nttp/
I'm pretty sure (but not positive) that the rawbody rule might have been
hitting in this case, because the line was broken with cr's rather than
actual newlines. On the other hand, I may well have added the rawbody rule
as a duplicate of the full rule without thinking about it. I tend to
duplicate most of my rawbody or full rules as both, since they will often
hit different cases.
Loren
A quick check of last month's spam indicates several hundred hits on these
rules. In every case it looks like both rules hit.
Re: URIs being split over multiple lines
Posted by Kelson <ke...@speed.net>.
Robert Menschel wrote:
> Best I've seen in a bunch of testing:
> rawbody __LW_URI_CR1 /href=\"[^"]*\r[^\n]/is
> full __LW_URI_CR2 /href=\"[^"]*\r[^\n]/is
> meta LW_URI_CR __LW_URI_CR1 || __LW_URI_CR2
> score LW_URI_CR 2
> describe LW_URI_CR unescaped cr in uri
> #hist LW_URI_CR Loren Wilton
> #counts LW_URI_CR 49s/0h of 292007 corpus (122219s/169788h RM) 04/27/05
>
> Doesn't catch all of them, for reasons I haven't yet figured out, but
> catches some, and no FPs here.
I have yet to get any hits on this one in over a week, despite receiving
several mails that look like they use this pattern. From what I can
tell, either the raw-CR spammers aren't targeting us, or something is
converting them to newlines before SA gets to see it.
Then there's the other problem: rawbody rules seem to act on a
line-by-line basis, so you can look for /href=h$/ or /^ttp/ but not
/href=h\nttp/
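
The line-by-line limitation can be shown with a small sketch in Python. It only models the behaviour as described in this thread and is not SpamAssassin code: match_per_line and match_whole are hypothetical helpers, the body is a made-up sample using example.com, and the patterns include the opening quote that the sample contains.

```python
import re

# Made-up sample body: the href value is split with an LF line break.
body = '<A href="h\nttp://example.com/">'

def match_per_line(pattern, text):
    """Try the pattern against each line separately (assumed rawbody-like)."""
    return any(re.search(pattern, line) for line in text.split('\n'))

def match_whole(pattern, text):
    """Try the pattern once against the whole text (assumed full-like)."""
    return re.search(pattern, text) is not None

end_of_line = match_per_line(r'href="h$', body)          # tail of the first line
start_of_line = match_per_line(r'^ttp', body)            # head of the second line
across_per_line = match_per_line(r'href="h\nttp', body)  # spans the line break
across_whole = match_whole(r'href="h\nttp', body)        # same pattern, whole text

print(end_of_line, start_of_line, across_per_line, across_whole)
```

Under this model the two single-line patterns hit and the pattern containing \n hits only when the text is matched as a whole, because no individual line ever contains the newline.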
--
Kelson Vibber
SpeedGate Communications <www.speed.net>
Re: URIs being split over multiple lines
Posted by Robert Menschel <Ro...@Menschel.net>.
Hello Craig,
Thursday, May 5, 2005, 10:33:51 AM, you wrote:
CB> Most of my spam that's getting through at this point is stuff that has a URI
CB> with multiple carriage returns in it like this:
CB> <A href="h
CB> ttp://eafbfowksugw.org&ghikk2hnvo32i7d21gun%2Eetn
CB> eanim
CB> bme%2Ecom/">
CB> I know this trick has been discussed. I looked for a bug report, and couldn't
CB> find one on this particular thing. I did find a thread in the archives about
CB> this, and a couple of rules were suggested, but someone mentioned that at
CB> least one of the rules results in a lot of FPs. Is anyone aware of a rule
CB> that will catch these that doesn't trigger a lot of FPs?
Best I've seen in a bunch of testing:
rawbody __LW_URI_CR1 /href=\"[^"]*\r[^\n]/is
full __LW_URI_CR2 /href=\"[^"]*\r[^\n]/is
meta LW_URI_CR __LW_URI_CR1 || __LW_URI_CR2
score LW_URI_CR 2
describe LW_URI_CR unescaped cr in uri
#hist LW_URI_CR Loren Wilton
#counts LW_URI_CR 49s/0h of 292007 corpus (122219s/169788h RM) 04/27/05
Doesn't catch all of them, for reasons I haven't yet figured out, but
catches some, and no FPs here.
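
To see what the pattern in these rules does and does not match, here is a rough sketch that translates the regex to Python and runs it against three made-up samples. The sample strings and the names RULE and hits are assumptions for illustration; only the regex itself comes from the rules above.

```python
import re

# The regex from the rules above, translated from Perl /.../is to Python flags.
RULE = re.compile(r'href="[^"]*\r[^\n]', re.IGNORECASE | re.DOTALL)

# Made-up samples: the same split href with three different line endings.
samples = {
    'bare CR': '<A href="h\rttp://example.com/">',
    'CRLF': '<A href="h\r\nttp://example.com/">',
    'LF only': '<A href="h\nttp://example.com/">',
}

# True where the rule pattern matches the sample.
hits = {name: RULE.search(text) is not None for name, text in samples.items()}

for name, hit in hits.items():
    print(name, hit)
```

Only the bare-CR sample matches, since the pattern requires a CR that is not followed by a newline. If something converts the CRs to CRLF or LF before SpamAssassin sees the message, these rules would not fire, which may account for some of the misses.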
Bob Menschel
Re: URIs being split over multiple lines
Posted by Loren Wilton <lw...@earthlink.net>.
> Most of my spam that's getting through at this point is stuff that has a URI
> with multiple carriage returns in it like this:
>
> I know this trick has been discussed. I looked for a bug report, and couldn't
> find one on this particular thing. I did find a thread in the archives about
There was a bug report, I filed one. Don't know current status, it may have
been closed.
I think we may have a new SARE rule to catch this sort of thing. I've
been using a simple rule here that hasn't FP'ed. But then, I don't get a
huge amount of mail or spam. (Although spam outnumbers the real mail by
approximately 100 to 1.)
Loren