Posted to users@spamassassin.apache.org by Celene <ce...@sandydproductions.com> on 2013/07/06 21:24:52 UTC

Re: Spamassassin with single link in body

On 6/25/2013 4:47 PM, John Hardin wrote:
> On Tue, 25 Jun 2013, Celene wrote:
>
>> I am currently getting lots of messages with just a single url in them.
>> Is there a way for spamassassin to match those?
>>
>> Example: http://pastebin.com/UZtzfyEs
>
> I've added a test rule for this to my sandbox, we'll see what the
> masscheck corpus thinks of it.
>
> Don't expect too much, though. A one-line body containing a URI is
> very little data upon which to make a decision. That's a very common
> and legitimate practice: people will dash off a quick email to a
> friend to tell them about a web page they found with no more than
> "Check this out!" and the URI.
>
> Maybe it would be useful with FP avoidance exclusions, but I don't
> have high hopes. At best it might add another point and push a spam
> over the top without too badly affecting hams. It may be too much to
> hope that whitelisting or AWL would compensate for FPs from this.
>
Sorry about the late response. It took some time to get my IP address off
the SORBS DUHL list...

To be honest, I have never gotten any emails from people with only a
URL, unless they are spam, so this shouldn't be a problem. I just want
to match all emails that have a single link in the body.

Re: SOLVED: Spamassassin with single link in body

Posted by Martin Gregorie <ma...@gregorie.org>.
On Wed, 2013-07-10 at 22:37 -0700, Celene wrote:
> On 7/6/2013 2:07 PM, Martin Gregorie wrote:
> > On Sat, 2013-07-06 at 12:24 -0700, Celene wrote:
> >> To be honest, I have never gotten any emails from people with only a
> >> URL, unless they are spam, so this shouldn't be a problem. I just want
> >> to match all emails that have a single link in the body
> > I'm getting reasonable results from this:
> >
> > rawbody  MG_BARE_URI  /^\s{0,10}(http:|www\.)\S{1,70}\s{0,10}$/i
> >
> I've been testing this, and it seems to have the effect I want.
> 
Good.

FWIW, I can probably get away with more dangerous rules than a lot of
people on this list partly because my MTA serves a small closed group of
users and partly because I use an auto-whitelister. This is an SA rule
and associated module that queries my Postgres-based mail archive:
senders of incoming mail are compared with recipients of previously sent
outgoing mail and matches are white-listed.
  
Martin





SOLVED: Spamassassin with single link in body

Posted by Celene <ce...@sandydproductions.com>.
On 7/6/2013 2:07 PM, Martin Gregorie wrote:
> On Sat, 2013-07-06 at 12:24 -0700, Celene wrote:
>> To be honest, I have never gotten any emails from people with only a
>> URL, unless they are spam, so this shouldn't be a problem. I just want
>> to match all emails that have a single link in the body
> I'm getting reasonable results from this:
>
> rawbody  MG_BARE_URI  /^\s{0,10}(http:|www\.)\S{1,70}\s{0,10}$/i
>
>
> though I'm using it as part of a meta-rule that applies further checks
> if and only if this rule fires. It seems to be sufficiently specific:
> when I ran it against the spam collection I use for regression testing
> rules (849 spams) it only fired on the ones I expected it to recognise.
>
>
> Martin
>
>  
>
>
I've been testing this, and it seems to have the effect I want.

Thanks!
Celene

Re: Spamassassin with single link in body

Posted by Martin Gregorie <ma...@gregorie.org>.
On Sat, 2013-07-06 at 12:24 -0700, Celene wrote:
> To be honest, I have never gotten any emails from people with only a
> URL, unless they are spam, so this shouldn't be a problem. I just want
> to match all emails that have a single link in the body

I'm getting reasonable results from this:

rawbody  MG_BARE_URI  /^\s{0,10}(http:|www\.)\S{1,70}\s{0,10}$/i


though I'm using it as part of a meta-rule that applies further checks
if and only if this rule fires. It seems to be sufficiently specific:
when I ran it against the spam collection I use for regression testing
rules (849 spams) it only fired on the ones I expected it to recognise.
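
For readers who want to try the meta-rule arrangement described above, here
is a minimal sketch. It is not the actual meta-rule from this post, which was
not shown. The names __MG_BARE_URI, __MG_SHORT_SUBJ and MG_BARE_URI_META, the
short-subject check and the 0.5 score are all assumptions made up for
illustration. The pattern is also widened to accept https:, which the
original (http:|www\.) alternation does not match.

```
# Sub-rule: the double-underscore prefix means it carries no score itself.
# Assumption: https: is accepted in addition to http: and www.
rawbody   __MG_BARE_URI     /^\s{0,10}(?:https?:|www\.)\S{1,70}\s{0,10}$/i

# Hypothetical "further check": the Subject is at most 20 characters long.
header    __MG_SHORT_SUBJ   Subject =~ /^.{0,20}$/

# The scored meta-rule needs both sub-rules to hit.
meta      MG_BARE_URI_META  (__MG_BARE_URI && __MG_SHORT_SUBJ)
describe  MG_BARE_URI_META  Bare URI line plus short subject
score     MG_BARE_URI_META  0.5
```

Keeping the bare-URI test as an unscored sub-rule means it contributes to the
score only when the additional checks agree, which is the point of gating it
behind a meta-rule.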


Martin

 



Re: Spamassassin with single link in body

Posted by Axb <ax...@gmail.com>.
On 07/06/2013 10:39 PM, Rick Cooper wrote:
> Benny Pedersen wrote:
>> Celene skrev den 2013-07-06 21:24:
>>
>>>>> Example: http://pastebin.com/UZtzfyEs
>>
>>> To be honest, I have never gotten any emails from people with only a
>>> URL, unless they are spam, so this shouldn't be a problem. I just
>>> want to match all emails that have a single link in the body
>>
>> uri __HAS_URL /./
>> meta LESS_THEN_2_URL (__HAS_URL < 2)
>> describe LESS_THEN_2_URL Meta: have one single url
>> score LESS_THEN_2_URL 0.1
>>
>> untested
>
> This would match a message that had a body full of text that was legit but
> only had one link, I believe the OP wants to match a body that is blank
> except for a single url and I found no way to do that without writing a
> custom plugin

*SA 3.4* includes the magic to do stuff like

body           BODY_LENGTH_1o0        eval:check_body_length('100')

meta that with Benny's "LESS_THEN_2_URL" and bingo!
(not sure if one could use an if condition to make it even cheaper to run.)
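
A rough sketch of that combination, in case it helps. The rule names and the
0.1 score are placeholders. It assumes that check_body_length fires when the
body is no longer than the given size; please check the BodyEval plugin
documentation for the exact semantics before relying on it.

```
# Only define these rules on SA 3.4.0 or later, where the eval exists.
if (version >= 3.004000)
  # Assumption: hits when the message body is at most 100 bytes.
  body      __BODY_LEN_LE_100   eval:check_body_length('100')

  # Hits when the message contains at least one URI.
  uri       __HAS_ANY_URL       /./

  meta      SHORT_BODY_HAS_URL  (__BODY_LEN_LE_100 && __HAS_ANY_URL)
  describe  SHORT_BODY_HAS_URL  Very short body that contains a URI
  score     SHORT_BODY_HAS_URL  0.1
endif
```

This does not count the URIs; it only requires a short body with at least one
link in it.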



RE: Spamassassin with single link in body

Posted by Rick Cooper <rc...@dwford.com>.
Benny Pedersen wrote:
> Celene skrev den 2013-07-06 21:24:
> 
>>>> Example: http://pastebin.com/UZtzfyEs
> 
>> To be honest, I have never gotten any emails from people with only a
>> URL, unless they are spam, so this shouldn't be a problem. I just
>> want to match all emails that have a single link in the body
> 
> uri __HAS_URL /./
> meta LESS_THEN_2_URL (__HAS_URL < 2)
> describe LESS_THEN_2_URL Meta: have one single url
> score LESS_THEN_2_URL 0.1
> 
> untested

This would match a message that had a body full of legitimate text but
only one link. I believe the OP wants to match a body that is blank
except for a single URL, and I found no way to do that without writing a
custom plugin.

Rick


Re: Spamassassin with single link in body

Posted by Benny Pedersen <me...@junc.eu>.
Celene skrev den 2013-07-06 21:24:

>>> Example: http://pastebin.com/UZtzfyEs

> To be honest, I have never gotten any emails from people with only a
> URL, unless they are spam, so this shouldn't be a problem. I just want
> to match all emails that have a single link in the body

uri __HAS_URL /./
meta LESS_THEN_2_URL (__HAS_URL < 2)
describe LESS_THEN_2_URL Meta: have one single url
score LESS_THEN_2_URL 0.1

untested
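
Two things to watch in the rule above, as far as I understand SA's meta
arithmetic: without "tflags multiple" a sub-rule only ever evaluates to 0 or
1, so (__HAS_URL < 2) is always true, including for mail with no URL at all.
A variant that actually counts is sketched below. EXACTLY_ONE_URL is a
made-up name, and this is just as untested as the original. Note that SA may
add normalised variants of a link to its URI list, so a single visible link
could conceivably be counted more than once.

```
# Anchored pattern: at most one match per URI in SA's URI list.
uri       __HAS_URL        /^./
# Count every hit instead of stopping at the first one.
tflags    __HAS_URL        multiple

# True only when exactly one URI was counted.
meta      EXACTLY_ONE_URL  (__HAS_URL == 1)
describe  EXACTLY_ONE_URL  Message contains exactly one URI
score     EXACTLY_ONE_URL  0.1
```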

-- 
senders that put my email into body content will deliver it to my own 
trashcan, so if you like to get reply, dont do it

Re: Spamassassin with single link in body

Posted by RW <rw...@googlemail.com>.
On Sat, 06 Jul 2013 20:44:11 -0700
Dave Warren wrote:

> On 2013-07-06 12:24, Celene wrote:
> > To be honest, I have never gotten any emails from people with only a
> > URL, unless they are spam, so this shouldn't be a problem.
> 
> While this might work for you personally, I would encourage you to
> avoid applying this type of rule in the general case.
> 
> I regularly send emails to people with a link so they can review the 
> site we're discussing while we're on the phone. Since we're already
> on the phone, no additional context is needed, and therefore the
> email might not include any.
> 


This isn't really an argument against having such a rule; we already
have MISSING_SUBJECT (1.8 points), which is a similar sort of thing.
The important thing is whether there's a score that improves spam
detection without the inevitable rule FPs translating into a
significant number of classification FPs. I would suggest that if
anyone implements such a rule, it shouldn't overlap with
MISSING_SUBJECT, and there should be an attachment test too.
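
One possible shape for that, as a sketch only. It reads the "attachment
test" as an exclusion, which is an assumption, and the names __BARE_URI_LINE,
__HAS_ATTACHMENT and BARE_URI_NO_ATTACH are invented for illustration. The
mimeheader line needs the MIMEHeader plugin to be loaded, and the score is a
placeholder.

```
# Bare-URI line test, as posted elsewhere in this thread.
rawbody     __BARE_URI_LINE     /^\s{0,10}(http:|www\.)\S{1,70}\s{0,10}$/i

# Assumption: any MIME part marked as an attachment counts.
mimeheader  __HAS_ATTACHMENT    Content-Disposition =~ /attachment/i

# Do not fire on mail that MISSING_SUBJECT already penalises,
# or on mail that carries an attachment.
meta        BARE_URI_NO_ATTACH  (__BARE_URI_LINE && !MISSING_SUBJECT && !__HAS_ATTACHMENT)
describe    BARE_URI_NO_ATTACH  Bare URI line, has subject, no attachment
score       BARE_URI_NO_ATTACH  0.5
```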









Re: Spamassassin with single link in body

Posted by Dave Warren <da...@hireahit.com>.
On 2013-07-06 12:24, Celene wrote:
> To be honest, I have never gotten any emails from people with only a
> URL, unless they are spam, so this shouldn't be a problem.

While this might work for you personally, I would encourage you to avoid 
applying this type of rule in the general case.

I regularly send emails to people with a link so they can review the 
site we're discussing while we're on the phone. Since we're already on 
the phone, no additional context is needed, and therefore the email 
might not include any.

Admittedly, corporate signatures probably keep this from having too large an 
effect in the professional world, but I've sent emails to my mom that are 
just like this, and have received similar ones when working with people on 
an existing issue.


-- 
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren