Posted to dev@spamassassin.apache.org by bu...@issues.apache.org on 2010/04/13 21:58:23 UTC

[Bug 6406] New: FROM_STARTS_WITH_NUMS matches on text-to-email

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6406

           Summary: FROM_STARTS_WITH_NUMS matches on text-to-email
           Product: Spamassassin
           Version: 3.3.1
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Rules
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: jason@i6ix.com


From a thread on the users list of the same name, this rule is matching on
messages sent from cell phones via text-to-email.  Specifically, the From:
address is crafted in the format [phone_number]@carrier.com.  Martin Gregorie
has crafted a new rule that appears to meet the intent of the original rule,
but does not match when the From: address is all numbers.

describe MG_STARTSWITH_NUM  Sender address starts with numbers
header   MG_STARTSWITH_NUM  From =~ /\d{6,}[a-z._-][a-z0-9._-]{0,50}@/i


Can someone comment on the old rule, the new rule, or put this in their sandbox
for masschecks?
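
A minimal sketch of the behavior described above, using Python's `re` module as a stand-in for the Perl regex engine that SpamAssassin uses. The old rule's pattern is the one quoted later in this bug; the sample addresses and the `hits` helper are hypothetical, and the patterns are applied to a bare address rather than a full From: header.

```python
import re

# Pattern of the existing rule (quoted later in this bug) and of the proposal.
# The Perl "\@" is written as a plain "@" here.
OLD_RULE = re.compile(r"^\d{6,}\S+@", re.IGNORECASE)
MG_RULE = re.compile(r"\d{6,}[a-z._-][a-z0-9._-]{0,50}@", re.IGNORECASE)

# Hypothetical sample addresses, not taken from real traffic.
text_to_email = "5551234567@carrier.example"  # local part is all digits
spammy = "123456abc@example.com"              # digits followed by letters


def hits(rule, address):
    """Return True if the rule's pattern is found anywhere in the address."""
    return rule.search(address) is not None


print(hits(OLD_RULE, text_to_email))  # True: the reported false positive
print(hits(MG_RULE, text_to_email))   # False: all-digit local part is skipped
print(hits(OLD_RULE, spammy))         # True
print(hits(MG_RULE, spammy))          # True
```

The old pattern hits the all-digit address because `\d{6,}` can give back its last digit to satisfy `\S+` before the `@`.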

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6406

--- Comment #1 from Karsten Bräckelmann <gu...@rudersport.de> 2010-04-13 16:57:43 EDT ---
(In reply to comment #0)
> header   MG_STARTSWITH_NUM  From =~ /\d{6,}[a-z._-][a-z0-9._-]{0,50}@/i

Why not limit it to the address part only? This feels overly complicated
anyway, to express "starts with numbers, but does not consist exclusively of
numbers". Oh, and it isn't anchored, so it also matches addresses that do not
*start* with numbers...

  From:addr =~ /^\d{2,50}\D/

What about this? I haven't read most of the thread (yet), nor thought about the
RE for too long... Take with some salt.
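
The anchoring point can be checked with a short sketch. It assumes Python's `re` behaves like the Perl engine for these patterns and uses a made-up address that does not start with a digit.

```python
import re

# Proposed rule from comment #0 (unanchored) and the anchored alternative.
MG_RULE = re.compile(r"\d{6,}[a-z._-][a-z0-9._-]{0,50}@", re.IGNORECASE)
ANCHORED = re.compile(r"^\d{2,50}\D")

# Hypothetical address: the digits sit in the middle of the local part.
address = "joe123456x@example.com"

unanchored_hit = MG_RULE.search(address) is not None
anchored_hit = ANCHORED.search(address) is not None

print(unanchored_hit)  # True: matches although the address starts with "joe"
print(anchored_hit)    # False: "^" requires the digits at the very start
```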


[Bug 6406] [review] FROM_STARTS_WITH_NUMS matches on text-to-email


Kevin A. McGrail <km...@pccc.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kmcgrail@pccc.com
  Status Whiteboard|needs 1 vote                |ready to commit

--- Comment #7 from Kevin A. McGrail <km...@pccc.com> 2010-06-16 12:44:16 EDT ---
(In reply to comment #6)
> Created an attachment (id=4780)
> --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4780)
> proposed rule change
> 
> Attached is the proposed rule change in 20_head_tests.cf :
> 
> -header FROM_STARTS_WITH_NUMS   From:addr =~ /^\d{6,}\S+\@/i
> -describe FROM_STARTS_WITH_NUMS From: starts with many numbers
> +header FROM_STARTS_WITH_NUMS   From:addr =~ /^\d{3,50}[^0-9\@]/
> +describe FROM_STARTS_WITH_NUMS From: starts with several numbers
> 
> Please vote.

+1


[Bug 6406] [review] FROM_STARTS_WITH_NUMS matches on text-to-email


Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|FROM_STARTS_WITH_NUMS       |[review]
                   |matches on text-to-email    |FROM_STARTS_WITH_NUMS
                   |                            |matches on text-to-email
  Status Whiteboard|                            |needs 1 vote


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


--- Comment #5 from Karsten Bräckelmann <gu...@rudersport.de> 2010-06-16 12:30:22 EDT ---
(In reply to comment #4)
> > That looks fine. I wonder if there is any deeper meaning
> > with requiring at least 6 leading digits in the original rule.

Frankly, no idea either.

> Based on that, my choice would be 3 digits:
> header FROM_STARTS_WITH_NUMS  From:addr =~ /^\d{3,50}[^0-9\@]/

+1  Sure, looks good to me. Using a minimum of 2 digits in my example rule was
made up out of thin air anyway. :)


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


--- Comment #4 from Mark Martinec <Ma...@ijs.si> 2010-06-16 09:11:25 EDT ---
(In reply to comment #3)
> < From:addr =~ /^\d{6,}\S+\@/i
> > From:addr =~ /^\d{2,50}[^0-9@]/
> 
> That looks fine. I wonder if there is any deeper meaning
> with requiring at least 6 leading digits in the original rule.

Trying to answer my own question, I prepared the following rules
and let them run on our regular mail traffic for the last 8 days:

header L_FROM_STARTS_WITH_NUMS_1  From:addr =~ /^\d{1,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_2  From:addr =~ /^\d{2,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_3  From:addr =~ /^\d{3,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_4  From:addr =~ /^\d{4,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_5  From:addr =~ /^\d{5,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_6  From:addr =~ /^\d{6,50}[^0-9\@]/
header L_FROM_STARTS_WITH_NUMS_7  From:addr =~ /^\d{7,50}[^0-9\@]/

This is the result:

digits ham spam   S/O %
  1+    76  393   83.8
  2+    57  263   82.2
  3+     4  183   97.9
  4+     3   45   93.8
  5+     0   24  100
  6+     0   19  100
  7+     0   16  100

Based on that, my choice would be 3 digits:

header FROM_STARTS_WITH_NUMS  From:addr =~ /^\d{3,50}[^0-9\@]/
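
The S/O column in the table is consistent with spam / (ham + spam), expressed as a percentage. That formula is inferred from the numbers, not stated in the comment; a minimal sketch that recomputes it from the reported counts (the names `RESULTS` and `spam_ratio` are made up for illustration):

```python
# (minimum leading digits, ham hits, spam hits) as reported in the table
RESULTS = [
    (1, 76, 393),
    (2, 57, 263),
    (3, 4, 183),
    (4, 3, 45),
    (5, 0, 24),
    (6, 0, 19),
    (7, 0, 16),
]


def spam_ratio(ham, spam):
    """Share of hits that were spam, in percent (assumed meaning of S/O)."""
    return 100.0 * spam / (ham + spam)


for digits, ham, spam in RESULTS:
    print(f"{digits}+ {ham:5d} {spam:5d} {spam_ratio(ham, spam):7.1f}")
```

Going from 2 to 3 digits drops ham hits from 57 to 4 while keeping 183 of 263 spam hits, which is the trade-off behind choosing 3.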


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


--- Comment #3 from Mark Martinec <Ma...@ijs.si> 2010-06-04 09:56:42 EDT ---
< From:addr =~ /^\d{6,}\S+\@/i
> From:addr =~ /^\d{2,50}[^0-9@]/

That looks fine. I wonder if there is any deeper meaning
with requiring at least 6 leading digits in the original rule.


[Bug 6406] [review] FROM_STARTS_WITH_NUMS matches on text-to-email


Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #8 from Mark Martinec <Ma...@ijs.si> 2010-06-16 12:59:42 EDT ---
thanks!

  Bug 6406: FROM_STARTS_WITH_NUMS matches on text-to-email
Sending rules/20_head_tests.cf
Committed revision 955300.


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


--- Comment #6 from Mark Martinec <Ma...@ijs.si> 2010-06-16 12:40:08 EDT ---
Created an attachment (id=4780)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4780)
proposed rule change

Attached is the proposed rule change in 20_head_tests.cf :

-header FROM_STARTS_WITH_NUMS   From:addr =~ /^\d{6,}\S+\@/i
-describe FROM_STARTS_WITH_NUMS From: starts with many numbers
+header FROM_STARTS_WITH_NUMS   From:addr =~ /^\d{3,50}[^0-9\@]/
+describe FROM_STARTS_WITH_NUMS From: starts with several numbers

Please vote.
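
For reviewers, a rough sketch of how the two patterns in the diff differ. It assumes Python's `re` as a stand-in for the Perl engine, bare addresses as `From:addr` would supply them, and made-up sample addresses.

```python
import re

# Removed (-) and added (+) patterns from the diff; "\@" is written as "@".
OLD = re.compile(r"^\d{6,}\S+@", re.IGNORECASE)
NEW = re.compile(r"^\d{3,50}[^0-9@]")

# Hypothetical sample addresses.
SAMPLES = [
    "5551234567@carrier.example",  # text-to-email style, all-digit local part
    "1234567x@example.com",        # seven leading digits, then a letter
    "123abc@example.com",          # three leading digits, then letters
    "12abc@example.com",           # only two leading digits
]

# Map each address to (old rule hits, new rule hits).
verdicts = {
    addr: (OLD.search(addr) is not None, NEW.search(addr) is not None)
    for addr in SAMPLES
}

for addr, (old_hit, new_hit) in verdicts.items():
    print(f"{addr:28} old={old_hit!s:5} new={new_hit}")
```

Under these assumptions the new pattern stops hitting all-digit local parts and starts hitting addresses with as few as three leading digits.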


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|Undefined                   |3.3.2


[Bug 6406] FROM_STARTS_WITH_NUMS matches on text-to-email


--- Comment #2 from Karsten Bräckelmann <gu...@rudersport.de> 2010-04-13 17:08:49 EDT ---
Blarg. Forgot to code into the RE what I was thinking about: "Do not match the
@ when checking for a non-digit, to avoid exactly this issue." This one does:

  From:addr =~ /^\d{2,50}[^0-9@]/
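
A small sketch of why the `\D` from comment #1 falls short: `\D` also matches the `@`, so an all-digit local part still hits. Python's `re` is assumed to behave like the Perl engine here, and the addresses are made up.

```python
import re

FIRST_TRY = re.compile(r"^\d{2,50}\D")       # pattern from comment #1
CORRECTED = re.compile(r"^\d{2,50}[^0-9@]")  # pattern from this comment

# Hypothetical sample addresses.
all_digits = "5551234567@carrier.example"
digits_then_text = "12abc@example.com"

print(FIRST_TRY.search(all_digits) is not None)        # True: "\D" matches "@"
print(CORRECTED.search(all_digits) is not None)        # False
print(FIRST_TRY.search(digits_then_text) is not None)  # True
print(CORRECTED.search(digits_then_text) is not None)  # True
```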
