Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2007/08/24 13:51:05 UTC

[Bug 5624] New: Add --no-uri switch to seek-phrases-in-corpus

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624

           Summary: Add --no-uri switch to seek-phrases-in-corpus
           Product: Spamassassin
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Masses
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: alex.uribl@gmail.com


Looking at JM's SOUGHT rules, there may be rules which include URIs & become
redundant with URIBL listings & may be a waste to have them in static form.

This may also be a good addition for seek-phrases-in-log.

Comments?



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

[Bug 5624] Add --no-uri switch to seek-phrases-in-corpus

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624





------- Additional Comments From jm@jmason.org  2007-08-24 09:42 -------
> IMO, nowadays, most "safe" to list URL'd spam runs usually last less than 4 hrs
> and I'd expect at least 4 URIBL zones to catch them faster than any .cf rule
> which is why I'd prefer not to include any URI in autogen'd rulesets.

not all URIs found in spam patterns would be in URIBL, I would guess.  I have to
make a guess, since you guys don't publish your listing criteria, so who knows ;)




[Bug 5624] Add --no-uri switch to seek-phrases-in-corpus

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624


jm@jmason.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX




------- Additional Comments From jm@jmason.org  2007-08-26 10:37 -------
(In reply to comment #4)
> (In reply to comment #3)
> > not all URIs found in spam patterns would be in URIBL, I would guess.  I have to
> > make a guess, since you guys don't publish your listing criteria, so who knows ;)
> 
> 1. not too good a guess.  SURBL/URIBL.com probably list more than many ppl may
> approve of.
>
> 2. No published listing criteria?
> http://www.surbl.org/lists.html
> http://www.uribl.com/about.shtml

ah, ok, missed that ;)

> If this request is useless - just "Wont Fix" it - I'll weed out the URI rules in
> the results. Just thought others could find it a useful feature as well.

I'd prefer not to fix it.  The thing is, those URLs could be anything --
including nonspam URLs -- the important thing is the rest of the string, "see
Attach" or whatever it is.

also I'd prefer not to have to perform lookups to vet out the bad URIs.




[Bug 5624] Add --no-uri switch to seek-phrases-in-corpus

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624





------- Additional Comments From alex.uribl@gmail.com  2007-08-24 08:48 -------
(In reply to comment #1)
> (In reply to comment #0)
> > Looking at JM's SOUGHT rules, there may be rules which include URIs & become
> > redundant with URIBL listings & may be a waste to have them in static form.
> 
> well, they don't include *just* the URI:
> 
> body __SEEK_MFY7BF  /See attach\. http:\/\/www\.[url]\.net\//
> 
> note the "See attach\. " beforehand, which is very unusual.
> 
> also, they aren't static -- they're regenerated daily...

asking: will my request be trashed?

IMO, nowadays, most "safe" to list URL'd spam runs usually last less than 4 hrs
and I'd expect at least 4 URIBL zones to catch them faster than any .cf rule
which is why I'd prefer not to include any URI in autogen'd rulesets.

If this requires an Amazon type bribe, pls data mail directly :-)

Alex






[Bug 5624] Add --no-uri switch to seek-phrases-in-corpus

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624





------- Additional Comments From alex.uribl@gmail.com  2007-08-24 10:30 -------
(In reply to comment #3)
> not all URIs found in spam patterns would be in URIBL, I would guess.  I have to
> make a guess, since you guys don't publish your listing criteria, so who knows ;)

1. not too good a guess.  SURBL/URIBL.com probably list more than many ppl may
approve of.

2. No published listing criteria?
http://www.surbl.org/lists.html
http://www.uribl.com/about.shtml

If this request is useless - just "Wont Fix" it - I'll weed out the URI rules in
the results. Just thought others could find it a useful feature as well.
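
For illustration only, weeding URI rules out of the results could look
roughly like the sketch below. It assumes the generated rules are plain
"body NAME /regex/" lines like the __SEEK_MFY7BF example quoted in this
thread. The function name drop_uri_rules, the URI_IN_PATTERN regex, and the
sample rules are hypothetical and are not part of seek-phrases-in-corpus.

```python
import re

# Assumption: a "URI rule" is any body rule whose regex embeds an http(s)
# scheme, written either escaped ("http:\/\/") or unescaped ("http://").
URI_IN_PATTERN = re.compile(r"https?:(?:\\?/){2}", re.IGNORECASE)


def drop_uri_rules(lines):
    """Return the rule lines whose pattern contains no http(s) URI."""
    kept = []
    for line in lines:
        is_body_rule = line.lstrip().startswith("body ")
        if is_body_rule and URI_IN_PATTERN.search(line):
            continue  # skip body rules that embed a URI
        kept.append(line)  # keep everything else unchanged
    return kept


if __name__ == "__main__":
    # Hypothetical sample rules, not real SOUGHT output.
    sample = [
        r"body __SEEK_EXAMPLE1  /See attach\. http:\/\/www\.example\.net\//",
        r"body __SEEK_EXAMPLE2  /some unusual phrase/",
    ]
    for rule in drop_uri_rules(sample):
        print(rule)
```

This drops the whole rule, including the non-URI part of the phrase, and it
does not touch any other rule that might refer to a dropped name; that would
need handling separately.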






[Bug 5624] Add --no-uri switch to seek-phrases-in-corpus

Posted by bu...@bugzilla.spamassassin.org.
http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5624





------- Additional Comments From jm@jmason.org  2007-08-24 07:22 -------
(In reply to comment #0)
> Looking at JM's SOUGHT rules, there may be rules which include URIs & become
> redundant with URIBL listings & may be a waste to have them in static form.

well, they don't include *just* the URI:

body __SEEK_MFY7BF  /See attach\. http:\/\/www\.[url]\.net\//

note the "See attach\. " beforehand, which is very unusual.

also, they aren't static -- they're regenerated daily...


