Posted to users@spamassassin.apache.org by Joseph Acquisto <jo...@j4computers.com> on 2012/05/16 23:05:23 UTC
regex needed for http link
I have been unsuccessful creating a rule to detect and weight http links in message body, such as this one below:
http://boguslink.ru
The ones I have created get "hits" when tested on the command line, but don't seem to work in local.cf. Maybe that's the wrong place?
Re: ***Possible SPAM*** Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/17/2012 at 6:16 PM, John Hardin <jh...@impsec.org> wrote:
> On Thu, 17 May 2012, Joseph Acquisto wrote:
>
>> I attempted to adapt something from a similar regex provided by a vendor
>> of a commercial product. It was to detect country codes we do not want
>> to accept mail from. No doubt my ignorance of SA and regex in general
>> will be on display for the amusement of many.
>>
>> rawbody URI_RU m,^https?://[^.\.][ru]/,i
>
> heh. Yeah, that won't work. "[]" means a character class: it matches one
> character from the set listed inside the square brackets, and a leading
> "^" negates the set.
>
> What the above RE says is:
>
> blah blah blah // (any one character except a period) (r OR u) /
>
> ...so it would match, for example:
>
> https://xr/
> https://xu/
>
> but never:
>
> https://{anything}.ru/
>
> And you actually had success testing that from the command line?
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
I believe so. It was weeks ago that I did that (then commented it out, intending to get back to it).
I won't be able to focus on this for a while. I forgot we are having a social gathering tonight.
Sigh. Sometimes that sort of thing has to happen.
joe a.
Re: ***Possible SPAM*** Re: regex needed for http link
Posted by John Hardin <jh...@impsec.org>.
On Thu, 17 May 2012, Joseph Acquisto wrote:
> I attempted to adapt something from a similar regex provided by a vendor
> of a commercial product. It was to detect country codes we do not want
> to accept mail from. No doubt my ignorance of SA and regex in general
> will be on display for the amusement of many.
>
> rawbody URI_RU m,^https?://[^.\.][ru]/,i
heh. Yeah, that won't work. "[]" means a character class: it matches one
character from the set listed inside the square brackets, and a leading
"^" negates the set.
What the above RE says is:
blah blah blah // (any one character except a period) (r OR u) /
...so it would match, for example:
https://xr/
https://xu/
but never:
https://{anything}.ru/
And you actually had success testing that from the command line?
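For anyone following along, a quick way to see what that pattern accepts is to run it against a few strings. This is a minimal sketch in Python rather than SpamAssassin itself; it assumes Python's re module treats these particular constructs (bracket classes, the ^ anchor, the i flag) the same way Perl does. The sample lines and the helper name original_hits are made up for illustration. A leading ^ inside the brackets negates the whole class, so [^.\.] accepts any single character other than a period.

```python
import re

# The pattern from the rawbody rule quoted in this thread, unchanged.
ORIGINAL = re.compile(r"^https?://[^.\.][ru]/", re.IGNORECASE)


def original_hits(line: str) -> bool:
    """Return True if the original pattern matches the given line."""
    return ORIGINAL.search(line) is not None


# Hypothetical sample lines, for illustration only.
samples = [
    "http://boguslink.ru",   # the link the rule was meant to catch
    "http://boguslink.ru/",  # same link with a trailing slash
    "https://xr/",           # one non-period character, then r, then a slash
    "https://.r/",           # a period in the position the class excludes
]

for line in samples:
    print(f"{line!r}: {original_hits(line)}")
```

Only the third sample matches, which suggests the rule could never fire on a real .ru hostname.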
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
***Possible SPAM*** Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/17/2012 at 9:55 AM, John Hardin <jh...@impsec.org> wrote:
> On Wed, 16 May 2012, Joseph Acquisto wrote:
>
>>>>> On 5/16/2012 at 8:53 PM, "Joseph Acquisto" <jo...@j4computers.com> wrote:
>>>>>> On 5/16/2012 at 5:18 PM, Brent Gardner <bg...@gmail.com> wrote:
>>>>
>>>> How about:
>>>>
>>>> /\.ru\b/i
>>>
>>> I will give that a try.
>>
>> That worked. But I imagine it may trigger on innocuous instances of .ru as
>> well, so it should also include check for http:// and wildcard for domain.
>
> What were you doing that _didn't_ detect that? The "proper" way is this:
>
> uri URI_DOT_RU /\.ru\b/i
>
> ...and let the body parser figure out the "link" context.
>
> Is there some reason that won't work?
>
> Could you post the rule you were originally using?
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
>
I attempted to adapt something from a similar regex provided by a vendor
of a commercial product. It was to detect country codes we do not want
to accept mail from. No doubt my ignorance of SA and regex in general
will be on display for the amusement of many.
rawbody URI_RU m,^https?://[^.\.][ru]/,i
joe a.
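Since the goal is to cover several country codes, one option is to generate the local.cf stanzas rather than hand-write each regex. A minimal sketch in Python: uri, describe and score are standard SpamAssassin rule directives, but the helper name uri_rule, the URI_DOT_XX naming scheme, the description text and the default score of 1.0 are all assumptions for illustration. The \.xx\b pattern is the one suggested elsewhere in this thread.

```python
def uri_rule(tld: str, score: float = 1.0) -> str:
    """Build a three-line SpamAssassin stanza for one top-level domain.

    The rule name, description and default score are illustrative
    assumptions, not values taken from any shipped ruleset.
    """
    tld = tld.lower().lstrip(".")
    if not tld.isalpha():
        raise ValueError(f"not a plain alphabetic TLD: {tld!r}")
    name = f"URI_DOT_{tld.upper()}"
    return "\n".join([
        f"uri       {name}  /\\.{tld}\\b/i",
        f"describe  {name}  URI contains .{tld}",
        f"score     {name}  {score}",
    ])


if __name__ == "__main__":
    # Hypothetical list of unwanted country codes.
    for code in ["ru", "xx"]:
        print(uri_rule(code))
        print()
```

The output can be pasted into local.cf; the scores would still need tuning against real mail.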
Re: regex needed for http link
Posted by John Hardin <jh...@impsec.org>.
On Wed, 16 May 2012, Joseph Acquisto wrote:
>>>> On 5/16/2012 at 8:53 PM, "Joseph Acquisto" <jo...@j4computers.com> wrote:
>>>>> On 5/16/2012 at 5:18 PM, Brent Gardner <bg...@gmail.com> wrote:
>>>
>>> How about:
>>>
>>> /\.ru\b/i
>>
>> I will give that a try.
>
> That worked. But I imagine it may trigger on innocuous instances of .ru as well, so it should also include check for http:// and wildcard for domain.
What were you doing that _didn't_ detect that? The "proper" way is this:
uri URI_DOT_RU /\.ru\b/i
...and let the body parser figure out the "link" context.
Is there some reason that won't work?
Could you post the rule you were originally using?
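To illustrate why matching against extracted links differs from matching against the whole body, here is a minimal sketch in Python. It is only a rough stand-in: URL_FINDER is a crude, made-up extractor that recognises http:// and https:// links only, whereas SpamAssassin's own URI extraction is far more thorough (and, as far as I know, can also pick up hostnames written without a scheme). The function names and sample messages are hypothetical.

```python
import re

# The pattern suggested in this thread.
DOT_RU = re.compile(r"\.ru\b", re.IGNORECASE)

# Crude link finder, an assumption standing in for real URI extraction.
URL_FINDER = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def body_hit(text: str) -> bool:
    """Apply the pattern to the whole text, as a body-style rule would."""
    return DOT_RU.search(text) is not None


def uri_hit(text: str) -> bool:
    """Apply the pattern only to the links found in the text."""
    return any(DOT_RU.search(url) for url in URL_FINDER.findall(text))


# Hypothetical messages, for illustration only.
spam = "Cheap pills at http://boguslink.ru today"
ham = "Attached is notes.ru-draft.txt for review"

for label, text in (("spam", spam), ("ham", ham)):
    print(label, "body:", body_hit(text), "uri:", uri_hit(text))
```

In this sketch the body-style check fires on both messages, while the link-only check fires on the first alone.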
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/16/2012 at 8:53 PM, "Joseph Acquisto" <jo...@j4computers.com> wrote:
>>>> On 5/16/2012 at 5:18 PM, Brent Gardner <bg...@gmail.com> wrote:
>> On 05/16/2012 02:15 PM, Joseph Acquisto wrote:
>>>>>> On 5/16/2012 at 5:05 PM, "Joseph Acquisto"<jo...@j4computers.com> wrote:
>>>> I have been unsuccessful creating a rule to detect and weight http links in
>>>> message body, such as this one below:
>>>>
>>>> http://boguslink.xx
>>>>
>>>> The ones I have created get "hits" when tested on the command line, but
>>>> don't seem to work in local.cf. Maybe that's the wrong place?
>>> I should have said, to detect the two character country code.
>>>
>> What are you using now?
>>
>> How about:
>>
>> /\.ru\b/i
>>
>>
>>
>> Brent Gardner
>
> I will give that a try.
That worked. But I imagine it may also trigger on innocuous instances of .ru, so it should include a check for http:// and a wildcard for the domain.
joe a.
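One way to express "http:// plus a wildcard for the domain" is to anchor the pattern so that .ru has to be the end of the hostname. The sketch below tries that idea in Python; the pattern HOST_DOT_RU and the helper host_is_ru are assumptions for illustration, not something taken from SpamAssassin's rules, and whether the same expression behaves identically inside a SpamAssassin rule is untested here.

```python
import re

# Hypothetical host-anchored pattern: scheme, a host ending in ".ru",
# an optional port, then a path/query/fragment delimiter or end of string.
HOST_DOT_RU = re.compile(
    r"^https?://[^/?#]+\.ru(?::\d+)?(?:[/?#]|$)",
    re.IGNORECASE,
)


def host_is_ru(uri: str) -> bool:
    """Return True if the URI's host part ends in .ru."""
    return HOST_DOT_RU.search(uri) is not None


# Hypothetical URIs, for illustration only.
samples = [
    "http://boguslink.ru",
    "https://boguslink.ru:8080/path",
    "http://example.com/page.ru",
    "http://boguslink.ru.example.com/",
]

for uri in samples:
    print(f"{uri}: {host_is_ru(uri)}")
```

The first two samples match; the last two do not, because .ru appears in the path or in the middle of the hostname rather than at its end.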
Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/16/2012 at 5:18 PM, Brent Gardner <bg...@gmail.com> wrote:
> On 05/16/2012 02:15 PM, Joseph Acquisto wrote:
>>>>> On 5/16/2012 at 5:05 PM, "Joseph Acquisto"<jo...@j4computers.com> wrote:
>>> I have been unsuccessful creating a rule to detect and weight http links in
>>> message body, such as this one below:
>>>
>>> http://boguslink.ru
>>>
>>> The ones I have created get "hits" when tested on the command line, but
>>> don't seem to work in local.cf. Maybe that's the wrong place?
>> I should have said, to detect the two character country code.
>>
> What are you using now?
>
> How about:
>
> /\.ru\b/i
>
>
>
> Brent Gardner
I will give that a try.
Re: regex needed for http link
Posted by Brent Gardner <bg...@gmail.com>.
On 05/16/2012 02:15 PM, Joseph Acquisto wrote:
>>>> On 5/16/2012 at 5:05 PM, "Joseph Acquisto"<jo...@j4computers.com> wrote:
>> I have been unsuccessful creating a rule to detect and weight http links in
>> message body, such as this one below:
>>
>> http://boguslink.ru
>>
>> The ones I have created get "hits" when tested on the command line, but
>> don't seem to work in local.cf. Maybe that's the wrong place?
> I should have said, to detect the two character country code.
>
What are you using now?
How about:
/\.ru\b/i
Brent Gardner
Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/16/2012 at 5:05 PM, "Joseph Acquisto" <jo...@j4computers.com> wrote:
> I have been unsuccessful creating a rule to detect and weight http links in
> message body, such as this one below:
>
> http://boguslink.ru
>
> The ones I have created get "hits" when tested on the command line, but
> don't seem to work in local.cf. Maybe that's the wrong place?
I should have said, to detect the two character country code.
Re: regex needed for http link
Posted by Joseph Acquisto <jo...@j4computers.com>.
>>> On 5/16/2012 at 8:28 PM, John Hardin <jh...@impsec.org> wrote:
> On Wed, 16 May 2012, Joseph Acquisto wrote:
>
>> I have been unsuccessful creating a rule to detect and weight http links in
>> message body, such as this one below:
>>
>> http://boguslink.ru
>>
>> The ones I have created get "hits" when tested on the command line, but
>> don't seem to work in local.cf. Maybe that's the wrong place?
>
> Are you restarting the spamd or amavisd daemon after you make your change?
>
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
Yes, I'm restarting. "rcspamd restart"
Re: regex needed for http link
Posted by John Hardin <jh...@impsec.org>.
On Wed, 16 May 2012, Joseph Acquisto wrote:
> I have been unsuccessful creating a rule to detect and weight http links in message body, such as this one below:
>
> http://boguslink.ru
>
> The ones I have created get "hits" when tested on the command line, but
> don't seem to work in local.cf. Maybe that's the wrong place?
Are you restarting the spamd or amavisd daemon after you make your change?
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/