Posted to users@spamassassin.apache.org by Sebastian Arcus <s....@open-t.co.uk> on 2018/03/31 21:24:37 UTC
BODY custom rule not working if text and html parts are different?
I have a really simple rule looking for a custom text string contained in
spam URLs in the body of the email, like so:
body SHORT_BITCOIN_DATING /specific_string_here/i
score SHORT_BITCOIN_DATING 3.0
describe SHORT_BITCOIN_DATING Body URL signature of spam
I just realised that it only works if the URL exists in both the
text and HTML versions. If the text version doesn't have the URL, it
doesn't work. Do "body" rules only work on the HTML part of the
message? I've searched through the documentation, but I can't see
anything saying that is the case. Maybe something else is having an effect here?
Many thanks for any hints.
Re: BODY custom rule not working if text and html parts are
different?
Posted by Sebastian Arcus <s....@open-t.co.uk>.
On 31/03/18 22:39, John Hardin wrote:
> On Sat, 31 Mar 2018, Sebastian Arcus wrote:
>
>> I have a really simple rule looking for custom text string contained
>> in spam urls in the body of the email, like so:
>>
>> body SHORT_BITCOIN_DATING /specific_string_here/i
>> score SHORT_BITCOIN_DATING 3.0
>> describe SHORT_BITCOIN_DATING Body URL signature of spam
>>
>> I just realised that it is only working if the URL exists in both the
>> text and html versions. If the text version doesn't have the url, it
>> isn't working. Do "body" rules only work on the html part of the
>> message? I've tried searching through the documentation, but I can't
>> see that being the case. Maybe there is something else having an
>> effect here?
>
> "body" includes the *rendered* part of HTML. If the URL only appears
> within <a href="..."> in the HTML part then "body" will not see it.
>
> If you are looking for URLs, you should probably be using a "uri" rule.
> There are heuristics to pull those out of the body text, as well as out of
> HTML tags.
Thank you for the suggestions - much appreciated. As my original rule
worked initially, I didn't realise the subtle difference between using
BODY and URI rules. It is working fine now. Thank you again!
Re: BODY custom rule not working if text and html parts are
different?
Posted by Sebastian Arcus <s....@open-t.co.uk>.
On 01/04/18 19:18, John Hardin wrote:
> On Sun, 1 Apr 2018, John Hardin wrote:
>
>> On Sun, 1 Apr 2018, Matus UHLAR - fantomas wrote:
>>
>>> On 01.04.18 05:47, Pedro David Marco wrote:
>>>> This is a problem I see often...
>>>> what if the URL is only in the TEXT part and not in the HTML? many
>>>> email applications show those URLs as clickable as if they were valid
>>>> HTML HREFs when they are not...
>>>
>>> in this case, body rule matches, but uri does not.
>>
>> I think there are heuristics to pull (non-obfuscated) URIs out of body
>> text.
>
> Yeah, just confirmed. A non-obfuscated URI in plain-text body part is
> recognized and extracted for uri rules.
That's great - thank you for testing this out and letting us know.
Re: BODY custom rule not working if text and html parts are
different?
Posted by John Hardin <jh...@impsec.org>.
On Mon, 2 Apr 2018, Pedro David Marco wrote:
>
>
>> Yeah, just confirmed. A non-obfuscated URI in plain-text body part is
>> recognized and extracted for uri rules.
>
> Thanks John... can you provide any pastebin sample please??...
It's trivially easy to add a URI to the text body part of any test message
you may have lying around. If you run SpamAssassin in rule debug mode and
add a rule like this to your test environment it will be really easy to
see the extracted URIs:
uri __ALL_URI /.+/
tflags __ALL_URI multiple
(running SA in rule debug mode:
./spamassassin -L -t --debug area=all,rules,rules-all < $MSG
)
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Gun Control: The theory that a woman found dead in an alley,
raped and strangled with her panty hose, is somehow
morally superior to a woman explaining to police
how her attacker got that fatal bullet wound. -- L. Neil Smith
-----------------------------------------------------------------------
368 days since the first commercial re-flight of an orbital booster (SpaceX)
Re: BODY custom rule not working if text and html parts are
different?
Posted by Pedro David Marco <pe...@yahoo.com>.
>Yeah, just confirmed. A non-obfuscated URI in plain-text body part is
>recognized and extracted for uri rules.
Thanks John... can you provide a pastebin sample please?
----PedroD
Re: BODY custom rule not working if text and html parts are
different?
Posted by John Hardin <jh...@impsec.org>.
On Sun, 1 Apr 2018, John Hardin wrote:
> On Sun, 1 Apr 2018, Matus UHLAR - fantomas wrote:
>
>> On 01.04.18 05:47, Pedro David Marco wrote:
>>> This is a problem I see often...
>>> what if the URL is only in the TEXT part and not in the HTML? many email
>>> applications show those URLs as clickable as if they were valid HTML HREFs
>>> when they are not...
>>
>> in this case, body rule matches, but uri does not.
>
> I think there are heuristics to pull (non-obfuscated) URIs out of body text.
Yeah, just confirmed. A non-obfuscated URI in a plain-text body part is
recognized and extracted for uri rules.
Re: BODY custom rule not working if text and html parts are
different?
Posted by John Hardin <jh...@impsec.org>.
On Sun, 1 Apr 2018, Matus UHLAR - fantomas wrote:
> On 01.04.18 05:47, Pedro David Marco wrote:
>> This is a problem I see often...
>> what if the URL is only in the TEXT part and not in the HTML? many email
>> applications show those URLs as clickable as if they were valid HTML HREFs
>> when they are not...
>
> in this case, body rule matches, but uri does not.
I think there are heuristics to pull (non-obfuscated) URIs out of body
text.
Re: BODY custom rule not working if text and html parts are
different?
Posted by Sebastian Arcus <s....@open-t.co.uk>.
On 01/04/18 07:10, Matus UHLAR - fantomas wrote:
> On 01.04.18 05:47, Pedro David Marco wrote:
>> This is a problem I see often...
>> what if the URL is only in the TEXT part and not in the HTML? many
>> email applications show those URLs as clickable as if they were valid
>> HTML HREFs when they are not...
>
> in this case, body rule matches, but uri does not.
I wonder if a "rawbody" rule would match the URL both in the text part and
in the HTML part? Does anybody know?
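Something like this is what I have in mind. It is an untested sketch: the rule name is made up, and the pattern is the same placeholder as in my original rule. My understanding from the Mail::SpamAssassin::Conf documentation is that rawbody rules are matched against the decoded body parts with the HTML markup left in place, so the href value should be visible to it:

```
# Hypothetical rule name; placeholder pattern as in the original body rule
rawbody  SHORT_BITCOIN_DATING_RAW /specific_string_here/i
score    SHORT_BITCOIN_DATING_RAW 3.0
describe SHORT_BITCOIN_DATING_RAW Raw body URL signature of spam
```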
Re: BODY custom rule not working if text and html parts are
different?
Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 01.04.18 05:47, Pedro David Marco wrote:
> This is a problem I see often...
> what if the URL is only in the TEXT part and not in the HTML? many email applications show those URLs as clickable as if they were valid HTML HREFs when they are not...
in this case, body rule matches, but uri does not.
--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Microsoft dick is soft to do no harm
Re: BODY custom rule not working if text and html parts are different?
Posted by Leandro <le...@spfbl.net>.
2018-04-01 2:47 GMT-03:00 Pedro David Marco <pe...@yahoo.com>:
> This is a problem I see often...
>
> what if the URL is only in the TEXT part and not in the HTML? many email
> applications show those URLs as clickable as if they were valid HTML HREFs
> when they are not...
>
We have a script that can extract URLs from the text part; see lines 998-1016:
https://www.dropbox.com/s/5aorrijafw5ygk0/uribl.pl?dl=0
You can use it as a model for your own script or use it as is.
>
> -----
> PedroD
>
Re: BODY custom rule not working if text and html parts are
different?
Posted by Pedro David Marco <pe...@yahoo.com>.
This is a problem I see often...
What if the URL is only in the TEXT part and not in the HTML? Many email applications show those URLs as clickable, as if they were valid HTML HREFs, when they are not...
-----PedroD
Re: BODY custom rule not working if text and html parts are
different?
Posted by John Hardin <jh...@impsec.org>.
On Sat, 31 Mar 2018, Sebastian Arcus wrote:
> I have a really simple rule looking for custom text string contained in spam
> urls in the body of the email, like so:
>
> body SHORT_BITCOIN_DATING /specific_string_here/i
> score SHORT_BITCOIN_DATING 3.0
> describe SHORT_BITCOIN_DATING Body URL signature of spam
>
> I just realised that it is only working if the URL exists in both the text
> and html versions. If the text version doesn't have the url, it isn't
> working. Do "body" rules only work on the html part of the message? I've
> tried searching through the documentation, but I can't see that being the
> case. Maybe there is something else having an effect here?
"body" includes the *rendered* part of HTML. If the URL only appears
within <a href="..."> in the HTML part then "body" will not see it.
If you are looking for URLs, you should probably be using a "uri" rule.
There are heuristics to pull those out of the body text, as well as out of
HTML tags.
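As a minimal sketch (untested here), the uri version of your rule could look like the following. It reuses your rule name and placeholder pattern; the shortened description text is only a suggestion:

```
# Sketch only: same placeholder pattern as the original body rule,
# but matched against extracted URIs instead of the rendered body text
uri      SHORT_BITCOIN_DATING /specific_string_here/i
score    SHORT_BITCOIN_DATING 3.0
describe SHORT_BITCOIN_DATING URL signature of spam
```

A uri rule is matched against each URI that SpamAssassin extracts from the message, so it should not matter whether the link shows up in the rendered text or only inside an href attribute.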