You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by gi...@paclan.it on 2022/05/02 07:16:59 UTC

Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Recently committed DecodeShortURLs.pm plugin is not compatible with previous third party plugin nor with what was in-tree until few days ago.
Users running trunk should carefully read docs and upgrade their database schema.
Should we warn users@ as well when we commit those incompatible changes ?
 Giovanni

On 5/1/22 20:13, hege@apache.org wrote:
> Author: hege
> Date: Sun May  1 18:13:16 2022
> New Revision: 1900464
> 
> URL: http://svn.apache.org/viewvc?rev=1900464&view=rev
> Log:
> Improve docs
> 
> Modified:
>     spamassassin/trunk/UPGRADE
>     spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm
> 
> Modified: spamassassin/trunk/UPGRADE
> URL: http://svn.apache.org/viewvc/spamassassin/trunk/UPGRADE?rev=1900464&r1=1900463&r2=1900464&view=diff
> ==============================================================================
> --- spamassassin/trunk/UPGRADE (original)
> +++ spamassassin/trunk/UPGRADE Sun May  1 18:13:16 2022
> @@ -196,7 +196,9 @@ Note for Users Upgrading to SpamAssassin
>  
>  - New DMARC policy check plugin
>  
> -- New DecodeShortURLs plugin
> +- New project maintained DecodeShortURLs plugin.  Note that it might not be
> +  directly compatible with rules from the old third party plugin.  See
> +  documentation for configuration and rule format.
>  
>  Note for Users Upgrading to SpamAssassin 3.4.5
>  ----------------------------------------------
> 
> Modified: spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm
> URL: http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm?rev=1900464&r1=1900463&r2=1900464&view=diff
> ==============================================================================
> --- spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm (original)
> +++ spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm Sun May  1 18:13:16 2022
> @@ -27,13 +27,25 @@ DecodeShortURLs - Expand shortened URLs
>  
>    url_shortener bit.ly
>    url_shortener go.to
> -  ...
> +
>    body HAS_SHORT_URL          eval:short_url()
> -  describe HAS_SHORT_URL      Message contains one or more shortened URLs
> +  describe HAS_SHORT_URL      Message has one or more shortened URLs
>  
>    body SHORT_URL_CHAINED      eval:short_url_chained()
>    describe SHORT_URL_CHAINED  Message has shortened URL chained to other shorteners
>  
> +  body SHORT_URL_MAXCHAIN     eval:short_url_maxchain()
> +  describe SHORT_URL_MAXCHAIN Message has shortened URL that causes more than 10 redirections
> +
> +  body SHORT_URL_LOOP         eval:short_url_loop()
> +  describe SHORT_URL_LOOP     Message has short URL that loops back to itself
> +
> +  body SHORT_URL_200          eval:short_url_200()
> +  describe SHORT_URL_200      Message has shortened URL returning HTTP 200
> +
> +  body SHORT_URL_404          eval:short_url_404()
> +  describe SHORT_URL_404      Message has shortened URL returning HTTP 404
> +
>  =head1 DESCRIPTION
>  
>  This plugin looks for URLs shortened by a list of URL shortening services and
> @@ -45,31 +57,28 @@ SpamAssassin which can then be accessed
>  This plugin also sets the rule HAS_SHORT_URL if any matching short URLs are
>  found.
>  
> -This plug-in will follow 'chained' shorteners e.g.
> -from short URL to short URL to short URL and finally to the real URL
> -
> +This plug-in will follow 'chained' shorteners e.g.  from short URL to short
> +URL to short URL and finally to the real URL.
>  
>  If this form of chaining is found, then the rule 'SHORT_URL_CHAINED' will be
> -fired.  If a loop is detected then 'SHORT_URL_LOOP' will be fired.
> -This plug-in limits the number of chained shorteners to a maximim of 10 at
> -which point it will fire the rule 'SHORT_URL_MAXCHAIN' and go no further.
> +fired.  If a loop is detected then 'SHORT_URL_LOOP' will be fired.  This
> +plug-in limits the number of chained shorteners to a maximum of 10 at which
> +point it will fire the rule 'SHORT_URL_MAXCHAIN' and go no further.
>  
>  If a shortener returns a '404 Not Found' result for the short URL then the
>  rule 'SHORT_URL_404' will be fired.
>  
> -If a shortener returns a '200 OK' result for the short URL then the
> -rule 'SHORT_URL_200' will be fired.
> -
> -This can cover the case when an abuse page is displayed.
> +If a shortener returns a '200 OK' result for the short URL then the rule
> +'SHORT_URL_200' will be fired.  This can cover the case when an abuse page
> +is displayed.
>  
>  =head1 NOTES
>  
> -This plugin runs the check_dnsbl hook with a priority of -10 so that
> -it may modify the parsed URI list prior to the URIDNSBL plugin which
> -runs as priority 0.
> +This plugin runs at the check_dnsbl hook with a priority of -10 so that it
> +may modify the parsed URI list prior to the URIDNSBL plugin.
>  
> -Currently the plugin queries a maximum of 10 distinct shortened URLs with
> -a maximum timeout of 5 seconds per lookup.
> +Currently the plugin queries a maximum of 10 distinct shortened URLs with a
> +maximum timeout of 5 seconds per lookup.
>  
>  =head1 ACKNOWLEDGEMENTS
>  
> 
> 


Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by Henrik K <he...@hege.li>.
On Mon, May 02, 2022 at 03:45:30PM +0200, giovanni@paclan.it wrote:
> On 5/2/22 15:28, Henrik K wrote:
> > On Mon, May 02, 2022 at 03:15:37PM +0200, giovanni@paclan.it wrote:
> >> actually it could (and can) work with any DBI provider, documentation should have been definitely improved. 
> > 
> > That's of course assuming that the SQL is so basic that it will work with
> > anything.
> > 
> > Then again the module works just fine even without caching, I've seen like
> > two short urls today on my mail feed.
> I saw few thousands today, this is why I need a working cache layer.

Let me know if you spot anything from my last updates.

I don't understand what the point of "modified" col even is, proper name for
it would have been "last_seen".  Also "hits" has really no function at all,
you could always parse stuff from logs, which don't get cleaned from TTL..


Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by gi...@paclan.it.
On 5/2/22 15:28, Henrik K wrote:
> On Mon, May 02, 2022 at 03:15:37PM +0200, giovanni@paclan.it wrote:
>> actually it could (and can) work with any DBI provider, documentation should have been definitely improved. 
> 
> That's of course assuming that the SQL is so basic that it will work with
> anything.
> 
> Then again the module works just fine even without caching, I've seen like
> two short urls today on my mail feed.
I saw few thousands today, this is why I need a working cache layer.

> Maybe went for a bit overkill on the
> database support side..  I'll see if I can back it down a bit today.
> 
> For future it would probably be better to have some common caching layer
> that modules could utilize and communicate cross-process.
> 
a general caching layer would help a lot and simplify some code but let's concentrate on release now,
this should be meat for the future.
 Giovanni

Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by Henrik K <he...@hege.li>.
On Mon, May 02, 2022 at 03:15:37PM +0200, giovanni@paclan.it wrote:
> actually it could (and can) work with any DBI provider, documentation should have been definitely improved. 

That's of course assuming that the SQL is so basic that it will work with
anything.

Then again the module works just fine even without caching, I've seen like
two short urls today on my mail feed.  Maybe went for a bit overkill on the
database support side..  I'll see if I can back it down a bit today.

For future it would probably be better to have some common caching layer
that modules could utilize and communicate cross-process.


Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by "Kevin A. McGrail" <km...@apache.org>.
I have spoke about using the plugin with mysql instructions at at least 
one conference with implementation notes.  We need to definitely 
consider people are using this in production.

On 5/2/2022 9:15 AM, giovanni@paclan.it wrote:
> No idea who is using that plugin with which database, I do not think that changing database schema only for MySQL
> is a good idea, better to have a similar database schema for all DBI providers to prevent possible future issues.
> I would send an email to users@ just to be sure.

-- 
Kevin A. McGrail
KMcGrail@Apache.org

Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171


Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by gi...@paclan.it.
On 5/2/22 13:10, Henrik K wrote:
> On Mon, May 02, 2022 at 12:25:04PM +0200, giovanni@paclan.it wrote:
>> On 5/2/22 09:40, Henrik K wrote:
>>> On Mon, May 02, 2022 at 09:16:59AM +0200, giovanni@paclan.it wrote:
>>>> Recently committed DecodeShortURLs.pm plugin is not compatible with previous third party plugin nor with what was in-tree until few days ago.
>>>> Users running trunk should carefully read docs and upgrade their database schema.
>>>> Should we warn users@ as well when we commit those incompatible changes ?
>>>
>>> I don't think I made any changes that would make trunk users incompatible,
>>> if that's what you are suggesting?
>>>
>>> SQLite version requirement I did bump up for upsert simplicity.
>>>
>> I tried to convert my database to the new schema and I had this error when running alter table queries:
>>
>> ALTER TABLE `short_url_cache` CHANGE `created` `created` TIMESTAMP on update CURRENT_TIMESTAMP NOT NULL; 
>> #1292 - Incorrect datetime value: '1641801897' for column `test`.`short_url_cache`.`created` at row 1
>>
>> MySQL TIMESTAMP is stored as "YYYY-MM-DD HH:MM:SS" and not as a Unix epoch timestamp.
>> For this to work cached data should be converted or emptied.
> 
> Ah ok I see...  so you are actually using MySQL?  I kind of figured it was
> not supported, since the plugin clearly documents only "sqlite" is supported
> 
> "Currently only sqlite is configured"
> "url_shortener_cache_type sqlite"
> 
> I did notice there is sql/decodeshorturl_mysql.sql but I kind of shrugged it
> off as something in development..
> 
> I actually just noticed that year ago there was actually some talk about
> improving the plugin..  well atleast now it should be generally usable.
>
actually it could (and can) work with any DBI provider, documentation should have been definitely improved. 
 
> If I hazard a guess, then hopefully you are actually the only one using
> MySQL, not sure it's worth bothering users@ about.  As a cache table there
> no need to convert anything, just drop it and create again.  If you prefer,
> the plugin can go back to using unix timestamps for MySQL, doesn't matter to
> me..
> 
No idea who is using that plugin with which database, I do not think that changing database schema only for MySQL
is a good idea, better to have a similar database schema for all DBI providers to prevent possible future issues.
I would send an email to users@ just to be sure.

 Cheers
  Giovanni

Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by Henrik K <he...@hege.li>.
On Mon, May 02, 2022 at 12:25:04PM +0200, giovanni@paclan.it wrote:
> On 5/2/22 09:40, Henrik K wrote:
> > On Mon, May 02, 2022 at 09:16:59AM +0200, giovanni@paclan.it wrote:
> >> Recently committed DecodeShortURLs.pm plugin is not compatible with previous third party plugin nor with what was in-tree until few days ago.
> >> Users running trunk should carefully read docs and upgrade their database schema.
> >> Should we warn users@ as well when we commit those incompatible changes ?
> > 
> > I don't think I made any changes that would make trunk users incompatible,
> > if that's what you are suggesting?
> > 
> > SQLite version requirement I did bump up for upsert simplicity.
> > 
> I tried to convert my database to the new schema and I had this error when running alter table queries:
> 
> ALTER TABLE `short_url_cache` CHANGE `created` `created` TIMESTAMP on update CURRENT_TIMESTAMP NOT NULL; 
> #1292 - Incorrect datetime value: '1641801897' for column `test`.`short_url_cache`.`created` at row 1
> 
> MySQL TIMESTAMP is stored as "YYYY-MM-DD HH:MM:SS" and not as a Unix epoch timestamp.
> For this to work cached data should be converted or emptied.

Ah ok I see...  so you are actually using MySQL?  I kind of figured it was
not supported, since the plugin clearly documents only "sqlite" is supported

"Currently only sqlite is configured"
"url_shortener_cache_type sqlite"

I did notice there is sql/decodeshorturl_mysql.sql but I kind of shrugged it
off as something in development..

I actually just noticed that year ago there was actually some talk about
improving the plugin..  well atleast now it should be generally usable.

If I hazard a guess, then hopefully you are actually the only one using
MySQL, not sure it's worth bothering users@ about.  As a cache table there
no need to convert anything, just drop it and create again.  If you prefer,
the plugin can go back to using unix timestamps for MySQL, doesn't matter to
me..


Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by gi...@paclan.it.
On 5/2/22 09:40, Henrik K wrote:
> On Mon, May 02, 2022 at 09:16:59AM +0200, giovanni@paclan.it wrote:
>> Recently committed DecodeShortURLs.pm plugin is not compatible with previous third party plugin nor with what was in-tree until few days ago.
>> Users running trunk should carefully read docs and upgrade their database schema.
>> Should we warn users@ as well when we commit those incompatible changes ?
> 
> I don't think I made any changes that would make trunk users incompatible,
> if that's what you are suggesting?
> 
> SQLite version requirement I did bump up for upsert simplicity.
> 
I tried to convert my database to the new schema and I had this error when running alter table queries:

ALTER TABLE `short_url_cache` CHANGE `created` `created` TIMESTAMP on update CURRENT_TIMESTAMP NOT NULL; 
#1292 - Incorrect datetime value: '1641801897' for column `test`.`short_url_cache`.`created` at row 1

MySQL TIMESTAMP is stored as "YYYY-MM-DD HH:MM:SS" and not as a Unix epoch timestamp.
For this to work cached data should be converted or emptied.

 Giovanni

Re: svn commit: r1900464 - in /spamassassin/trunk: UPGRADE lib/Mail/SpamAssassin/Plugin/DecodeShortURLs.pm

Posted by Henrik K <he...@hege.li>.
On Mon, May 02, 2022 at 09:16:59AM +0200, giovanni@paclan.it wrote:
> Recently committed DecodeShortURLs.pm plugin is not compatible with previous third party plugin nor with what was in-tree until few days ago.
> Users running trunk should carefully read docs and upgrade their database schema.
> Should we warn users@ as well when we commit those incompatible changes ?

I don't think I made any changes that would make trunk users incompatible,
if that's what you are suggesting?

SQLite version requirement I did bump up for upsert simplicity.