You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@bugzilla.spamassassin.org on 2009/09/14 21:30:05 UTC

[Bug 6200] New: Replace use of Digest::SHA1 by Digest::SHA

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

           Summary: Replace use of Digest::SHA1 by Digest::SHA
           Product: Spamassassin
           Version: 3.3.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: Libraries
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: Mark.Martinec@ijs.si


- perl module Digest::SHA covers and extends the functionality of Digest::SHA1,
  providing SHA hashing for some additional bit sizes (224, 256, 384, 512),
  besides the 160 bit SHA-1; its usage is compatible with Digest::SHA1;

- with perl 5.10.0 the module Digest::SHA became a perl base module, i.e.
  it comes installed with a perl itself (Digest::SHA1 is not a base module,
  and never was);

- the DKIM signature verification requires support of SHA-256 for a compliant
  implementation; module Mail::DKIM (as used by the DKIM plugin) needs
  to use Digest::SHA for this reason, so Digest::SHA is already 'in';

- the following SA modules currently use Digest::SHA1:
  BayesStore/BDB.pm, BayesStore/SQL.pm, BayesStore/DBM.pm,
  Plugin/Bayes.pm, Plugin/Hashcash.pm, (and Util/DependencyInfo.pm)

So, how about switching from Digest::SHA1 to Digest::SHA for a 3.3.0 ?
The required change is a trivial syntactical change.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #10 from Mark Martinec <Ma...@ijs.si> 2009-11-11 09:42:19 UTC ---
Created an attachment (id=4570)
 --> (https://issues.apache.org/SpamAssassin/attachment.cgi?id=4570)
A patch to Razor agents, mentioned in Comment 4

For anyone wanting to loose a dependency on Digest::SHA1 in Razor2,
attached is a patch to razor-agents, and below is my mail that I sent to
Richard Soderberg on 2009-10-26 (who seemed to be a current maintainer,
but apparently is not, as I got no response).



Richard,

I wonder, are you still the one maintaining razor-agents?

Considering the:

  https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

it seems Razor2 is about the only remaining module used by SpamAssassin 3.3
still depending on a module Digest::SHA1. The rest can live with Digest::SHA,
which is compatible with Digest::SHA1, but provides additional functionality
(e.g. needed by DKIM signature validation/generation).  With perl 5.10.0
the Digest::SHA became a perl base module (while Digest::SHA1 is not a base
module, it needs to be installed separately).

I'm attaching a patch against razor-agents-2.84, which lets it use either
the Digest::SHA or the Digest::SHA1, whichever happens to be available.
I wonder if this can be incorporated into the next release.


On a related note, examining the use of SHA1 one thing pops out and
appears to me like a bug. The code in Signature::Whiplash::whiplash :

        my $sha1 = Digest::SHA1->new();

        $sha1->add($host);
        $sig = substr $sha1->hexdigest, 0, 12;

        $sha1->add($corrected_length);
        $sig .= substr $sha1->hexdigest, 0, 4;

is equivalent to my:

         $sig = substr sha1_hex($host), 0, 12;
         $sig .= substr sha1_hex($corrected_length), 0, 4;


I wonder if this was really what was meant. Seems like the
intention was to do a:

        my $sha1 = Digest::SHA1->new();

        $sha1->add($host);
        $sig = substr $sha1->clone->hexdigest, 0, 12;

        $sha1->add($corrected_length);
        $sig .= substr $sha1->hexdigest, 0, 4;

(note the use of ->clone), which would be equivalent to:

         $sig = substr sha1_hex($host), 0, 12;
         $sig .= substr sha1_hex($host, $corrected_length), 0, 4;

It seems to me that whoever did the coding forgot that the hexdigest
method (and other related methods) destroy the digests state.

If this indeed a bug, fixing it will probably cause incompatibility
in queries between fixed and unfixed clients.

Regards
   Mark

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #4 from Mark Martinec <Ma...@ijs.si> 2009-10-30 17:45:58 UTC ---
I went through code and tests and seems to have fished out all cases.
The only external perl module that I'm aware of that is needed by SA
and still depends on Digest::SHA1 is Razor. I send a patch (similar to
my suggestion in #3) to its maintainer, although I'm not holding breath
to receive any response.

Running it now on our server (with patched razor) without Digest::SHA1.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

Justin Mason <jm...@jmason.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jm@jmason.org

--- Comment #7 from Justin Mason <jm...@jmason.org> 2009-11-09 10:02:50 UTC ---
(In reply to comment #6)
> Bug 6200: let SA use either Digest::SHA or Digest::SHA1,
> whichever is available (the Digest::SHA is now a base
> module since perl 5.10.0)

hmm, may still need some work, as Hudson is failing to build.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #6 from Mark Martinec <Ma...@ijs.si> 2009-11-09 09:38:48 UTC ---
Bug 6200: let SA use either Digest::SHA or Digest::SHA1,
whichever is available (the Digest::SHA is now a base
module since perl 5.10.0)
Sending        INSTALL
Sending        build/sha1sum.pl
Sending        lib/Mail/SpamAssassin/BayesStore/BDB.pm
Sending        lib/Mail/SpamAssassin/BayesStore/DBM.pm
Sending        lib/Mail/SpamAssassin/BayesStore/SQL.pm
Sending        lib/Mail/SpamAssassin/Plugin/Bayes.pm
Sending        lib/Mail/SpamAssassin/Plugin/Hashcash.pm
Sending        lib/Mail/SpamAssassin/Util/DependencyInfo.pm
Sending        masses/corpora/fuzzy-hash-maildir
Sending        masses/rule-dev/seek-phrases-in-log
Sending        sa-update.raw
Sending        t/mimeparse.t
Sending        t/rule_names.t
Sending        t/sha1.t
Transmitting file data ..............
Committed revision 834155.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Mark Martinec <Ma...@ijs.si> 2009-11-11 09:32:42 UTC ---
Closing. Please reopen if any issues or concerns pop up.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #5 from Mark Martinec <Ma...@ijs.si> 2009-11-06 11:55:37 UTC ---
Just in case people would worry about a change in performance, I did some
benchmarking. Generally, the sha1 in Digest::SHA has a higher call cost,
but then it runs faster on data, so this means that Digest::SHA is
better for digesting long strings (like generating message id from half
of the body text), while Digest::SHA1 is better at digesting many small
strings, like tokenization in bayes.

Probably the heaviest consumer of sha1 digesting is converting tokens to
hashes in Bayes.pm, so I instrumented it with some statistics gathering
and let it run on real mail, pouring into our mailer. Typically for one
mail message it needs to process 120..200 tokens, and occasionally more.
Typical average token length is between 9 and 14 characters.

The loop that goes through all the tokens and hashes them typically
takes about 1 ms per message, and a difference between Digest::SHA and
Digest::SHA1 is about 0.2 ms per message. So, nothing to worry about,
hardly measurable. If one would want to squeeze out the last drop,
a MD5 hashing should have been be used.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

Warren Togami <wt...@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wtogami@redhat.com

--- Comment #1 from Warren Togami <wt...@redhat.com> 2009-09-14 15:34:59 PDT ---
I wonder how this would effect spamassassin's perl version compatibility?

Older distributions of perl didn't ship with Digest::SHA.  I recently built
Digest::SHA for EPEL5 in order to support Mail::DKIM and it seems to be working
fine for DKIM at least.  I didn't test it with RHEL-4 yet, which has perl-5.8.5
and the oldest distribution that we possibly would care for spamassassin.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

Mark Martinec <Ma...@ijs.si> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Mark.Martinec@ijs.si
   Target Milestone|Undefined                   |3.3.0

--- Comment #2 from Mark Martinec <Ma...@ijs.si> 2009-09-14 18:18:32 PDT ---
> I wonder how this would effect spamassassin's perl version compatibility?
> 
> Older distributions of perl didn't ship with Digest::SHA.  I recently built
> Digest::SHA for EPEL5 in order to support Mail::DKIM and it seems to be
> working fine for DKIM at least.  I didn't test it with RHEL-4 yet, which
> has perl-5.8.5 and the oldest distribution that we possibly would care
> for spamassassin.

With versions of perl before the 5.10.0, one has to install one or
the other module anyway, as it does not come with the perl package.
I do not doubt that Digest::SHA installs and works just fine with 5.8.x,
as this is not a new module. What I don't know is whether it is more
difficult to install Digest::SHA compared to Digest::SHA1 on systems
that you mention. On FreeBSD it is very straightforward. The Solaris
CSW packages also have both modules available.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #8 from Mark Martinec <Ma...@ijs.si> 2009-11-09 10:04:19 UTC ---
Bug 6200 cont'd: fixed DependencyInfo.pm
Sending        lib/Mail/SpamAssassin/Util/DependencyInfo.pm
Committed revision 834162.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6200] Replace use of Digest::SHA1 by Digest::SHA

Posted by bu...@bugzilla.spamassassin.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6200

--- Comment #3 from Mark Martinec <Ma...@ijs.si> 2009-10-16 08:39:41 UTC ---
So, how about replacing the existing:

use Digest::SHA1 qw(sha1);

with:

BEGIN {
  eval { require Digest::SHA1; import Digest::SHA1 qw(sha1); 1 }
  or do { require Digest::SHA; import Digest::SHA qw(sha1) }
}


(or the other way around, trying SHA first and falling back to SHA1)


This way we could lose hard dependency on Digest::SHA1 (which is not a
base module), but retain compatibility with sites which will disable
the DKIM plugin and refuse to install Digest::SHA on pre-10.0 perls.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.