Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/09/28 08:27:55 UTC

Re: speedup for PerMsgStatus

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


Loren Wilton writes:
> > > I have no idea how painful linked lists are in Perl (or if they even
> > > exist).
> >
> > Why are you commenting then???
> 
> Because they are very useful, as I pointed out.
> 
> > They don't exist as a native data structure.  Arrays are fast, painless,
> > and dynamically sized.
> 
> They don't exist as a native data structure in C++ either.  But they get a
> lot of use. Even when template classes exist to do reasonably fast and
> reasonably painless dynamic arrays.  For certain things (like collections of
> objects that can get reordered frequently) they are generally more efficient
> than dynamic arrays.
> 
> If there is an SA coding requirement for only using native data structures,
> then forget lists.  If no such requirement exists and there is an interest
> in optimizing performance, then they should be a tool to be considered.

Unfortunately, perl speed optimisation doesn't work like that.
The reason is that perl native data structures (arrays, hashes, strings,
numeric SVs, etc.) can be looked up in one perl OP, but a user-defined
data structure cannot.

The OP is the lowest-level "command" in the perl VM, equivalent to an
assembly opcode, and as such is very, very fast -- since the innards of an
OP are pure C.  That's why regexp matching in perl is as fast as it
is in C -- because a regexp match is compiled to a single OP.

(Perl's not like Java in that respect.  Perl's VM has quite high-level
opcodes, whereas Java's is lower-level and more like "real" assembly.
That's why perl is faster than Java ;)

Unfortunately, when you read fields in a data structure written in perl
on top of hashes or arrays, and traverse chains of references, each
variable access and each ref dereference is an individual OP.

So the upshot is that using a native perl data type will always be
faster than defining a new "non-native" data structure in perl.
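
To make that concrete, here's a rough sketch (not SpamAssassin code; the
sub names sum_array and sum_list, the "value" and "next" field names, and
the 1..1000 test data are all made up for illustration).  It sums the same
numbers once from a native array and once from a linked list built out of
hash refs, then lets the core Benchmark module compare the two.  Timings
depend on perl version and hardware, so no numbers are claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: the integers 1..1000, stored two ways.
my @array = (1 .. 1000);

# A hand-rolled singly linked list; each node is a hash ref with
# "value" and "next" fields (names chosen for this sketch only).
my $head;
for my $v (reverse @array) {
    $head = { value => $v, next => $head };
}

# Sum the native array: the loop needs only a few ops per element.
sub sum_array {
    my $sum = 0;
    $sum += $_ for @array;
    return $sum;
}

# Sum the linked list: every step adds hash lookups and reference
# dereferences, and each of those costs perl ops.
sub sum_list {
    my $sum = 0;
    for (my $node = $head; $node; $node = $node->{next}) {
        $sum += $node->{value};
    }
    return $sum;
}

# Sanity check: both traversals must visit the same values.
die "mismatch" unless sum_array() == sum_list();

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, { array => \&sum_array, list => \&sum_list });
```

Run it with a plain "perl script.pl"; the table cmpthese() prints shows
iterations per second for each version and the relative difference.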

See http://www.ccl4.org/~nick/P/Fast_Enough/#ops_are_bad,_m%27kay
for more details.  In fact, I'm even considering looking into some
use of pack() here, for the very reasons noted there ;)
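
What a pack()-based approach would look like isn't spelled out here, so
the following is only a guess at the general idea: keep a list of small
integers in one packed string, so that reading it back is a single
unpack() call rather than a walk over many separate scalars.  The data
and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical example data: a handful of small unsigned integers.
my @values = (3, 14, 15, 92, 65);

# Pack them into one string, 4 bytes each
# ("N" is a 32-bit unsigned integer in big-endian order).
my $packed = pack('N*', @values);

# One unpack call recovers the whole list.
my @all = unpack('N*', $packed);

# Fetch a single element by byte offset: substr plus unpack.
# Element index 2 starts at byte 2 * 4.
my $third = unpack('N', substr($packed, 2 * 4, 4));

print "length: ", length($packed), " bytes\n";
print "all:    @all\n";
print "third:  $third\n";
```

Whether this actually wins depends on the access pattern, so it would
need benchmarking before being used anywhere.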

(ps.  I'm sure if I got any of that wrong Matt will correct me ;)

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFBWQRrQTcbUG5Y7woRAvGHAJwOAxmPKpX09LoiZBCsYypL5UzA2ACgvbTm
6uB3igI7ObXF+vn+jeOmN98=
=cQEI
-----END PGP SIGNATURE-----