Posted to dev@spamassassin.apache.org by Justin Mason <jm...@jmason.org> on 2004/11/21 00:27:07 UTC
TIP: very useful '%seen' trick
this just came up on perl5-porters...
http://www.nntp.perl.org/group/perl.perl5.porters/96100 :
Subject: Re: sharing hash-values
From: btilly[at]gmail.com (Ben Tilly)
...
I forgot who I first saw mention this, possibly gbarr, but the following
variation on %seen seems to be the fastest in native Perl:
    my %seen;
    undef @seen{@special};
    for (@things) {
        if (exists $seen{$_}) {
            ...
        }
    }
This avoids creating the hash values entirely. (Or at least it did a few
revs of Perl ago.)
Cheers,
Ben
sure enough, using the shared "undef" SV as the magic value is 7% faster and,
since it doesn't allocate the scalars, reduces RAM usage too ;) definitely the
better idiom. Benchmark:
: jm 1122...; perl psc
                Rate traditional  undef_keys
traditional 100014/s          --         -6%
undef_keys  106684/s          7%          --
script:
#!/usr/bin/perl -w
use Benchmark qw(:all);
use strict;

my @things = qw(
    foo bar baz foo foo foo bar bar baz baz blarg
);

cmpthese (-2, {
    'traditional' => sub {
        my $res = '';
        my %seen;
        for (@things) {
            next if $seen{$_};
            $seen{$_} = 1;
            $res .= "$_\n";
        }
    },
    'undef_keys' => sub {
        my $res = '';
        my %seen;
        # undef @seen{@special};
        for (@things) {
            next if exists $seen{$_};
            undef $seen{$_};
            $res .= "$_\n";
        }
    }
});
(ps: note the 'undef @seen{@special};' -- it can be used to mark a whole list
of "special" values as already seen before the loop starts, since the hash
slice creates a key for each of them.)
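
A minimal sketch of that pre-seeding step, for anyone who wants to try it. The helper name unseen_things and the sample data are made up for illustration and are not from the original thread. The "if @$special" guard is an extra precaution for the empty-list case, not something the quoted post mentions.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the elements of @$things that are not in
# @$special, keeping only the first occurrence of each.
sub unseen_things {
    my ($special, $things) = @_;
    my %seen;

    # Hash slice: creates a key for every "special" value without
    # assigning a value to any of them. The guard for an empty list
    # is an assumption added here as a precaution.
    undef @seen{@$special} if @$special;

    my @fresh;
    for my $thing (@$things) {
        # Must be exists(), not a truth test: every stored value is undef.
        next if exists $seen{$thing};
        undef $seen{$thing};
        push @fresh, $thing;
    }
    return @fresh;
}

my @fresh = unseen_things([qw(foo baz)], [qw(foo bar baz quux bar blarg)]);
print "$_\n" for @fresh;   # prints bar, quux, blarg, one per line
```

The one thing to watch with this idiom is that a plain 'if $seen{$_}' test no longer works, because all the values are undef. The more commonly seen spelling '@seen{@special} = ();' should produce the same set of keys.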
--j.