You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spamassassin.apache.org by bu...@spamassassin.apache.org on 2022/05/10 15:21:03 UTC

[Bug 7989] New: Plugin Esp.pm: Tags do not work as described, rewrite of plugin needed

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7989

            Bug ID: 7989
           Summary: Plugin Esp.pm: Tags do not work as described, rewrite
                    of plugin needed
           Product: Spamassassin
           Version: 4.0.0
          Hardware: All
                OS: All
            Status: NEW
          Severity: blocker
          Priority: P2
         Component: Plugins
          Assignee: dev@spamassassin.apache.org
          Reporter: sa-mst@lrz.de
  Target Milestone: Undefined

This plugin has not yet reached production quality. It needs to be rewritten.

From the description:

"The plugin sets some tags when a rule match, those tags can be used to use
direct queries against rbl.

If direct queries are used the main rule will be used only to set the tag and
the score should be added to the askdns rule."

This is NOT true. A tag is only set when a customer ID matches an entry in a
feed file. However, if there is a match, there is no point in doing another DNS
query for the same result. Tags must be set regardless of whether feed files
are used.

Other issues:

- NEVER EVER use a value directly from a header field without checking the
allowed syntax. This includes the header fields X-Roving-Id and X-MC-User.
Syntax for X-Roving-Id is '/^(\d+)\.\d+$/' and for X-MC-User
'/^([0-9a-z]{25})$/'.

- tag MAILGUN should be named MAILGUNID in set_tag command

- When checking for the EnvelopFrom to determine if the mail belongs to a
particular ESP, forwards that replace the EnvelopeFrom, such as GMAIL, will no
longer be recognized.

- ESP sendinblue uses two types of emails with different header fields. From
the description of my configuration:

# Header fields: there are two types of emails with different header fields
#
# type A:
# -------
# Feedback-ID: IP-Addr:CID_CAMPAIGNID:CID:Sendinblue
# List-Id: BASE64 <BASE64.list_id.DOMAIN>
#   BASE64 of CID-\d+-\d+
# Message-Id: <20\d{2}(?:0\d|1[012])[0123]\d{5}\.[0-9a-z]{9,17}\@\S+>
# X-Mailer: Sendinblue
# X-Mailin-Campaign: \d+
# X-Mailin-Client: CID
# X-sib-id: long string
#
# type B:
# -------
# X-Mailin-EID: includes info about EnvFrom and original Message-ID in BASE64
# Message-Id:
<[0...@smtp-relay.sendinblue.com>/
# Origin-messageId: fromat of message-id of different clients
# X-sib-id: long string
# Feedback-ID: IP-Addr:CID_-1:CID:Sendinblue

I'm using therefore

Feedback-ID =~
/^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:(\d+)_(?:-1|\d+):\1:Sendinblue$/

instead of X-Mailin-Client.

- I am not sure if the logic for Mailup is correct. The logic says: There MUST
be a header field X-Abuse. However, it can have multiple syntaxes. If it does
not have the expected syntax, the cid can be extracted from the list-id field.
Since I don't have an example, I can't verify if this is true.

- sub esp_mdrctr_check uses variable names from esp_sendgrid_check_id and has
an unnecessary line 'my $envfrom =...'

- Provider Invaluement advertising does not belong in plugin description or
debug messages.

- Be consistent in what you do
  * if you alphabetize the eval functions, do it consistently and in the same
order everywhere.
  * use either \s in every regexp or use ' '.
  * escape only meta characters in regexp

- Refactor the code and use subroutines instead of copy/paste, this makes the
code more readable and maintainable, e.g.

sub check_in_esp_list {
  my ($self, $pms, $esp, $list, $cid) = @_;
  my $rulename = $pms->get_current_eval_rule_name();
  #dbg("(check_in_esp_list) esp=$esp, list=$list, cid=$cid,
rulename=$rulename");

  if (exists $self->{ESP}->{$list}->{$cid}) {
    dbg("HIT! customer id $cid found in $esp feed");
    $pms->test_log("$esp id: $cid");
    $pms->got_hit($rulename, "", ruletype => 'eval');
    return 1;
  }

  return;
}

sub set_cid_and_tag {
  my ($self, $pms, $header_name, $pattern, $tag) = @_;
  my $header = $pms->get($header_name, undef);
  #dbg("(set_cid_and_tag) tag=$tag, header_name=$header_name,
pattern=$pattern");

  # check the pattern
  my ($regexp, $err) = compile_regexp($pattern, 1);
  if (!$regexp) {
    warn "invalid regexp $regexp for header $header: $err";
    return;
  }

  # get customer id from header field
  return unless $header && $header =~ $regexp;
  my $cid = $1;

  # set tag
  $pms->set_tag($tag, $cid);

  return $cid;
}

sub esp_mailchimp_check {
  my ($self, $pms) = @_;

  # return if X-Mailer is not what we want
  my $xmailer = $pms->get("X-Mailer", undef);
  return unless $xmailer and $xmailer =~ /MailChimp\sMailer/;

  # get customer id from header field X-MC-User, set tag MAILCHIMPID
  my $cid = $self->set_cid_and_tag($pms, 'X-MC-User', '/^([0-9a-z]{25})$/',
'MAILCHIMPID');
  return unless $cid;

  # check for customer id
  return  $self->check_in_esp_list($pms, 'Mailchimp', 'MAILCHIMP', $cid);
}

Since these suggestions have no code in common with my plugin, the code has NOT
been tested!

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7989] Plugin Esp.pm: Tags do not work as described, rewrite of plugin needed

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7989

Henrik Krohns <ap...@hege.li> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #3 from Henrik Krohns <ap...@hege.li> ---
Plugin was removed from SA.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7989] Plugin Esp.pm: Tags do not work as described, rewrite of plugin needed

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7989

--- Comment #2 from Henrik Krohns <ap...@hege.li> ---
There's several regexps with extra captures that aren't used for anything,
polluting variable space

  $maildome_id =~
/subject=https:\/\/.*\/unsubscribe\/([0-9]+)\/([0-9]+)\/.*\/([0-9]+)\/([0-9]+)\>/;
  $maildome_id = $2;

  $envfrom =~ /bounce\+(\w+)\.(\w+)\-/;
  $mailgun_id = $2;

  if($xabuse =~ /Please report abuse here:
https?:\/\/.*\.musvc([0-9]+)\.net\/p\?c=([0-9]+)/) {
    $mailup_id = $2;

  if($envfrom =~ /\@(\w+\.)?([\w\.]+)\>?$/) {
    $sendgrid_domain = $2;

  if($fid =~ /(\d+):(\d+):([a-z]+)/i) {
    $mdrctr_id = $1;

Anyway.. is there really use for this plugin in SA distribution? I just saw the
discussion again a year ago on dev-list (Subject: svn commit: r1889364),
apparently I complained about it being hard to maintain as many things are
hardcoded. There's no quick way to release updated plugin to users.

How is it supposed to be used anyway? There's no information in Esp.pm itself
on where to get the files or what format they are in. There's only
https://github.com/bigio/spamassassin-esp/blob/master/contrib/esp-download.sh
that seems to download some Invaluement files, which do not even currently
contain any data!

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7989] Plugin Esp.pm: Tags do not work as described, rewrite of plugin needed

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7989

Sidney Markowitz <si...@sidney.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sidney@sidney.com

--- Comment #4 from Sidney Markowitz <si...@sidney.com> ---
The commit to remove Esp.pm did not reference this issue in the commit message.
I'm copying the log here for reference.

revision 1900942 | gbechis | 2022-05-16 19:46:47 +1200 (Mon, 16 May 2022)

Remove Esp plugin

That commit missed three more references to Esp in tests, fixed:

trunk % svn ci -m "Bug 7989 Remove three more references in tests to deleted
plugin Esp.pm"
Sending        t/all_modules.t
Sending        t/data/01_test_rules.pre
Sending        xt/20_saw_ampersand.t
Transmitting file data ...done
Committing transaction...
Committed revision 1901318.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Bug 7989] Plugin Esp.pm: Tags do not work as described, rewrite of plugin needed

Posted by bu...@spamassassin.apache.org.
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7989

Henrik Krohns <ap...@hege.li> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |apache@hege.li

--- Comment #1 from Henrik Krohns <ap...@hege.li> ---
Test is failing currently:

t/esp.t .. 1/2  Not found: Mailchimp =  MAILCHIMP_ID  at t/esp.t line 36.

I guess data/spam/esp/mailchimp.eml has wrong format:

X-MC-User: 1234

-- 
You are receiving this mail because:
You are the assignee for the bug.