Posted to dev@spamassassin.apache.org by bu...@issues.apache.org on 2010/05/14 13:47:48 UTC

[Bug 6435] New: locale decimal point not used by SA

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

           Summary: locale decimal point not used by SA
           Product: Spamassassin
           Version: 3.3.1
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: spamassassin
        AssignedTo: dev@spamassassin.apache.org
        ReportedBy: nico.prenzel@pn-systeme.de


Hello Devs,

My SA install, used in conjunction with MySQL for the userpref store, does not
use the locale settings. In particular, the locale's decimal point is not
honored.

My MySQL backend table contains the following entry:
 Nico Prenzel/pn-systeme    required_hits    5,4    1520110

but a simple test with my user
 /usr/bin/spamc -R -d 192.168.253.5 --username "Nico Prenzel/pn-systeme" < sample-spam.txt

results in a required score of 5.0:

1002.5/5.0
Spam detection software, running on the system "dema1m040.bb.bbmsg", has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
@@CONTACT_ADDRESS@@ for details.

Content preview:  This is the GTUBE, the Generic Test for Unsolicited Bulk
Email
   If your spam filter supports it, the GTUBE provides a test by which you can
   verify that the filter is installed correctly and is detecting incoming
spam.
   You can send yourself a test mail containing the following string of
characters
   (in upper case and with no white spaces and line breaks): [...]

Content analysis details:   (1002.5 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
1000 GTUBE                  BODY: Generic Test for Unsolicited Bulk Email
 2.5 BAYES_60               BODY: Bayes spam probability is 60 to 80%
                            [score: 0.7682]
-0.0 NO_RECEIVED            Informational: message has no Received headers





If I change my userpref to the following:
 Nico Prenzel/pn-systeme    required_hits    5.4    1520110

then my test reports a required score of 5.4:
1002.5/5.4
Spam detection software, running on the system "dema1m040.bb.bbmsg", has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
@@CONTACT_ADDRESS@@ for details.

Content preview:  This is the GTUBE, the Generic Test for Unsolicited Bulk
Email
   If your spam filter supports it, the GTUBE provides a test by which you can
   verify that the filter is installed correctly and is detecting incoming
spam.
   You can send yourself a test mail containing the following string of
characters
   (in upper case and with no white spaces and line breaks): [...]

Content analysis details:   (1002.5 points, 5.4 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
1000 GTUBE                  BODY: Generic Test for Unsolicited Bulk Email
 2.5 BAYES_60               BODY: Bayes spam probability is 60 to 80%
                            [score: 0.7682]
-0.0 NO_RECEIVED            Informational: message has no Received headers


system locale:
$ LANG=de_DE.UTF-8 locale -k LC_NUMERIC LC_MONETARY | grep decimal_point
decimal_point=","
mon_decimal_point=","

perl's locale:
#!/usr/bin/perl
use strict;
use warnings;

use POSIX qw(locale_h);

# Get a reference to a hash of locale-dependent info
my $locale_values = localeconv();

# Output sorted list of the values
for (sort keys %$locale_values) {
    printf "%-20s = %s\n", $_, $locale_values->{$_};
}

decimal_point        = .
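
The script above never calls setlocale() itself, so what localeconv() reports depends on how the running Perl initialises LC_NUMERIC. A minimal sketch for comparing the C locale with the locale taken from the environment follows. The helper names decimal_point and decimal_point_for are made up for this illustration, and the result for the environment locale depends on which locales are installed:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(locale_h);

# Return the decimal point that localeconv() reports right now.
sub decimal_point {
    my $lc = localeconv();
    return $lc->{decimal_point};
}

# Hypothetical helper: switch LC_NUMERIC to $name (an empty string means
# "take it from the environment"), read the decimal point, then restore
# the previous setting. Returns undef if the locale cannot be selected.
sub decimal_point_for {
    my ($name) = @_;
    my $previous = setlocale(LC_NUMERIC);
    my $selected = setlocale(LC_NUMERIC, $name);
    my $point    = defined $selected ? decimal_point() : undef;
    setlocale(LC_NUMERIC, $previous) if defined $previous;
    return $point;
}

my $env_point = decimal_point_for("");
printf "%-20s = %s\n", "as started", decimal_point();
printf "%-20s = %s\n", "C locale", decimal_point_for("C");
printf "%-20s = %s\n", "environment locale",
    defined $env_point ? $env_point : "(locale not available)";
```

Running it as the same user and with the same environment as spamd should show whether the comma is visible to Perl at all.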


Is this the intended behaviour, or a misconfigured perl?
Is SA always using the "minimum C locale"?


Thanks.

NicoP.

-- 
Configure bugmail: https://issues.apache.org/SpamAssassin/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.

[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

--- Comment #4 from Karsten Bräckelmann <gu...@rudersport.de> 2010-05-26 09:45:20 EDT ---
> I don't think the locale setting could influence the behaviour as the MySQL
> column is defined as text. But this depends on where the text is converted to a
> number, I think.

Ah, I thought it was a floating point number, not text.

Anyway, according to the docs, there is very little locali[sz]ation in SA (see
that section), and I don't recall any hint that required_score (and then
score...) would allow anything but C locale. So I guess this is intended
behavior (as per your original question).

Moreover, your report template is in English, so your spamd does not appear to
be running in a German locale anyway.


[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

--- Comment #3 from Nico Prenzel <ni...@pn-systeme.de> 2010-05-26 09:02:05 EDT ---
(In reply to comment #2)
> (In reply to comment #0)
> > system locale:
> > $ LANG=de_DE.UTF-8 locale -k LC_NUMERIC LC_MONETARY | grep decimal_point
> Setting LANG here might influence the result, in particular if LC_NUMERIC is
> unset.
I pasted the wrong command here.
The plain 'locale' output below is the original one.
> What does a plain 'locale' return for LC_NUMERIC, LC_ALL and LANG?
~# locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=


> Also, any chance your spamd init script sets or changes locale settings?
I searched for this but found no locale-dependent settings. I use the init
scripts provided with Debian.

> Likewise, any specific locale settings for MySQL?
I don't think the locale setting could influence the behaviour as the MySQL
column is defined as text. But this depends on where the text is converted to a
number, I think.
My SpamAssassin local.cf also contains the following SELECT statement. So I
think required_hits (the value column holds the number) is treated as text
and is then converted by Perl to a floating-point number:
user_scores_sql_custom_query     SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC
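
To illustrate the text-to-number step, here is a minimal sketch of what plain Perl does with both spellings outside the scope of "use locale". The functions looks_like_c_decimal and plain_numify are hypothetical names for this example and are not SpamAssassin's actual parser:

```perl
use strict;
use warnings;

# Hypothetical check, not SpamAssassin's real code: accept only C-locale
# style decimals such as "5", "5.4" or "-0.5".
sub looks_like_c_decimal {
    my ($text) = @_;
    return $text =~ /\A[+-]?(?:\d+(?:\.\d*)?|\.\d+)\z/ ? 1 : 0;
}

# Numify a string the way plain Perl does outside "use locale":
# parsing stops at the first character that cannot be part of a number.
sub plain_numify {
    my ($text) = @_;
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    return $text + 0;
}

for my $value ("5.4", "5,4") {
    printf "%-4s valid=%d numeric=%s\n",
        $value, looks_like_c_decimal($value), plain_numify($value);
}
```

Here "5,4" fails the check and numifies to 5. Both truncation at the comma and rejection of the value in favour of the default would match the 5.0 seen in the report. Which of the two SA actually does is not established in this thread.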

Any more hints?

Thanks
NicoP.


[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

--- Comment #1 from Nico Prenzel <ni...@pn-systeme.de> 2010-05-26 04:50:04 EDT ---
Any comments?

Perhaps this could also be addressed in 3.3.2?


NicoP.


[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

Nico Prenzel <ni...@pn-systeme.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nico.prenzel@pn-systeme.de
            Summary|locale decimal point not    |locale decimal point not
                   |used by SA                  |used by SA for
                   |                            |required_hits


[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

Karsten Bräckelmann <gu...@rudersport.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #5 from Karsten Bräckelmann <gu...@rudersport.de> 2010-05-26 09:59:36 EDT ---
Thinking about it... It would be a support nightmare to honor localized decimal
point. This would require distributing localized cf files for scores.

WONTFIX, IMHO.
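
Given that resolution, one possible workaround is to normalise values before they reach the userpref table. A minimal sketch, assuming a front end of your own writes that table; to_c_decimal is a hypothetical helper and not part of SpamAssassin:

```perl
use strict;
use warnings;

# Hypothetical helper for a front end that writes the userpref table:
# turn a user-entered decimal such as "5,4" into the C-locale form "5.4".
# Returns undef for anything that is not a simple decimal number, so
# grouped input such as "1.234,5" is rejected instead of being guessed at.
sub to_c_decimal {
    my ($input) = @_;
    return undef unless defined $input;
    (my $text = $input) =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
    return undef unless $text =~ /\A([+-]?\d+)(?:[.,](\d+))?\z/;
    return defined $2 ? "$1.$2" : $1;
}

for my $entered ("5,4", "5.4", " 7 ", "1.234,5") {
    my $stored = to_c_decimal($entered);
    printf "%-10s -> %s\n", "'$entered'", defined $stored ? $stored : "(rejected)";
}
```

Stored this way, required_hits always uses a dot, whatever locale the user interface runs in.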


[Bug 6435] locale decimal point not used by SA for required_hits

Posted by bu...@issues.apache.org.
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6435

--- Comment #2 from Karsten Bräckelmann <gu...@rudersport.de> 2010-05-26 07:27:37 EDT ---
(In reply to comment #0)
> system locale:
> $ LANG=de_DE.UTF-8 locale -k LC_NUMERIC LC_MONETARY | grep decimal_point

Setting LANG here might influence the result, in particular if LC_NUMERIC is
unset. What does a plain 'locale' return for LC_NUMERIC, LC_ALL and LANG?

Also, any chance your spamd init script sets or changes locale settings?
Likewise, any specific locale settings for MySQL?
