Posted to users@spamassassin.apache.org by Tim Wetterek Andersson <ti...@norrkoping.se> on 2020/09/22 06:07:31 UTC

Character encoding in Report Templates

Hi!


I am very new to SpamAssassin and I cannot get the following to work:


I am trying to use the Swedish characters åäö in the report template I have set up in my SpamAssassin instances.

I have tried using the ASCII tables described at <https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced>, but the only thing that shows up is the output below.


We have a combined environment: two SMTP relays running SUSE Linux Enterprise handle outgoing mail from Exchange, and two other SUSE servers act as spam guards for Exchange.

The SpamAssassin version is 3.4.2.


All the other ordinary messages work fine with these encodings.


Excerpts of different config outputs I've tried:


Spam programvaran /p(?:\xfc|\xc3\xa5)/ "xxxx"


Spam programvaran p(?:\xfc|\xc3\xa5) "xxxx"



Excerpt of how the mail looks without the configs I tried:

Content type: Unchecked
Internal reference code for the message is xxxx

First upstream SMTP client IP address: xxxxx
Received from: xxxxxx

Return-Path: <xxxx>
From: xxxx <xxxx>
Message-ID:
xxxx

The message WILL BE relayed to:
<xxxxx>

Spam scanner report:
Spam programvaran på "xxxx"
har identifierat detta meddelande som potentiellt SPAM(Skräppost). Orginalmeddelandet
finns bifogat som en bilaga så att du i din e-post klient kan markera detta som skräppost om du vill.
Om du har frågor ring till Servicedesk xxxx eller e-posta

xxxx .

Förhandsgranskning av innehåll:  xxxx

Information om innehållsanalys:   (7.9 points, 5.0 required)

Regel                       Beskrivning
---- ---------------------- --------------------------------------------------
xxxx



Kind regards


Tim Wetterek Andersson

Technical Lead
Server Operations, Digitalization Department
Norrköpings kommun

Phone: +4611156418
tim.andersson@norrkoping.se
www.norrkoping.se <http://www.norrkoping.se/>


All email sent to Norrköpings kommun is a public record.
Norrköpings kommun processes your personal data in accordance with the General Data Protection Regulation (GDPR).
For more information, see http://www.norrkoping.se/dataskyddsforordningen---gdpr.html

Character encoding in Report Templates

Posted by Tim Wetterek Andersson <ti...@norrkoping.se>.
Hi again!

Thanks for all the answers!

I have tried both "report_charset iso-8859-1" and "report_charset utf-8", but I get the same results as before.
What am I missing?

Kind regards

Tim Wetterek Andersson

-----Original message-----
From: RW <rw...@googlemail.com>
Sent: 23 September 2020 02:16
To: users@spamassassin.apache.org
Subject: Re: Character encoding in Report Templates

On Tue, 22 Sep 2020 10:50:58 -0700 (PDT) John Hardin wrote:


>
> Did you try just pasting in the proper accented text verbatim?
> Explicit hex values shouldn't be needed. See the report lines of this
> for example:
>
> https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TranslateFrench

I think this is probably the most important thing:

 report_charset CHARSET        (default: unset)
           Set the MIME Content-Type charset used for the text/plain
           report which is attached to spam mail messages.

The wiki has this example:

  lang fr report_charset iso-8859-1

But this form doesn't seem useful unless there are traditional unix users with individual locales.

Re: Character encoding in Report Templates

Posted by RW <rw...@googlemail.com>.
On Tue, 22 Sep 2020 10:50:58 -0700 (PDT)
John Hardin wrote:


> 
> Did you try just pasting in the proper accented text verbatim?
> Explicit hex values shouldn't be needed. See the report lines of this
> for example:
> 
> https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TranslateFrench

I think this is probably the most important thing:

 report_charset CHARSET        (default: unset)
           Set the MIME Content-Type charset used for the text/plain
           report which is attached to spam mail messages.

The wiki has this example:

  lang fr report_charset iso-8859-1

But this form doesn't seem useful unless there are traditional unix
users with individual locales.
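
To illustrate why the declared charset matters, here is a minimal Python sketch (not SpamAssassin code) of what a mail client does with the report bytes. It assumes the template text reaches the client unchanged; the sample string and the helper name `render` are illustrative only.

```python
# Illustrative only: how a mail client decodes the report body according
# to the charset declared in the MIME Content-Type header.

def render(raw: bytes, declared_charset: str) -> str:
    """Decode raw body bytes the way a client would for the declared charset."""
    return raw.decode(declared_charset, errors="replace")

# Assumption: the template was saved as UTF-8, so "å" is the two bytes C3 A5.
utf8_bytes = "Spam programvaran på".encode("utf-8")

print(render(utf8_bytes, "utf-8"))        # charset matches the bytes
print(render(utf8_bytes, "iso-8859-1"))   # mismatch: each byte shown separately

# The reverse mismatch: Latin-1 bytes (å = E5) declared as UTF-8.
latin1_bytes = "Spam programvaran på".encode("iso-8859-1")
print(render(latin1_bytes, "utf-8"))      # invalid sequence becomes U+FFFD
```

If the report shows two odd characters where one was expected, or a replacement character, the bytes in the template and the declared charset probably disagree.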

Re: SV: Character encoding in Report Templates

Posted by John Hardin <jh...@impsec.org>.
On Thu, 15 Oct 2020, Tim Wetterek Andersson wrote:

> I am sorry for not really understanding, but do you mean I should
> try using something like "lang sv ...." for the whole report template?

Please keep the conversation on the users list so others can potentially 
benefit in the future.

That might be a solution, but it presumes your language is set to Swedish, 
rather than being set to English and just overriding all the default 
English messages with Swedish text...

All I can say is: try it and see if it works.

Create a 30_text_sv.cf file in your local configuration folder, and put in 
stuff that's similar to one of the existing translation files. These are 
what we already have in the base code:

https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_de.cf
https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_fr.cf
https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_it.cf
https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_nl.cf
https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_pl.cf
https://svn.apache.org/viewvc/spamassassin/trunk/rules/30_text_pt_br.cf

(Click the "(view)" link near the top to see the file itself, click the 
"(download)" link to download a copy to edit.)

Don't worry about all the rule description translations unless you want 
to. Focus on the report template translations first.

I don't have a suggestion for the proper encoding to use. But with the 
proper encoding set, you should be able to just paste in the appropriate 
text without having to fiddle around with escapes.
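
One way to rule out a mismatch is to generate the file so that the bytes on disk and the declared charset agree by construction. The following is a minimal Python sketch, assuming the `lang sv` prefix and the `report_charset`, `clear_report_template` and `report` directives behave as they do in the existing 30_text_*.cf files. The report wording, the temporary output directory and the helper name `write_sv_template` are placeholders, and this has not been tested against SpamAssassin 3.4.2.

```python
import tempfile
from pathlib import Path

# Assumption: "utf-8" would work equally well, as long as both uses agree.
CHARSET = "iso-8859-1"

# Placeholder report text; replace with the real template wording.
REPORT_LINES = [
    'Spam programvaran på "xxxx"',
    "har identifierat detta meddelande som potentiellt SPAM (Skräppost).",
]


def write_sv_template(directory: Path, prefix: str = "lang sv ") -> Path:
    """Write a 30_text_sv.cf whose bytes match the charset it declares."""
    lines = [
        f"{prefix}report_charset {CHARSET}",
        f"{prefix}clear_report_template",
    ]
    lines += [f"{prefix}report {text}" for text in REPORT_LINES]
    path = directory / "30_text_sv.cf"
    # Encode explicitly so an editor or locale default cannot change the bytes.
    path.write_bytes(("\n".join(lines) + "\n").encode(CHARSET))
    return path


if __name__ == "__main__":
    # A temporary directory stands in for the local configuration folder.
    with tempfile.TemporaryDirectory() as tmp:
        cf = write_sv_template(Path(tmp))
        print(cf.read_bytes().decode(CHARSET))
```

Passing `prefix=""` would emit the same directives without the `lang sv` condition, which may be the better fit if the system language is set to English.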

Let us know how that goes!


A general note to the list: more translations would be gratefully 
welcomed, if anyone wants to provide languages that we don't 
currently have.


>
> Thanks
>
> Tim Wetterek Andersson
> Digitalization Department, Norrköpings kommun
> Phone/SMS: +46725935115
>
> -----Original message-----
> From: John Hardin <jh...@impsec.org>
> Sent: 22 September 2020 19:51
> To: users@spamassassin.apache.org
> Subject: Re: Character encoding in Report Templates
>
> On Tue, 22 Sep 2020, Tim Wetterek Andersson wrote:
>
>> I am very new to SpamAssassin and I cannot get the following to work:
>>
>> I am trying to use the Swedish characters åäö in the report template I have set up in my SpamAssassin instances.
>
> FYI, QP encodes that as: the Swedish characters =E5=E4=F6
>
>> I have tried using the ASCII-tables described here
>> <https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced>
>> but the only thing that shows is the following below.
>
> That's specific to writing the regular expressions used in rules. Report template text is just... text.
>
>> Excerpts of different config outputs I've tried:
>>
>> Spam programvaran /p(?:\xfc|\xc3\xa5)/ "xxxx"
>>
>> Spam programvaran p(?:\xfc|\xc3\xa5) "xxxx"
>
> I note that none of the characters there are \xe5, \xe4 or \xf6.
>
>> Excerpt of how the mail looks without the configs I tried:
>>
>> Spam programvaran på "xxxx"
>
> FYI, QP encodes that as: Spam programvaran p=C3=A5 "xxxx"
>
> ...so the second half of what you used (\xc3\xa5) *is* being emitted in the report.
>
> Try:
>
>   Spam programvaran p\xe5 "xxxx"
>
>
> Did you try just pasting in the proper accented text verbatim? Explicit hex values shouldn't be needed. See the report lines of this for example:
>
> https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TranslateFrench
>

-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79

Re: Character encoding in Report Templates

Posted by John Hardin <jh...@impsec.org>.
On Tue, 22 Sep 2020, Tim Wetterek Andersson wrote:

> I am very new to SpamAssassin and I cannot get the following to work:
>
> I am trying to use the Swedish characters åäö in the report template I have set up in my SpamAssassin instances.

FYI, QP encodes that as: the Swedish characters =E5=E4=F6

> I have tried using the ASCII-tables described here 
> <https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced>
> but the only thing that shows is the following below.

That's specific to writing the regular expressions used in rules. Report 
template text is just... text.

> Excerpts of different config outputs I've tried:
>
> Spam programvaran /p(?:\xfc|\xc3\xa5)/ "xxxx"
>
> Spam programvaran p(?:\xfc|\xc3\xa5) "xxxx"

I note that none of the characters there are \xe5, \xe4 or \xf6.
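
For reference, the byte values involved can be checked with a few lines of Python. This is only a sketch using the standard codecs; nothing in it is SpamAssassin-specific.

```python
# Which bytes represent å, ä and ö in the two charsets discussed here?
for ch in "åäö":
    latin1 = " ".join(f"{b:02x}" for b in ch.encode("iso-8859-1"))
    utf8 = " ".join(f"{b:02x}" for b in ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}  iso-8859-1: {latin1}  utf-8: {utf8}")

# What do the bytes from the attempted escapes stand for?
tried_single = b"\xfc".decode("iso-8859-1")   # the \xfc alternative
tried_pair = b"\xc3\xa5".decode("utf-8")      # the \xc3\xa5 alternative
print(ascii(tried_single), ascii(tried_pair))
```

In ISO-8859-1 each of the three letters is a single byte, while in UTF-8 each is a two-byte sequence starting with C3.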

> Excerpt of how the mail looks without the configs I tried:
>
> Spam programvaran på "xxxx"

FYI, QP encodes that as: Spam programvaran p=C3=A5 "xxxx"

...so the second half of what you used (\xc3\xa5) *is* being emitted in 
the report.
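
The quoted-printable forms mentioned above can be reproduced with Python's standard quopri module. A small sketch follows; the sample strings are taken from this thread, and the variable names are illustrative.

```python
import quopri

# Latin-1 bytes: one byte per character.
latin1_qp = quopri.encodestring("åäö".encode("iso-8859-1"))
print(latin1_qp.decode("ascii"))

# UTF-8 bytes: å becomes two bytes, as seen in the emitted report.
utf8_qp = quopri.encodestring('Spam programvaran på "xxxx"'.encode("utf-8"))
print(utf8_qp.decode("ascii"))
```

Comparing the QP form of a received report against these two patterns is a quick way to tell which encoding the template bytes were actually in.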

Try:

   Spam programvaran p\xe5 "xxxx"


Did you try just pasting in the proper accented text verbatim? Explicit 
hex values shouldn't be needed. See the report lines of this for example:

https://cwiki.apache.org/confluence/display/SPAMASSASSIN/TranslateFrench



-- 
  John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
  jhardin@impsec.org                         pgpk -a jhardin@impsec.org
  key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79