Posted to users@spamassassin.apache.org by Rich <rg...@ellerbach.com> on 2005/02/05 20:58:54 UTC

Re: salearn parsing error

>On Mon, 20 Dec 2004 13:47:25 -0500 (EST), Rich <rg...@ellerbach.com> wrote:
>
>>>On Monday, 20 December 2004 14:36, Rich wrote:
>>>
>>>>Some messages trigger the following error:
>>>>
>>>>Parsing of undecoded UTF-8 will give garbage when decoding entities at
>>>>/usr/local/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/HTML.pm line 182
>>>>
>>>>Why isn't sa-learn handling these messages correctly?
>>>>
>>>>Rich
>>>>
>>>Which version do you use?
>>>
>>3.0.1
>>
>
>For the record, I'm seeing this with 3.0.2 as well, with the package
>from Debian Sarge.
>
>Did you ever figure it out, Rich?
>
Nope, not in the sense of "why wasn't sa-learn handling this itself" 
anyway. Nor did I ever see any comment from the developers.

To my mind, if sa-learn comes across undecoded UTF-8 then it should either 
a) decode it, if SA decodes it to use the result in its scores, or b) 
shut up about it. If I had to choose, I'd prefer that SA and sa-learn 
did a).

Rich

Re: salearn parsing error

Posted by Daniel Cañas <da...@unity.ncsu.edu>.
On Feb 5, 2005, at 2:58 PM, Rich wrote:

>
>> On Mon, 20 Dec 2004 13:47:25 -0500 (EST), Rich <rg...@ellerbach.com> 
>> wrote:
>>
>>>> On Monday, 20 December 2004 14:36, Rich wrote:
>>>>
>>>>> Some messages trigger the following error:
>>>>>
>>>>> Parsing of undecoded UTF-8 will give garbage when decoding entities at
>>>>> /usr/local/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/HTML.pm line 182
>>>>>
>>>>> Why isn't sa-learn handling these messages correctly?
>>>>>
>>>>> Rich
>>>>>
>>>> Which version do you use?
>>>>
>>> 3.0.1
>>>
>>>
>>
>> For the record, I'm seeing this with 3.0.2 as well, with the package
>> from Debian Sarge.
>>
>> Did you ever figure it out, Rich?
>>
> Nope, not in the sense of "why wasn't sa-learn handling this itself" 
> anyway. Nor did I ever see any comment from the developers.
>
> To my mind, if sa-learn comes across undecoded UTF-8 then it should 
> either a) decode it, if SA decodes it to use the result in its scores, 
> or b) shut up about it. If I had to choose, I'd prefer that SA and 
> sa-learn did a).
>
> Rich
>
>
I was having the same problem when I switched from RH 7.3 to Debian Sarge:
Parsing of undecoded UTF-8 will give garbage when decoding entities at 
/usr/local/share/perl/5.8.4/Mail/SpamAssassin/HTML.pm line 182.

It is a reported bug (link below). I changed a line in HTML.pm and it 
seems to work fine now, though the change may only work with perl 5.8.

http://bugzilla.spamassassin.org/show_bug.cgi?id=4046
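
As a rough illustration of what the warning is complaining about, here is a minimal sketch in Python. It is not the Perl code in HTML.pm and not the change from the bug report, and the sample string is made up. The point is only that if character entities are decoded while the text is still undecoded UTF-8 bytes, the multi-byte characters come out as garbage, whereas decoding the UTF-8 first gives the expected result.

```python
import html

# Hypothetical message body as it arrives on the wire: UTF-8 bytes that
# also contain an HTML character entity.
raw = "caf\u00e9 &eacute;".encode("utf-8")

# Wrong order: the bytes are treated as single-byte characters (Latin-1
# here) and only then are entities decoded. The two bytes of the UTF-8
# "e with acute" become two separate junk characters.
garbled = html.unescape(raw.decode("latin-1"))

# Right order: decode the UTF-8 first, then decode the entities.
clean = html.unescape(raw.decode("utf-8"))

print(repr(garbled))
print(repr(clean))
```

In the first case the literal character is mangled while the entity decodes correctly, so the same letter ends up represented two different ways in one string; that mismatch is the kind of garbage the warning refers to.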