Posted to dev@roller.apache.org by Anil Gangolli <an...@busybuddha.org> on 2006/05/14 04:34:53 UTC

UTF-8 charset identifier

The charset identifier for UTF-8 is defined by the IANA as "UTF-8"; case is supposed to be insensitive.
There are no aliases defined.  In particular "utf8" is not officially recognized, though some software does implement that identifier as an alias.

In order to ensure maximum compatibility, I think we are best off using the official forms "UTF-8" or "utf-8" (not "UTF8" or "utf8").

I fixed the cases of this that I could find.  They were mostly in Atom-related code.
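
To illustrate the point about aliases, here is a minimal Java sketch (not the actual Roller change; the class and method names are made up for this example). It assumes the JDK's own alias handling, which accepts "utf8" and reports the canonical name "UTF-8", so that the official form is what gets written into headers or XML declarations.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of Roller: maps whatever charset label
// the code was given to the canonical name the JDK reports for it.
final class CharsetNames {

    // Returns the canonical charset name for a label such as "utf8" or "utf-8".
    // Charset.forName throws an unchecked exception for unknown or illegal labels.
    static String canonical(String label) {
        return Charset.forName(label.trim()).name();
    }

    public static void main(String[] args) {
        // The JDK treats "utf8" as an alias, but other software may not,
        // so emit the canonical form rather than the label as typed.
        for (String label : new String[] {"UTF-8", "utf-8", "UTF8", "utf8"}) {
            System.out.println(label + " -> " + canonical(label));
        }
    }
}
```

The design choice is simply to normalize at the point of output, so a stray "utf8" in a config file or constant never reaches a consumer that only knows the registered name.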

--a.

Re: UTF-8 charset identifier

Posted by Brian Blakeley <we...@labourunions.com>.
Hi Anil,

Here is the section out of my weblog template:

<div class="item">
  #showNewsfeed("http://www.chenuke.com/rss/news2.php" false 5 false)
</div>


To my knowledge everything is OK through my systems with utf-8 at the
source (www.chenuke.com) and at cheblogs.com/

Thanks for any time you might have for this minor issue.


I tried to get the Planet code going for me, but in the end found that the
PlanetPlanet code was exactly what I was looking for and worked
immediately out of the box. So, I disabled the Planet stuff on Che Blogs.


Brian



On Sun, 2006-05-14 at 09:27 -0700, Anil Gangolli wrote:
> There were earlier issues with the newsfeed cache (NewsfeedCache.java) with 
> encodings that were fixed with ROL-766 in 2.0 and 1.3.
> 
> The fix for ROL-766 introduced an instance of this separate bug (ROL-1132) 
> which I just fixed now.
> 
> Since this was done based on code inspection rather than symptom reports and 
> verification, I don't know all of the actual symptoms fixed by these 
> changes.  However, I suspect the issue you are seeing might NOT be the same.
> 
> Please send me a link to the feed URL you are using and I can do some local 
> testing.  If it's a different issue, the fix won't make 2.3.
> 
> --a.
> 
> ps. Be aware that #showNewsfeed() was deprecated after the Planet 
> aggregation functionality came in.
> 
> 
> ----- Original Message ----- 
> From: "Brian Blakeley" <we...@labourunions.com>
> To: "RollerDev" <ro...@incubator.apache.org>
> Sent: Saturday, May 13, 2006 11:44 PM
> Subject: Re: UTF-8 charset identifier
> 
> 
> >
> > I have been having trouble with #showNewsfeed() messing up the utf-8
> > characters when displaying the feed.  I wonder if this is the problem?
> >
> > Check the right margin of http://www.cheblogs.com/roller/page/bblakeley
> > to see what I mean with the French newsfeed titles.
> >
> > Brian
> >
> >
> > On Sat, 2006-05-13 at 19:34 -0700, Anil Gangolli wrote:
> >> The charset identifier for UTF-8 is defined by the IANA as "UTF-8"; case 
> >> is supposed to be insensitive.
> >> There are no aliases defined.  In particular "utf8" is not officially 
> >> recognized, though some software does implement that identifier as an 
> >> alias.
> >>
> >> In order to ensure maximum compatibility, I think we are best off using 
> >> the official forms "UTF-8" or "utf-8" (not "UTF8" or "utf8").
> >>
> >> I fixed the cases of this that I could find.  They were mostly in 
> >> Atom-related code.
> >>
> >> --a.
> > 
> 


Re: UTF-8 charset identifier

Posted by Anil Gangolli <an...@busybuddha.org>.
There were earlier issues with the newsfeed cache (NewsfeedCache.java) with 
encodings that were fixed with ROL-766 in 2.0 and 1.3.

The fix for ROL-766 introduced an instance of this separate bug (ROL-1132) 
which I just fixed now.

Since this was done based on code inspection rather than symptom reports and 
verification, I don't know all of the actual symptoms fixed by these 
changes.  However, I suspect the issue you are seeing might NOT be the same.

Please send me a link to the feed URL you are using and I can do some local 
testing.  If it's a different issue, the fix won't make 2.3.

--a.

ps. Be aware that #showNewsfeed() was deprecated after the Planet 
aggregation functionality came in.


----- Original Message ----- 
From: "Brian Blakeley" <we...@labourunions.com>
To: "RollerDev" <ro...@incubator.apache.org>
Sent: Saturday, May 13, 2006 11:44 PM
Subject: Re: UTF-8 charset identifier


>
> I have been having trouble with #showNewsfeed() messing up the utf-8
> characters when displaying the feed.  I wonder if this is the problem?
>
> Check the right margin of http://www.cheblogs.com/roller/page/bblakeley
> to see what I mean with the French newsfeed titles.
>
> Brian
>
>
> On Sat, 2006-05-13 at 19:34 -0700, Anil Gangolli wrote:
>> The charset identifier for UTF-8 is defined by the IANA as "UTF-8"; case 
>> is supposed to be insensitive.
>> There are no aliases defined.  In particular "utf8" is not officially 
>> recognized, though some software does implement that identifier as an 
>> alias.
>>
>> In order to ensure maximum compatibility, I think we are best off using 
>> the official forms "UTF-8" or "utf-8" (not "UTF8" or "utf8").
>>
>> I fixed the cases of this that I could find.  They were mostly in 
>> Atom-related code.
>>
>> --a.
> 


Re: UTF-8 charset identifier

Posted by Brian Blakeley <we...@labourunions.com>.
I have been having trouble with #showNewsfeed() messing up the utf-8
characters when displaying the feed.  I wonder if this is the problem?

Check the right margin of http://www.cheblogs.com/roller/page/bblakeley
to see what I mean with the French newsfeed titles.
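
For anyone unfamiliar with what this kind of garbling usually looks like, here is a small Java sketch. It is only an illustration of one common cause (UTF-8 bytes decoded as ISO-8859-1), not a diagnosis of what #showNewsfeed() is doing; the class, method names and sample title are invented for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo, not Roller code: shows how accented characters are
// garbled when UTF-8 bytes are decoded with the wrong charset.
final class MojibakeDemo {

    // Encodes text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: recover the original bytes, then decode as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String title = "Actualit\u00e9s";      // sample French title with an e-acute
        String garbled = misdecode(title);     // the e-acute becomes two characters
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(title)); // prints "true"
    }
}
```

If the titles in the margin show two odd characters where one accented letter should be, a mismatch of this sort somewhere between fetching and rendering the feed is a plausible suspect.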

Brian


On Sat, 2006-05-13 at 19:34 -0700, Anil Gangolli wrote:
> The charset identifier for UTF-8 is defined by the IANA as "UTF-8"; case is supposed to be insensitive.
> There are no aliases defined.  In particular "utf8" is not officially recognized, though some software does implement that identifier as an alias.
> 
> In order to ensure maximum compatibility, I think we are best off using the official forms "UTF-8" or "utf-8" (not "UTF8" or "utf8").
> 
> I fixed the cases of this that I could find.  They were mostly in Atom-related code.
> 
> --a.