Posted to users@myfaces.apache.org by P V <pe...@gmail.com> on 2008/03/11 19:51:44 UTC

UnicodeEncoder question

Hi, all,

I have a question on the "UnicodeEncoder" class implementation.
The class contains one method: "public static String encode (String
string)".

My question is about the use of StringBuilder/StringBuffer in it (the logic is
the same in both 1.1.5 and 1.2.2).
Initially we have
StringBuilder sb = null;

and, as I understand it, the builder is created lazily.
For characters with "((int)c) >= 0x80" we check whether "sb" is null, and
if it is, we create a new instance.
But for the rest of the chars (((int)c) < 0x80) we have the following logic,
which does not look correct to me:
"else if( sb != null )
{
    sb.append(c);
}"

So we append the char to the buffer/builder only if an instance of the
buffer/builder already exists.

Now imagine that we have a string with leading Latin characters. As a
result, we'll get an incorrectly encoded string.

I think I may be missing something. Could somebody please explain the method's
logic?

Thank you in advance,
Peter

Re: UnicodeEncoder question

Posted by P V <pe...@gmail.com>.
Hi, Michael,

Many thanks for the explanation. I see it now.
It was the end of the day, so sorry for the stupid question.

Peter


On Tue, Mar 11, 2008 at 10:02 PM, Michael Kurz <mi...@gmx.at> wrote:

> P V wrote:
> > Now imagine that we have a string with leading Latin characters. As a
> > result, we'll get an incorrectly encoded string.
> >
> > I think I may be missing something. Could somebody please explain the
> > method's logic?
>
> Yeah, I can. The main idea is to avoid unnecessary string building if
> the string does not contain any special characters. If you take a closer
> look, you'll see that when the string builder is created, the whole
> string read so far is appended:
>
> sb = new StringBuilder( string.length()+4 );
> sb.append( string.substring(0,i) );
>
> This is what matters for your example with leading Latin characters.
> On meeting the first non-Latin character, all Latin chars read so far are
> appended to the builder.
>
> This approach improves performance for strings that do not contain
> non-Latin characters (which normally should be a large share of them).
>
> Hope this helps!
>
> regards
> Michael
>

Re: UnicodeEncoder question

Posted by Michael Kurz <mi...@gmx.at>.
P V wrote:
> Now imagine that we have a string with leading Latin characters. As a
> result, we'll get an incorrectly encoded string.
>
> I think I may be missing something. Could somebody please explain the
> method's logic?

Yeah, I can. The main idea is to avoid unnecessary string building if
the string does not contain any special characters. If you take a closer
look, you'll see that when the string builder is created, the whole
string read so far is appended:

sb = new StringBuilder( string.length()+4 );
sb.append( string.substring(0,i) );

This is what matters for your example with leading Latin characters.
On meeting the first non-Latin character, all Latin chars read so far are
appended to the builder.

This approach improves performance for strings that do not contain
non-Latin characters (which normally should be a large share of them).
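
To make the pattern concrete, here is a minimal sketch of the lazy-builder
logic as described in this thread. It is a reconstruction from the fragments
quoted above, not a copy of the MyFaces source: the class name
LazyEncodeSketch, the null handling, and the "&#NNN;" output format are
assumptions made only for illustration.

```java
// Sketch only: reconstructed from the fragments quoted in this thread,
// not copied from the MyFaces source. The class name and the "&#NNN;"
// output format are assumptions made for illustration.
final class LazyEncodeSketch {

    static String encode(String string) {
        if (string == null) {
            return ""; // assumption: null handling is not discussed in the thread
        }
        StringBuilder sb = null; // created only when first needed
        for (int i = 0; i < string.length(); i++) {
            char c = string.charAt(i);
            if (c >= 0x80) {
                if (sb == null) {
                    // First special character: copy everything read so far,
                    // so leading plain characters are not lost.
                    sb = new StringBuilder(string.length() + 4);
                    sb.append(string.substring(0, i));
                }
                // Assumed encoding: decimal numeric character reference.
                sb.append("&#").append((int) c).append(';');
            } else if (sb != null) {
                // A builder already exists, so plain characters are copied too.
                sb.append(c);
            }
            // Otherwise no builder exists yet; the character is still covered
            // by the original string and needs no copying.
        }
        // No special characters seen: return the input itself, no copy made.
        return sb != null ? sb.toString() : string;
    }
}
```

With this sketch, encode("abc") returns the very same String object that was
passed in, while encode("abc\u00e9d") returns "abc&#233;d" under the assumed
output format: the leading "abc" is copied in by the substring call, and the
trailing "d" by the "else if" branch.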

Hope this helps!

regards
Michael