Posted to users@tapestry.apache.org by Dmitriy Kiriy <dk...@oilspace.com> on 2003/04/17 16:10:02 UTC

RE: Problems with Russian characters

Why don't you want to use UTF-8?
It's very easy.

>>>-----Original Message-----
>>>From: Vladimir [mailto:vlad@profitsoft.com.ua]
>>>Sent: Thursday, April 17, 2003 8:55 PM
>>>To: tapestry-user@jakarta.apache.org
>>>Subject: Problems with Russian characters
>>>
>>>
>>>Hello, Tapestry developers and users!
>>>We intend to build a large web application and are considering which
>>>framework to use. We like the idea of Tapestry, but we have some
>>>questions for you before using it.
>>>
>>>I) The web application should support three languages: English, German,
>>>and Russian. We see that you have excellent solutions for localization
>>>and internationalization, but we have problems with character encodings.
>>>We don't want to use UTF-8. We want to use mostly windows-1251 (the
>>>Cyrillic encoding for Windows) or KOI8-R for Russian-speaking users, and
>>>iso-8859-2 for German-speaking users, but a Russian-speaking user should
>>>be able to input German symbols (a-umlaut, o-umlaut, u-umlaut, sz).
>>>I suppose the encoding should be determined by the current engine's
>>>locale, so the solution might be easy:
>>>a) Replace the application servlet with our own subclass, something
>>>like this:
>>>
>>>public class MyApplicationServlet extends org.apache.tapestry.ApplicationServlet {
>>>  protected void doService(HttpServletRequest request, HttpServletResponse response)
>>>      throws java.io.IOException, javax.servlet.ServletException {
>>>    org.apache.tapestry.request.RequestContext context = null;
>>>    context = createRequestContext(request, response);
>>>    org.apache.tapestry.IEngine engine = getEngine(context);
>>>    if (engine != null) {
>>>      java.lang.String l = engine.getLocale().getLanguage();
>>>      java.lang.String charset = null;
>>>      if ("ru".equals(l))
>>>        charset = "windows-1251";
>>>      else if ("de".equals(l))
>>>        charset = "iso-8859-2";
>>>      else
>>>        charset = "iso-8859-1";
>>>      request.setCharacterEncoding(charset);
>>>    }
>>>    super.doService(request, response);
>>>  }
>>>}
>>>
>>>b) The pages should implement getResponseWriter(java.io.OutputStream out),
>>>something like this:
>>>
>>>public class Test extends BasePage {
>>>
>>>  public IMarkupWriter getResponseWriter(OutputStream out) {
>>>    java.lang.StringBuffer sb = new java.lang.StringBuffer("text/html; charset=");
>>>    java.lang.String l = getEngine().getLocale().getLanguage();
>>>    java.lang.String charset = null;
>>>    if ("ru".equals(l))
>>>      charset = "windows-1251";
>>>    else if ("de".equals(l))
>>>      charset = "iso-8859-2";
>>>    else
>>>      charset = "iso-8859-1";
>>>    sb.append(charset);
>>>    return new HTMLWriter(sb.toString(), out);
>>>  }
>>>.......
>>>}
>>>
>>>So we won't have problems displaying templates written in
>>>windows-1251 or KOI8-R, but we have the following problems:
>>>a) All symbols input by the user with a character code above 128 are
>>>converted to UTF-8 codes.
>>>b) It improperly reads Russian properties files (resources) written in
>>>windows-1251 and also converts them to UTF-8 codes.
>>>c) When a user with the Russian locale inputs German symbols, IE sends
>>>them as &auml; (&amp;auml; if you read this message in an HTML browser),
>>>&ouml; (&amp;ouml;) and so on, but the framework (it seems to me,
>>>org.apache.tapestry.html.HTMLWriter) replaces & with &amp; (&amp;amp;)...
>>>
>>>Could you tell us how to avoid these problems?
>>>
>>>II) Could you tell us where we can find information on your next
>>>version: schedule, fixes, new features?
>>>
>>>III) I have version 2.4-alpha-5, and the Inspector didn't work in this
>>>version. It took me a long time to get it running, and it is still
>>>unstable. Could you tell me whether this is typical or whether I am the
>>>only one with this problem, and are there any updates?
>>>
>>>Thank you,
>>>Vladimir Khalyavin, ProfITsoft Ltd.
>>>
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: tapestry-user-unsubscribe@jakarta.apache.org
>>>For additional commands, e-mail: tapestry-user-help@jakarta.apache.org
>>>
>>>
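
On problem (b) above: java.util.Properties.load(InputStream) always decodes
its input as ISO-8859-1, so a resource file saved in windows-1251 is misread
unless its non-Latin-1 characters are first converted to \uXXXX escapes. The
sketch below shows the difference using only the JDK. It is an illustration
of the likely cause, not a Tapestry fix: nothing in this thread says whether
Tapestry's resource loading can be pointed at a Reader. The class name and
the "greeting" key are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Properties;

class PropertiesEncodingDemo {
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Bytes of a one-line properties file whose value is the Cyrillic
    // letter U+0432, saved as windows-1251 (hypothetical sample data).
    public static byte[] sampleFile() {
        return "greeting=\u0432\n".getBytes(CP1251);
    }

    // load(InputStream) treats every byte as an ISO-8859-1 character.
    public static String loadFromStream(byte[] file) {
        try {
            Properties p = new Properties();
            p.load(new ByteArrayInputStream(file));
            return p.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // load(Reader) lets the caller choose the charset explicitly.
    public static String loadFromReader(byte[] file) {
        try {
            Properties p = new Properties();
            p.load(new InputStreamReader(new ByteArrayInputStream(file), CP1251));
            return p.getProperty("greeting");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] file = sampleFile();
        // Print code points so the result does not depend on the console encoding.
        System.out.println((int) loadFromStream(file).charAt(0)); // 226  (U+00E2, wrong)
        System.out.println((int) loadFromReader(file).charAt(0)); // 1074 (U+0432, intended)
    }
}
```

Note that Properties.load(Reader) only exists from Java 6 onward, so on older
JDKs the \uXXXX escape route is the one that applies.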


RE: Problems with Russian characters

Posted by "Howard M. Lewis Ship" <hl...@attbi.com>.
There may be a latent bug in AbstractMarkupWriter, since it doesn't handle
Unicode characters above 128 properly.  I'm doing some research on what's the
correct thing for Tapestry to be doing.  The Servlet API makes this hard,
since you need to know the right character encoding before you can read any
parameters.  Hunter's servlets book has a few directions for us to follow.
There may be a configuration parameter to control this stuff, so you would
put in your app spec the right encodings for your pages.
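
To illustrate why the encoding has to be known before any parameter is read,
here is a minimal sketch that uses only the JDK, not the Servlet API. It
decodes the same percent-encoded form value under two charsets and gets two
different characters. The class name and the sample value are made up for
the example; a servlet container makes the equivalent choice when it first
parses the parameters, which is why ServletRequest.setCharacterEncoding()
has to be called before the first parameter is read.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;

class FormDecodingDemo {
    // Decode a percent-encoded form value with the given charset.
    // URLDecoder.decode(String, Charset) requires Java 10 or later.
    public static String decode(String raw, String charsetName) {
        return URLDecoder.decode(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // One submitted byte, 0xE2: the Cyrillic letter U+0432 in windows-1251.
        String raw = "%E2";
        String asCyrillic = decode(raw, "windows-1251");
        String asLatin1 = decode(raw, "ISO-8859-1");
        // Print code points rather than glyphs to stay independent of the console encoding.
        System.out.println((int) asCyrillic.charAt(0)); // 1074 (U+0432)
        System.out.println((int) asLatin1.charAt(0));   // 226  (U+00E2)
    }
}
```

The raw bytes carry no charset label of their own, so whichever charset is in
effect when they are first decoded determines the result.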

--
Howard M. Lewis Ship
Creator, Tapestry: Java Web Components
http://jakarta.apache.org/tapestry





Re: Problems with Russian characters

Posted by Vladimir <vl...@profitsoft.com.ua>.
Excuse me, we don't use UTF-8 because we store all our files in different
character encodings (windows-1251 and iso-8859-2), and managing files in
UTF-8 would create extra problems for our designers. Moreover, every Russian
character input by a user will take about 7 bytes in the resulting HTML,
for example, &#1074; (&amp;#1074;).
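
The size figures can be checked with a few lines of JDK code. The sketch
below measures the same Cyrillic letter (U+0432, decimal 1074) in three
forms: as a numeric character reference, as raw UTF-8, and as raw
windows-1251. The class and method names are made up for the example, and it
says nothing about which form a given Tapestry writer actually emits.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedSizeDemo {
    // Number of bytes the text occupies in the given charset.
    public static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Build a decimal numeric character reference such as "&#1074;".
    public static String numericReference(char c) {
        return "&#" + (int) c + ";";
    }

    public static void main(String[] args) {
        char ve = '\u0432'; // Cyrillic small letter ve
        String reference = numericReference(ve);
        System.out.println(byteLength(reference, StandardCharsets.US_ASCII));                  // 7
        System.out.println(byteLength(String.valueOf(ve), StandardCharsets.UTF_8));            // 2
        System.out.println(byteLength(String.valueOf(ve), Charset.forName("windows-1251")));   // 1
    }
}
```

So the 7-byte cost belongs to the character-reference form; the letter written
directly takes 2 bytes in UTF-8 and 1 byte in windows-1251.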
