Posted to users@cocoon.apache.org by Jasper Michalczik <ja...@gmx.net> on 2004/05/28 22:18:46 UTC

Short Introduction to using Cocoon with non-roman languages - was: Has anyone used Cocoon for chinese language application ?

Dear Reinhard, dear Cocoon-users,

I was asked to give a short explanation on how to use Cocoon for
non-roman languages - especially Arabic - which should be of use for
Chinese as well.

I'm not very experienced with Cocoon, so please feel free to correct or
extend this.


All files have to be saved as utf-8, so make sure to add/change the
first line of your xml/xsl-files:

	<?xml version="1.0" encoding="UTF-8"?>

In sitemap.xmap I added the following to each serializer:

	<map:serializer logger=...>
		<encoding>UTF-8</encoding>
	</map:serializer>

This adds the following META-Tag to the serialized document:

	<META http-equiv="Content-Type" content="text/html; charset=UTF-8">

Then I set the following parameters in web.xml...

	<init-param>
		<param-name>container-encoding</param-name>
		<param-value>ISO-8859-1</param-value>
	</init-param>
	<init-param>
		<param-name>form-encoding</param-name>
		<param-value>UTF-8</param-value>
	</init-param>

... to make sure the forms are processed correctly.

On the client side you need at least Windows 2000 (I don't know about
Linux or Mac), with the keyboard settings set up to allow Arabic/Chinese
typing. If you only need to display non-roman characters, any system with
a browser that supports Unicode display will do. IE5+, for example,
downloads the necessary fonts automatically when needed.

I remember having some trouble with Tomcat 4.1.29, but 4.1.18 works
fine. I don't have any experience with any other version or
servlet container.


The one thing I can't explain is why the container-encoding in web.xml
has to be set to ISO-8859-1. If anybody knows about this, please add it
to this text. No other setting I tried worked.


I hope I could make a small contribution to the growing
cocoon-community...


Jasper Michalczik




-----Original Message-----
From: Reinhard Poetz [mailto:reinhard@apache.org]
Sent: Friday, 28 May 2004 14:58
To: users@cocoon.apache.org
Subject: Re: AW: Has anyone used Cocoon for chinese language application?

Jasper Michalczik wrote:

>Hello Vincent,
>
> 
>
>I developed an application with Arabic content using Cocoon. I haven't
>had any trouble so far. My XML files are stored as UTF-8, but I don't have
>any experience with UTF-16. You only need to make sure that the form
>encoding is set accordingly if you plan to use forms.
>  
>


Would you mind preparing a small sample (including instructions on what
you need on the client side)? If not, file a Bugzilla report
(http://nagoya.apache.org/bugzilla/index.html) and I'll add it to the
Cocoon samples.

-- 
Reinhard


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org




Re: Short Introduction to using Cocoon with non-roman languages - was: Has anyone used Cocoon for chinese language application ?

Posted by roy huang <li...@hotmail.com>.
I use Simplified Chinese. My configuration is almost the same, so I will only describe the differences here.
The basic idea is to use UTF-8, so:
1. (same) Set the serializer encoding to UTF-8. With ISO-8859-1 you can still see Chinese, as in the sample, but you can't get Chinese strings in client-side JavaScript.
2. (same) container-encoding stays at the default, ISO-8859-1.
3. (difference) form-encoding stays at the default, ISO-8859-1, but I use the setCharacterEncodingAction action to set the encoding to UTF-8.
The reason: setting form-encoding to UTF-8 works fine, but when you use Cocoon upload the file name comes out wrong, even if you try to re-encode it. So set form-encoding to ISO-8859-1 and use setCharacterEncodingAction for requests that don't process an upload. For uploads you can get the file name correctly with:
Part part = (Part) request.get(fileField);
String tmp = part.getFileName();
// Recover the raw bytes the container decoded as ISO-8859-1.
// Note: new String(byte[]) then decodes them with the platform default charset.
String fileName = new String(tmp.getBytes("ISO-8859-1"));
4. (other) If you access SQL data in CLOB/NCLOB columns through SQLTransformer or the database actions, both of them process CLOBs with getAsciiStream/setAsciiStream, so you may get wrong strings. Try getCharacterStream/setCharacterStream to solve this problem. See my post here:
 http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=108571178129741&w=2
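
For point 4, here is a minimal sketch of reading a CLOB through its character stream instead of its ASCII stream. It is only an illustration of the idea, not the code used by SQLTransformer or the database actions; the class name ClobText and the method readAll are made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;

final class ClobText {
    private ClobText() {}

    // Hypothetical helper: reads the whole CLOB as characters.
    // getCharacterStream() hands back chars, so no byte-to-char
    // charset guess is made here, unlike with getAsciiStream().
    static String readAll(Clob clob) throws SQLException, IOException {
        StringBuilder out = new StringBuilder();
        try (Reader reader = clob.getCharacterStream()) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                out.append(buf, 0, n);
            }
        }
        return out.toString();
    }
}
```

With a JDBC ResultSet this would be called as ClobText.readAll(rs.getClob(columnIndex)); whether the driver returns the stored characters correctly still depends on the database and driver configuration.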

I already posted the last two points to the dev mailing list and got no reply, so I am sending this mail to the dev list too this time.

Roy Huang



Re: Short Introduction to using Cocoon with non-roman languages -was: Has anyone used Cocoon for chinese language application ?

Posted by Antonio Gallardo <ag...@agssa.net>.
Bruno Dumon said:
> On Sat, 2004-05-29 at 13:30, Antonio Gallardo wrote:
>> Hi Bruno:
>>
>> Thanks for the answer.
>>
>> Currently, I have no time to test it.
>
> I understand that.
>
>> I know this is a very frequent issue
>> now that people realize the right encoding is UTF-8. Here is a link
>> from Tomcat:
>>
>> http://jakarta.apache.org/tomcat/faq/misc.html#utf8
>
> yep, but from a quick glance that information is very tomcat/jsp/servlet
> specific, i.e. the -Dfile.encoding isn't needed.

Cocoon is a servlet....

You got me! ;-)

I am writing an RT for the dev list now to solve the problem... please
answer there. :-)

Best Regards,

Antonio Gallardo




Re: Short Introduction to using Cocoon with non-roman languages -was: Has anyone used Cocoon for chinese language application ?

Posted by Bruno Dumon <br...@outerthought.org>.
On Sat, 2004-05-29 at 13:30, Antonio Gallardo wrote:
> Hi Bruno:
> 
> Thanks for the answer.
> 
> Currently, I have no time to test it.

I understand that.

> I know this is a very frequent issue
> now that people realize the right encoding is UTF-8. Here is a link from
> Tomcat:
> 
> http://jakarta.apache.org/tomcat/faq/misc.html#utf8

Yep, but from a quick glance that information is very Tomcat/JSP/servlet
specific, i.e. the -Dfile.encoding isn't needed.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org                          bruno@apache.org




Re: Short Introduction to using Cocoon with non-roman languages -was: Has anyone used Cocoon for chinese language application ?

Posted by Antonio Gallardo <ag...@agssa.net>.
Hi Bruno:

Thanks for the answer.

Currently, I have no time to test it. I know this is a very frequent issue
now that people realize the right encoding is UTF-8. Here is a link from
Tomcat:

http://jakarta.apache.org/tomcat/faq/misc.html#utf8

Best Regards,

Antonio Gallardo




Re: Short Introduction to using Cocoon with non-roman languages -was: Has anyone used Cocoon for chinese language application ?

Posted by Bruno Dumon <br...@outerthought.org>.
On Sat, 2004-05-29 at 12:26, Antonio Gallardo wrote:
> Bruno Dumon said:
> >> I only can't explain why the container-encoding in web.xml has to be set
> >> to ISO-8859-1. If anybody knows about this, please add it to this text.
> >> Any other setting I tried to use didn't work out.
> >
> > It has to be ISO-8859-1, always. This is because the servlet
> > specification requires that request parameters are by default decoded as
> > ISO-8859-1 (regardless of the default platform encoding). The only
> > reason I can imagine this is configurable at all is to work around buggy
> > servlet containers.
> >
> > More background on all this is also available at:
> >
> > http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding
> 
> I never saw the above-linked page before.

It has been there since 13/3/2003, and its URL has been posted on this list
multiple times since then.

I'd like to move (a subset of) that info into the standard Cocoon docs,
but first I'd like to see the Tomcat issue resolved.

> But for more than a year I have had
> this set in web.xml:
> 
>     <init-param>
>       <param-name>container-encoding</param-name>
>       <param-value>utf-8</param-value>
>     </init-param>
> 
>     <init-param>
>       <param-name>form-encoding</param-name>
>       <param-value>utf-8</param-value>
>     </init-param>
> 
> In the site map we are using this HTML 4.01 serializer component:
> 
> <map:serializer name="html" ....>
>   <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
>   <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
>   <encoding>ISO-8859-1</encoding>
>   <buffer-size>1024</buffer-size>
>   <omit-xml-declaration>true</omit-xml-declaration>
> </map:serializer>
> 
> With this configuration we are able to connect to a PostgreSQL database
> UTF-8 encoded.
> 
> Hope this helps.

Oops! That's quite a wrong configuration you have there. If you thought
you were using UTF-8 for the communication with your browser, then I'll
have to disappoint you. You're using ISO-8859-1. Specifying UTF-8 twice
in the web.xml is the same as specifying nothing, because the two settings
cancel each other out. The servlet container decodes the request
parameters as ISO-8859-1, and then Cocoon does this:

new String(value.getBytes("UTF-8"), "UTF-8");

which has no effect (but does burn a lot of CPU cycles; you're better
off disabling those parameters in the web.xml if you're just using
ISO-8859-1).
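
The effect can be tried out in isolation with a small sketch like the one below. It is an illustration only: the class RecodeSketch and the method recode are invented names, not Cocoon's actual code. Only the new String(value.getBytes(...), ...) expression is taken from the explanation above, and the "container" is simulated by decoding UTF-8 bytes as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RecodeSketch {
    private RecodeSketch() {}

    // Hypothetical helper: undo the container's decoding, then decode the
    // same bytes again with the encoding the form was really submitted in.
    static String recode(String value, String containerEncoding, String formEncoding) {
        byte[] raw = value.getBytes(Charset.forName(containerEncoding));
        return new String(raw, Charset.forName(formEncoding));
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        // Simulate a container that decoded UTF-8 request bytes as ISO-8859-1.
        String asSeenByServlet = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);

        // utf-8 / utf-8: the value comes back unchanged, so it is still garbled.
        System.out.println(recode(asSeenByServlet, "UTF-8", "UTF-8").equals(asSeenByServlet));
        // ISO-8859-1 / UTF-8: the original text is recovered.
        System.out.println(recode(asSeenByServlet, "ISO-8859-1", "UTF-8").equals(original));
    }
}
```

Both lines print true: the first pair of settings changes nothing, the second pair recovers the submitted text.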

Note that the encoding used to connect to your database (and how your
database stores the data internally) is a completely separate issue from
the encoding used to communicate between webserver and browser (whether
and how it needs to be configured depends on the database product).

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org                          bruno@apache.org




Re: Short Introduction to using Cocoon with non-roman languages -was: Has anyone used Cocoon for chinese language application ?

Posted by Antonio Gallardo <ag...@agssa.net>.
Bruno Dumon said:
>> I only can't explain why the container-encoding in web.xml has to be set
>> to ISO-8859-1. If anybody knows about this, please add it to this text.
>> Any other setting I tried to use didn't work out.
>
> It has to be ISO-8859-1, always. This is because the servlet
> specification requires that request parameters are by default decoded as
> ISO-8859-1 (regardless of the default platform encoding). The only
> reason I can imagine this is configurable at all is to work around buggy
> servlet containers.
>
> More background on all this is also available at:
>
> http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding

I never saw the above-linked page before. But for more than a year I have
had this set in web.xml:

    <init-param>
      <param-name>container-encoding</param-name>
      <param-value>utf-8</param-value>
    </init-param>

    <init-param>
      <param-name>form-encoding</param-name>
      <param-value>utf-8</param-value>
    </init-param>

In the site map we are using this HTML 4.01 serializer component:

<map:serializer name="html" ....>
  <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
  <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
  <encoding>ISO-8859-1</encoding>
  <buffer-size>1024</buffer-size>
  <omit-xml-declaration>true</omit-xml-declaration>
</map:serializer>

With this configuration we are able to connect to a UTF-8-encoded
PostgreSQL database.

Hope this helps.

Best Regards,

Antonio Gallardo



Re: Short Introduction to using Cocoon with non-roman languages - was: Has anyone used Cocoon for chinese language application ?

Posted by Bruno Dumon <br...@outerthought.org>.
On Fri, 2004-05-28 at 22:18, Jasper Michalczik wrote:
> Dear Reinhard, dear Cocoon-users,
> 
> I was asked to give a short explanation on how to use Cocoon for
> non-roman languages - especially Arabic - which should be of use for
> Chinese as well.
> 
> I'm not too firm in using Cocoon, so please feel free to correct or
> extend this.
> 
> 
> All files have to be saved as utf-8, so make sure to add/change the
> first line of your xml/xsl-files:
> 
> 	<?xml version="1.0" encoding="UTF-8"?>

This isn't a requirement. It can be any encoding you like, as long as it
supports the characters you need. It can be a different encoding than
the one used to send the page to the browser. UTF-8 is a good
choice though.
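
To illustrate that the source encoding and the output encoding are independent, here is a small sketch using the JDK's standard XML APIs rather than Cocoon itself. The class XmlEncodingSketch and its methods are invented for this example. It parses a document declared and stored as UTF-16 and writes it out again as UTF-8; the parser picks up the encoding from the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

final class XmlEncodingSketch {
    private XmlEncodingSketch() {}

    // Parses raw bytes; the parser reads the encoding from the XML declaration.
    static Document parse(byte[] xmlBytes) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
    }

    // Writes the document in the requested output encoding, independent of
    // whatever encoding the source bytes were stored in.
    static byte[] serialize(Document doc, String encoding) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.ENCODING, encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }
}
```

In Cocoon the output side is what the serializer's encoding setting controls; the sketch only shows that the characters survive a change of encoding between input and output.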

> In sitemap.xmap I added the following to each serializer:
> 
> 	<map:serializer logger=...>
> 		<encoding>UTF-8</encoding>
> 	</map:serializer>
> 
> This adds the following META-Tag to the serialized document:
> 
> 	<META http-equiv="Content-Type" content="text/html;
> charset=UTF-8">

Yep, but it only does so if your page already has an html/head tag in it.

> 
> Then I set the following parameters in web.xml...
> 
> 	<init-param>
> 		<param-name>container-encoding</param-name>
> 		<param-value>ISO-8859-1</param-value>
> 	</init-param>
> 	<init-param>
> 		<param-name>form-encoding</param-name>
> 		<param-value>UTF-8</param-value>
> 	</init-param>
> 
> ... to make sure the forms are processed correctly.
> 
> On the client side at least Windows 2000 (I don't know about Linux or
> Mac) must be used with the keyboard settings set up to allow
> Arabic/Chinese typing. If you only need to display non-roman characters,
> this also works with any system and a browser that supports
> Unicode-display. IE5+ for example downloads the necessary fonts
> automatically when needed.
> 
> I remember having some troubles using Tomcat 4.1.29, but 4.1.18 works
> fine.

This is because of the following issue:
http://issues.apache.org/bugzilla/show_bug.cgi?id=26997

>  I don't have any experiences with any other version or
> servlet-container.
> 
> 
> I only can't explain why the container-encoding in web.xml has to be set
> to ISO-8859-1. If anybody knows about this, please add it to this text.
> Any other setting I tried to use didn't work out.

It has to be ISO-8859-1, always. This is because the servlet
specification requires that request parameters are by default decoded as
ISO-8859-1 (regardless of the default platform encoding). The only
reason I can imagine this is configurable at all is to work around buggy
servlet containers.
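
As a side note on why the ISO-8859-1 default does no lasting harm: in Java, decoding bytes as ISO-8859-1 and encoding them back returns exactly the same bytes, so the raw request bytes can always be recovered afterwards. The sketch below checks this for all 256 byte values and shows that the same is not true for UTF-8. It demonstrates a property of the charsets only, not anything specific to Cocoon or a servlet container; the class and method names are made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LosslessLatin1 {
    private LosslessLatin1() {}

    // Every possible byte value, 0x00 through 0xFF.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    // True if decoding and re-encoding with the charset gives back the same bytes.
    static boolean roundTrips(byte[] raw, Charset charset) {
        return Arrays.equals(raw, new String(raw, charset).getBytes(charset));
    }

    public static void main(String[] args) {
        byte[] all = allByteValues();
        System.out.println(roundTrips(all, StandardCharsets.ISO_8859_1)); // true
        System.out.println(roundTrips(all, StandardCharsets.UTF_8));      // false
    }
}
```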

More background on all this is also available at:

http://wiki.cocoondev.org/Wiki.jsp?page=RequestParameterEncoding

> 
> 
> I hope I could make a small contribution to the growing
> cocoon-community...

sure!

> 
> 
> Jasper Michalczik
> 

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno@outerthought.org                          bruno@apache.org




Re: Short Introduction to using Cocoon with non-roman languages - was: Has anyone used Cocoon for chinese language application ?

Posted by Antonio Gallardo <ag...@agssa.net>.
Hi:

Many people have already reported on this list that Cocoon works for Arabic
and Chinese. The Chinese government is using Cocoon on some web sites. Search
the users mailing list; I think there is even a link to the Chinese
government pages.

The use of ISO-8859-1 in Tomcat is optional; it depends on the standard
you are using. I suggest you review the UTF-8 and UTF-16 standards.
AFAIK, Java uses UTF-16 internally, so I don't see a problem with using the
Chinese language.

Best Regards,

Antonio Gallardo

