Posted to dev@lenya.apache.org by "Gregor J. Rothfuss" <gr...@apache.org> on 2004/05/15 17:08:03 UTC

one form editor screws encoding

from OneFormEditorSaveAction:

// Aggregate content
String encoding = parameters.getParameter("encoding");
String content = "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>\n"
    + addNamespaces(namespaces, request.getParameter("content"));
FileWriter fw = new FileWriter(file);
fw.write(content, 0, content.length());

http://java.sun.com/j2se/1.4.2/docs/api/java/io/FileWriter.html

"Convenience class for writing character files. The constructors of this 
class assume that the default character encoding and the default 
byte-buffer size are acceptable. To specify these values yourself, 
construct an OutputStreamWriter on a FileOutputStream."

this breaks roundtrips of files with umlauts. if you really want to 
hardcode ISO-8859-1 (why?), you have to set it explicitly and not rely 
on defaults. the FileWriter does not determine its encoding by looking 
at the string contents.
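
A minimal sketch of what the FileWriter documentation quoted above recommends: construct an OutputStreamWriter on a FileOutputStream and pass the encoding explicitly, so the bytes on disk match the encoding named in the XML declaration. The class and method names (ExplicitEncodingSave, save, saveAndReadBack) are made up for illustration and are not part of OneFormEditorSaveAction; try-with-resources assumes Java 7 or later rather than the 1.4.2 JDK linked above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;

// Hypothetical helper, not the actual Lenya action.
class ExplicitEncodingSave {

    // Writes declaration + body, using the SAME encoding for the
    // declaration text and for the bytes that go to disk.
    static void save(File file, String body, String encoding) {
        String content = "<?xml version=\"1.0\" encoding=\"" + encoding + "\"?>\n" + body;
        try (OutputStream out = new FileOutputStream(file);
             Writer writer = new OutputStreamWriter(out, encoding)) {
            writer.write(content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Saves to a temporary file and reads it back with the same encoding.
    static String saveAndReadBack(String body, String encoding) {
        try {
            File file = File.createTempFile("oneform", ".xml");
            file.deleteOnExit();
            save(file, body, encoding);
            return new String(Files.readAllBytes(file.toPath()), Charset.forName(encoding));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // \u00fc is the umlaut u; the escape keeps this source file ASCII-only.
        System.out.println(saveAndReadBack("<p>M\u00fcller</p>", "UTF-8"));
    }
}
```

With the encoding passed to the writer, the roundtrip no longer depends on the locale of the machine running the servlet container.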

-- 
Gregor J. Rothfuss
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://wyona.com                   http://cocoon.apache.org/lenya
gregor.rothfuss@wyona.com                       gregor@apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: lenya-dev-unsubscribe@cocoon.apache.org
For additional commands, e-mail: lenya-dev-help@cocoon.apache.org


Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
Gregor J. Rothfuss wrote:

> Michael Wechner wrote:
>
>
>> Not that I know of. Although I have just seen that egli has done a
>> patch, I don't know whether it fixed the problem, because it was
>> always working for me and it still works for me.
>
>
> it now works on two different setups. i think it is fixed. the reason 
> why you didn't see the error is that your system was set to ISO-8859-1.
>
> i would suggest setting your locale to something else to avoid these 
> assumptions in the code in the future. 


done, it's POSIX again

> in your case, the file writer used the default encoding of the system 
> which happened to be the same one as the form encoding.
>
> here is a little something on utf-8 support for vi
>
> http://mail.nl.linux.org/linux-utf8/2001-09/msg00098.html


thanks for the pointer

Michi


-- 
Michael Wechner
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com              http://cocoon.apache.org/lenya/
michael.wechner@wyona.com                        michi@apache.org




Re: one form editor screws encoding

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Michael Wechner wrote:


> Not that I know of. Although I have just seen that egli has done a
> patch, I don't know whether it fixed the problem, because it was
> always working for me and it still works for me.

it now works on two different setups. i think it is fixed. the reason 
why you didn't see the error is that your system was set to ISO-8859-1.

i would suggest setting your locale to something else to avoid these 
assumptions in the code in the future. in your case, the file writer 
used the default encoding of the system, which happened to be the same 
one as the form encoding.
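
To illustrate why the bug stayed hidden on one machine, here is a small sketch that encodes a string with one charset and decodes it with another. It stands in for a writer that uses the platform default and a reader that expects the form encoding; the class name DefaultCharsetMismatch and the method roundTrip are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from the Lenya code base.
class DefaultCharsetMismatch {

    // Encodes text with one charset and decodes the bytes with another,
    // mimicking a writer and a reader that disagree about the encoding.
    static String roundTrip(String text, Charset writeCharset, Charset readCharset) {
        byte[] onDisk = text.getBytes(writeCharset);
        return new String(onDisk, readCharset);
    }

    public static void main(String[] args) {
        String text = "M\u00fcller"; // \u00fc is the umlaut u

        // A writer built without an explicit charset falls back to this value.
        System.out.println("platform default: " + Charset.defaultCharset());

        // Writer and reader agree: the umlaut survives. Prints true.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1,
                StandardCharsets.ISO_8859_1).equals(text));

        // Writer and reader disagree: the umlaut is damaged. Prints false.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1,
                StandardCharsets.UTF_8).equals(text));
    }
}
```

When the platform default happens to equal the form encoding, only the first case is ever exercised, which is why the problem did not show up on a system set to ISO-8859-1.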

here is a little something on utf-8 support for vi

http://mail.nl.linux.org/linux-utf8/2001-09/msg00098.html



Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
Andreas Kuckartz wrote:

>> it's a blocker for 1.2.
>
> Was this bug entered into Bugzilla?

Not that I know of. Although I have just seen that egli has done a
patch, I don't know whether it fixed the problem, because it was
always working for me and it still works for me.

Michi

>Andreas




Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
Christian Egli wrote:

> afaik it is not needed anymore. The action now fetches the encoding
> from the request header and ignores this param. I'll fix it.

I have removed it ;-)

Thanks

Michi




Re: one form editor screws encoding

Posted by Christian Egli <ch...@wyona.net>.
Michael Wechner <mi...@wyona.com> writes:

> I guess the encoding line within usecase.xmap isn't necessary anymore
> 
> <map:act type="oneformeditorsave">
>    <map:parameter name="file"
> value="pubs/{../1}/work/oneformeditor/authoring/{../2}.xml"/>
>    <map:parameter name="encoding" value="ISO-8859-1"/>
> 
> or is it?

afaik it is not needed anymore. The action now fetches the encoding
from the request header and ignores this param. I'll fix it.
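
For readers wondering what "fetches the encoding from the request header" could look like, here is a rough sketch. It assumes the encoding arrives as a charset parameter on a Content-Type style header; the thread does not say which header the action actually reads, and the class RequestCharset and method charsetFrom are invented for this example.

```java
// Hypothetical helper; the real action's logic is not shown in this thread.
class RequestCharset {

    private static final String KEY = "charset=";

    // Extracts the charset parameter from a Content-Type style header value,
    // e.g. "application/x-www-form-urlencoded; charset=UTF-8".
    // Returns the fallback when the header is missing or names no charset.
    static String charsetFrom(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            // Case-insensitive match on the parameter name.
            if (param.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String value = param.substring(KEY.length()).trim();
                // Strip optional surrounding quotes: charset="UTF-8"
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // prints UTF-8
        System.out.println(charsetFrom("application/x-www-form-urlencoded; charset=UTF-8", "ISO-8859-1"));
        // prints ISO-8859-1 (no charset given, so the fallback is used)
        System.out.println(charsetFrom("text/html", "ISO-8859-1"));
    }
}
```

Clients do not always send a charset parameter, so some fallback is still needed; which value is appropriate depends on how the form page was serialized.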

-- 
Christian Egli





Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
I guess the encoding line within usecase.xmap isn't necessary anymore

<map:act type="oneformeditorsave">
   <map:parameter name="file" 
value="pubs/{../1}/work/oneformeditor/authoring/{../2}.xml"/>
   <map:parameter name="encoding" value="ISO-8859-1"/>

or is it?

Thanks

Michi



Andreas Kuckartz wrote:

>> it's a blocker for 1.2.
>
> Was this bug entered into Bugzilla?
>
>Andreas




Re: one form editor screws encoding

Posted by Andreas Kuckartz <A....@ping.de>.
> it's a blocker for 1.2.

Was this bug entered into Bugzilla?

Andreas




Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
Gregor J. Rothfuss wrote:

> Michael Wechner wrote:
>
>>> // Aggregate content
>>> String encoding = parameters.getParameter("encoding");
>>> String content = "<?xml version=\"1.0\" encoding=\"" + encoding  + 
>>> "\"?>\n" +addNamespaces(namespaces, request.getParameter("content"));
>>
>
> > it's not hardcoded, but rather depends on how the client
> > returns the data and hence configurable within the sitemap.
>
> i hope you realize that you do not set encoding in a file by writing 
> some text into a string?


yes, I know that. But as I said before, the browser returns it in ISO-8859-1
because the page is serialized as ISO-8859-1 (unless I misunderstand
something there), which is configured within the sitemap as well. Hence,
depending on the serialization configuration, one can configure the
encoding string for the output as well.

>
>>
>> If you have an alternative solution which fixes your problems you 
>> might encounter,
>> then feel free to fix it. At the time I fixed the namespace problem, 
>> this was
>> the only solution I had at hand and the umlauts worked for me (still 
>> do).
>
>
> it's a blocker for 1.2. could you do me a favor and stop with the 
> "feel free to fix" boilerplate? thanks


there's no code ownership. So if you have a patch, then go and fix it; otherwise
I guess you just have to wait until someone else takes the time to fix it.

Michi



Re: one form editor screws encoding

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.
Michael Wechner wrote:

>> // Aggregate content
>> String encoding = parameters.getParameter("encoding");
>> String content = "<?xml version=\"1.0\" encoding=\"" + encoding  + 
>> "\"?>\n" +addNamespaces(namespaces, request.getParameter("content"));

 > it's not hardcoded, but rather depends on how the client
 > returns the data and hence configurable within the sitemap.

i hope you realize that you do not set encoding in a file by writing 
some text into a string?
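
A short sketch of the point being made here: the encoding named in the XML declaration is just characters in a Java string, and the bytes that reach the file depend only on the charset used when the string is encoded. The class DeclarationIsJustText and its methods are hypothetical names used for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the Lenya code base.
class DeclarationIsJustText {

    // Builds the document the same way the action does: the declared
    // encoding is only characters inside the string.
    static String withDeclaration(String declaredEncoding, String body) {
        return "<?xml version=\"1.0\" encoding=\"" + declaredEncoding + "\"?>\n" + body;
    }

    // The bytes that end up in the file depend only on the charset handed
    // to the encoder, not on what the declaration claims.
    static byte[] encode(String document, Charset actualCharset) {
        return document.getBytes(actualCharset);
    }

    public static void main(String[] args) {
        // \u00fc is the umlaut u: one byte in ISO-8859-1, two bytes in UTF-8.
        String doc = withDeclaration("ISO-8859-1", "<p>M\u00fcller</p>");
        byte[] latin1 = encode(doc, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(doc, StandardCharsets.UTF_8);
        System.out.println("written as ISO-8859-1: " + latin1.length + " bytes");
        System.out.println("written as UTF-8:      " + utf8.length + " bytes");
    }
}
```

Both byte arrays start with the same declaration claiming ISO-8859-1, yet only one of them is actually encoded that way; a parser that trusts the declaration will misread the other.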

>> you have to set it explicitly and not rely on defaults. the filewriter 
>> does not determine its encoding by looking at the string contents..
> 
> 
> 
> If you have an alternative solution which fixes your problems you might 
> encounter,
> then feel free to fix it. At the time I fixed the namespace problem, 
> this was
> the only solution I had at hand and the umlauts worked for me (still do).

it's a blocker for 1.2. could you do me a favor and stop with the 
"feel free to fix" boilerplate? thanks




Re: one form editor screws encoding

Posted by Michael Wechner <mi...@wyona.com>.
Gregor J. Rothfuss wrote:

> from OneFormEditorSaveAction:
>
> // Aggregate content
> String encoding = parameters.getParameter("encoding");
> String content = "<?xml version=\"1.0\" encoding=\"" + encoding  + 
> "\"?>\n" +addNamespaces(namespaces, request.getParameter("content"));
> FileWriter fw = new FileWriter(file);
> fw.write(content, 0, content.length())
>
> http://java.sun.com/j2se/1.4.2/docs/api/java/io/FileWriter.html
>
> "Convenience class for writing character files. The constructors of 
> this class assume that the default character encoding and the default 
> byte-buffer size are acceptable. To specify these values yourself, 
> construct an OutputStreamWriter on a FileOutputStream."
>
> this breaks roundtrips of files with umlauts. if you really want to 
> hardcode iso-8859-1 (why?),


it's not hardcoded, but rather depends on how the client returns the data
and is hence configurable within the sitemap.

> you have to set it explicitly and not rely on defaults. the filewriter 
> does not determine its encoding by looking at the string contents..


If you have an alternative solution which fixes the problems you
encounter, then feel free to fix it. At the time I fixed the namespace
problem, this was the only solution I had at hand, and the umlauts
worked for me (still do).

Michi

