Posted to dev@xalan.apache.org by "Brian Minchau (JIRA)" <xa...@xml.apache.org> on 2006/12/12 03:26:22 UTC

[jira] Assigned: (XALANJ-2352) Redirect Extension produces invalid UTF8 from a valid UTF8 stream

     [ http://issues.apache.org/jira/browse/XALANJ-2352?page=all ]

Brian Minchau reassigned XALANJ-2352:
-------------------------------------

    Assignee: Yash Talwar

Assigning to Yash Talwar, who agreed to look at this one at
the Xalan-J JIRA triage on December 11, 2006.

> Redirect Extension produces invalid UTF8 from a valid UTF8 stream
> -----------------------------------------------------------------
>
>                 Key: XALANJ-2352
>                 URL: http://issues.apache.org/jira/browse/XALANJ-2352
>             Project: XalanJ2
>          Issue Type: Bug
>          Components: Xalan-extensions
>    Affects Versions: 2.4, 2.6, The Latest Development Code
>         Environment: Mac OSX 10.4.x , Java 1.5
>            Reporter: Ian Boston
>         Assigned To: Yash Talwar
>         Attachments: test.xml, testout.xml, testredirectoutput.xml, WriteOutput.xsl
>
>
> java -cp ~/.m2/repository/xalan/xalan/2.6.0/xalan-2.6.0. org.apache.xalan.xslt.Process -XML -IN test.xml -XSL WriteOutput.xsl -OUT testout.xml 
> test.xml contains some UTF-8 characters (raw, not &# entities).
> WriteOutput.xsl performs a transform with the redirect extension; the test.xml input is copied to testout.xml.
> test.xml and testout.xml are valid XML (checked with xmllint).
> testredirectoutput.xml has broken UTF-8 encoding.
> Command-line output from my box is below.
> Files are attached.
> Any ideas? I've looked at the source in SVN and can't see anything that will fix it, and the serializer there is the same for both output streams.
> auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout test.xml 
> auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout testout.xml 
> auto9:~/Caret/darwin/letters/xdocgen ieb$ xmllint --noout testredirectoutput.xml 
> testredirectoutput.xml:9: parser error : Input is not proper UTF-8, indicate encoding !
> Bytes: 0xCA 0x5D 0x5B 0xD4
>   [?][?][?][?][?]
>       ^
> auto9:~/Caret/darwin/letters/xdocgen ieb$ 
> auto9:~/Caret/darwin/letters/xdocgen ieb$ more test.xml 
> <?xml version="1.0" encoding="UTF-8"?>
> <documents xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
> <links>
> <link>letters/339/letter1814.xml</link>
> <output:write-file file="testredirectoutput.xml">
>  <document>
>  <source type="letter" lognum="1814" calendarnum="339"/>
>  <header>
>     <title>Here is a Valud UTF Char[<E2><80><82>]</title>
>  </header>
> <body>
>   <section>
>   [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
>   </section>
> </body>
> </document>
> </output:write-file>
> <source>
>  <document>
>  <source type="letter" lognum="1814" calendarnum="339"/>
>  <header>
>     <title>Here is a Valud UTF Char[<E2><80><82>]</title>
>  </header>
> <body>
>   <section>
>   [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
>   </section>
> </body>
> </document>
> </source>
> </links>
> </documents>
> auto9:~/Caret/darwin/letters/xdocgen ieb$ 
> auto9:~/Caret/darwin/letters/xdocgen ieb$ more testout.xml 
> <?xml version="1.0" encoding="UTF-8"?><directoutput xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output" xmlns:xalan="http://xml.apache.org/xalan" xmlns:redirect="http://xml.apache.org/xalan/redirect">
> letters/339/letter1814.xml
> <redirect:write file="testredirectoutput.xml">
>  <document>
>  <source type="letter" lognum="1814" calendarnum="339"/>
>  <header>
>     <title>Here is a Valud UTF Char[<E2><80><82>]</title>
>  </header>
> <body>
>   <section>
>   [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
>   </section>
> </body>
> </document>
> </redirect:write>
> <source>
>  <document>
>  <source type="letter" lognum="1814" calendarnum="339"/>
>  <header>
>     <title>Here is a Valud UTF Char[<E2><80><82>]</title>
>  </header>
> <body>
>   <section>
>   [<E2><80><82>][<C2><A0>][<E2><80><98>][<E2><80><99>][<E2><80><93>]
>   </section>
> </body>
> </document>
> </source>
> </directoutput>
> auto9:~/Caret/darwin/letters/xdocgen ieb$ 
> auto9:~/Caret/darwin/letters/xdocgen ieb$ more testredirectoutput.xml 
> <?xml version="1.0" encoding="UTF-8"?>
>  <document xmlns:output="http://www.caret.cam.ac.uk/2006/TR/output">
>  <source type="letter" lognum="1814" calendarnum="339"/>
>  <header>
>     <title>Here is a Valud UTF Char[?]</title>
>  </header>
> <body>
>   <section>
>   [?][<CA>][<D4>][<D5>][<D0>]
>   </section>
> </body>
> </document>
> auto9:~/Caret/darwin/letters/xdocgen ieb$ 
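
A note on the byte values above. The bytes in testredirectoutput.xml (?, CA, D4, D5, D0) look like what you would get if the five characters in the <section> element were encoded as MacRoman rather than UTF-8. That would fit the redirected file being written with the platform default encoding, but this is only a guess from the bytes, not something the report confirms. The small Python sketch below checks the byte arithmetic only; it does not touch Xalan, and the names SAMPLE, encode_as and is_valid_utf8 are made up for illustration.

```python
# Illustrative check only: compares UTF-8 and MacRoman encodings of the
# five characters from the <section> element of test.xml.
# SAMPLE, encode_as and is_valid_utf8 are hypothetical names, not Xalan APIs.

# U+2002 en space, U+00A0 no-break space, U+2018 / U+2019 single quotes,
# U+2013 en dash (decoded from the UTF-8 byte dumps in the report).
SAMPLE = "\u2002\u00a0\u2018\u2019\u2013"


def encode_as(text, codec):
    """Encode text, substituting '?' for characters the codec cannot represent."""
    return text.encode(codec, errors="replace")


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def hex_dump(data):
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join("%02X" % b for b in data)


utf8_bytes = encode_as(SAMPLE, "utf-8")
macroman_bytes = encode_as(SAMPLE, "mac_roman")

print(hex_dump(utf8_bytes))           # E2 80 82 C2 A0 E2 80 98 E2 80 99 E2 80 93
print(hex_dump(macroman_bytes))       # 3F CA D4 D5 D0
print(is_valid_utf8(macroman_bytes))  # False
```

The first line matches the byte dumps shown for test.xml and testout.xml; the second matches the bytes shown for testredirectoutput.xml, with 3F ('?') standing in for the en space, which MacRoman cannot represent. If the guess is right, one thing to check is which encoding the redirect output actually uses when the file is opened.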
