Posted to dev@forrest.apache.org by "Karl Wright (Created) (JIRA)" <ji...@apache.org> on 2012/01/17 09:24:39 UTC

[jira] [Created] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML
----------------------------------------------------------------------------------------------------------------------------------

                 Key: FOR-1231
                 URL: https://issues.apache.org/jira/browse/FOR-1231
             Project: Forrest
          Issue Type: Bug
          Components: Internationalisation (i18n)
    Affects Versions: 0.9, 0.10-dev
            Reporter: Karl Wright
            Priority: Critical


We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the html images do not load properly in a browser, even though the browser correctly presumes the page is utf-8.  It looks like many characters are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Hitoshi Ozawa (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188130#comment-13188130 ] 

Hitoshi Ozawa commented on FOR-1231:
------------------------------------

While we are at it, I would appreciate it if Japanese fonts could be installed as well, so that PDFs containing Japanese render correctly.
                

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Hitoshi Ozawa (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188245#comment-13188245 ] 

Hitoshi Ozawa commented on FOR-1231:
------------------------------------

Sorry David, I thought the HTML pages were being generated dynamically on the Apache server.
It seems they are not. "forrest site" works fine on my Japanese OS.

Karl, is your system set up to use en_US.UTF-8?
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8
                

[jira] [Updated] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated FOR-1231:
-----------------------------

    Attachment: FOR-1231.patch

This patch works, at least as far as generating Japanese correctly on an en_US Windows machine.

                

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188380#comment-13188380 ] 

Karl Wright commented on FOR-1231:
----------------------------------

I figured it out. What we need to do is set the Java default encoding to UTF-8. The easy way to do this on Windows is:

set JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8

 ... or on Linux: 

export JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8

Doing this before invoking Forrest causes every JVM it starts to use the right encoding. (It's Cocoon that seems to be broken, by the way.)
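
A minimal sketch of the same idea for a POSIX shell, scoping the setting to a single command rather than the whole session. The helper name with_utf8_jvm is hypothetical and not part of Forrest; "forrest site" is the command mentioned elsewhere in this thread. Note that this sketch replaces any JAVA_TOOL_OPTIONS value already set, for that one command only.

```shell
# Hypothetical helper (not part of Forrest): run one command with
# JAVA_TOOL_OPTIONS set only for that command, so the calling shell's
# environment is left untouched.
with_utf8_jvm() {
  # env(1) sets the variable for the child process only.
  env JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8 "$@"
}

# Example use (assumes forrest is on the PATH):
#   with_utf8_jvm forrest site
```

A HotSpot JVM normally prints a "Picked up JAVA_TOOL_OPTIONS: ..." line on startup, which is a quick way to confirm that the setting reached the JVMs Forrest launches.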
                

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "David Crossley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188150#comment-13188150 ] 

David Crossley commented on FOR-1231:
-------------------------------------

Please ask about separate usage issues on the user mailing list.

The PDF fonts are configurable. See that plugin's docs:
http://forrest.apache.org/docs/plugins/org.apache.forrest.plugin.output.pdf/
                

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188216#comment-13188216 ] 

Karl Wright commented on FOR-1231:
----------------------------------

I'm told that the Japanese portion of the site is correctly generated on a system that has a default locale of ja_JP.  Obviously, though, this is not a good solution to the problem since we cannot select different locales when there is more than one language involved.

                

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "David Crossley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190838#comment-13190838 ] 

David Crossley commented on FOR-1231:
-------------------------------------

Thanks. I was thinking of a similar patch. However, I wondered whether it would need to append this setting to any existing JAVA_TOOL_OPTIONS and then reset it on finish.

I have applied your patch as-is. Thanks.
If someone thinks that it needs more, then please go ahead.

Regarding the Cocoon situation, I think that the doc comments refer to the fact that Cocoon/Forrest have many supporting products handling various parts of the system. Perhaps some of those treat the encoding differently. So this environment setting seems a good solution.
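
The append-then-reset idea could be sketched roughly as follows for a POSIX shell. This is not the applied patch; the function name run_with_appended_encoding and the underscore-prefixed variables are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper (not the applied patch): append the encoding flag to
# whatever JAVA_TOOL_OPTIONS already holds, run the given command, then
# restore the original state.
run_with_appended_encoding() {
  # Remember whether the variable was set at all, and its old value.
  if [ "${JAVA_TOOL_OPTIONS+set}" = "set" ]; then
    _had_opts=1
    _old_opts=$JAVA_TOOL_OPTIONS
  else
    _had_opts=0
  fi

  # Append rather than overwrite, so existing options survive.
  JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-Dfile.encoding=UTF8"
  export JAVA_TOOL_OPTIONS

  # Run the wrapped command and keep its exit status.
  "$@"
  _status=$?

  # Reset at finish: put back the old value, or unset if there was none.
  if [ "$_had_opts" -eq 1 ]; then
    JAVA_TOOL_OPTIONS=$_old_opts
  else
    unset JAVA_TOOL_OPTIONS
  fi
  return "$_status"
}

# Example use (assumes forrest is on the PATH):
#   run_with_appended_encoding forrest site
```

One limitation of this sketch: a variable that was set but not exported beforehand ends up exported after the call.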
                

[jira] [Updated] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated FOR-1231:
-----------------------------

    Description: 
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the generated html content does not load properly in a browser, even though the browser correctly divines that the HTML page has utf-8 encoding.  It looks like many utf-8 characters in the source XML are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and built and used that but there has been no improvement.


  was:
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the generated html content does not load properly in a browser, even though the browser correctly divines that the HTML page has utf-8 encoding.  It looks like many utf-8 characters in the source XML are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


    

[jira] [Commented] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188259#comment-13188259 ] 

Karl Wright commented on FOR-1231:
----------------------------------

bq. Karl, is your system set up to use en_US.UTF-8?
bq. export LC_ALL=en_US.UTF-8
bq. export LANG=en_US.UTF-8
bq. export LANGUAGE=en_US.UTF-8 

I set the equivalent Windows variables, but there was no change in the generated code for me.  So it must be something else.

                

[jira] [Updated] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated FOR-1231:
-----------------------------

    Description: 
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the generated html content does not load properly in a browser, even though the browser correctly divines that the HTML page has utf-8 encoding.  It looks like many utf-8 characters in the source XML are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


  was:
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the generated html content does not load properly in a browser, even though the browser correctly divines that the HTML page has utf-8 encoding.  It looks like many utf-8 characters are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


    

[jira] [Updated] (FOR-1231) Forrest does not deal properly with UTF-8 .xml content, even with the proper XML content-type header, and generates corrupted HTML

Posted by "Karl Wright (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/FOR-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated FOR-1231:
-----------------------------

    Description: 
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the generated html content does not load properly in a browser, even though the browser correctly divines that the HTML page has utf-8 encoding.  It looks like many utf-8 characters are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


  was:
We're using Forrest to generate the Apache ManifoldCF site.  We've added Japanese content.  The content worked fine via localhost:8888, but the html images do not load properly in a browser, even though the browser correctly presumes the page is utf-8.  It looks like many characters are handled correctly but some are corrupted.  I've also tried the fix in FORREST-668 but this does not help.  See http://incubator.apache.org/connectors and click on the tab in Japanese to see what I mean.  The current source for the site can be found in: https://svn.apache.org/repos/asf/incubator/lcf/trunk/site.

I checked out latest Forrest trunk and build that but there is no improvement.


    