Posted to dev@tika.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2011/05/17 20:21:47 UTC

[jira] [Resolved] (TIKA-651) Unescaped attribute value generated

     [ https://issues.apache.org/jira/browse/TIKA-651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved TIKA-651.
--------------------------------

    Resolution: Won't Fix
      Assignee: Jukka Zitting

Resolving as Won't Fix, since there's no obvious place where this would fit well inside Tika.

We can leave the attached code (XHTMLSerializer.java) here for people to find and use, or if someone has the energy it can be pushed to Xerces/Xalan or perhaps somewhere in Apache Commons (there's a dormant Commons XML sandbox at http://svn.apache.org/repos/asf/commons/sandbox/xml/ that I started in 2009 for things like this).

> Unescaped attribute value generated
> -----------------------------------
>
>                 Key: TIKA-651
>                 URL: https://issues.apache.org/jira/browse/TIKA-651
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>            Reporter: Raimund Merkert
>            Assignee: Jukka Zitting
>         Attachments: XHTMLSerializer.java
>
>
> I've converted a Word document that contains hyperlinks with a complex query component. The & character is not escaped, and Mozilla complains about that when I write out the XHTML via a content handler that I wrote.
> It's not clear to me whether my content handler should assume that attribute values are already escaped.
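
On the question in the description: SAX passes attribute values to a ContentHandler as parsed, unescaped character data, so a handler that writes markup generally has to escape them itself. Below is a minimal sketch of that step, assuming attributes are written inside double quotes. The class and method names (AttributeEscaper, escape, attribute) are hypothetical and are not taken from Tika or from the attached XHTMLSerializer.java.

```java
// Hypothetical helper for illustration only; not part of Tika and not
// the attached XHTMLSerializer.java.
final class AttributeEscaper {

    private AttributeEscaper() {
        // static utility, no instances
    }

    /** Escapes a raw attribute value for output inside double quotes. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    /** Formats one attribute as: space, name, ="escaped value". */
    static String attribute(String qName, String value) {
        return " " + qName + "=\"" + escape(value) + "\"";
    }
}
```

A serializing handler could call something like AttributeEscaper.attribute(atts.getQName(i), atts.getValue(i)) for each attribute in startElement(). Character content needs the same treatment for & and <, and a handler that escapes must not be fed values that were already escaped upstream, or & will come out as &amp;amp;.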

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira