Posted to issues@commons.apache.org by "Simon Kitching (JIRA)" <ji...@apache.org> on 2008/03/15 16:24:24 UTC
[jira] Resolved: (DIGESTER-120) digesting xml content with NodeCreateRule swallows spaces.
[ https://issues.apache.org/jira/browse/DIGESTER-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Simon Kitching resolved DIGESTER-120.
-------------------------------------
Resolution: Fixed
Fix Version/s: 1.8.1
Fixed.
> digesting xml content with NodeCreateRule swallows spaces.
> ----------------------------------------------------------
>
> Key: DIGESTER-120
> URL: https://issues.apache.org/jira/browse/DIGESTER-120
> Project: Commons Digester
> Issue Type: Bug
> Affects Versions: 1.8
> Environment: jdk 1.4.2_08, digester 1.8
> Reporter: Nguyen Thanh Son Daniel
> Fix For: 1.8.1
>
> Attachments: digester-patch.txt, simple.xml
>
>
> I need to process an XML file that contains entities, e.g.:
> <?xml version="1.0" encoding="UTF-8"?>
> <top>
> <body>A A</body>
> </top>
> I'm using Digester as follows:
> Digester digester = new Digester ();
> digester.addRule ("top", new ObjectCreateRule (MyContent.class));
> digester.addRule ("top/body", new NodeCreateRule ());
> digester.addSetNext ("top/body", "setBody");
> then
> ...
> digester.parse (file);
> MyContent class transforms the node into text as follows:
> public class MyContent
> {
> public void setBody (Element node)
> {
> String content = serializeNode (node);
> System.out.println (content);
> }
> ...
> }
> In this case the content displayed is: <body>AA</body>
> If the body were encoded in the XML file as <top><body>A A</body></top>, the content would be displayed correctly as:
> <body>A A</body>
> Looking at the NodeCreateRule.NodeBuilder.characters() implementation, the following code causes the problem:
> String str = new String(ch, start, length);
> if (str.trim().length() > 0) {
>     top.appendChild(doc.createTextNode(str));
> }
> When entities are used, the characters() method is called three times in the first case, for 'A', ' ' and 'A', and the whitespace-only chunk is dropped because it trims to an empty string. In the second case it is called once with 'A A'.
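The behaviour described in the report can be reproduced without a parser by feeding the same chunks to the quoted check. The sketch below is illustrative only. It assumes that the first sample writes its space as a character reference, so that the text arrives as the three chunks 'A', ' ' and 'A'. The class and method names (WhitespaceChunkDemo, perChunk, buffered) are hypothetical and not part of Digester. The buffered variant shows one possible way to avoid the loss; it is not necessarily the change that was applied in 1.8.1.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; it only imitates the logic quoted in the report.
class WhitespaceChunkDemo {

    // Creates an empty DOM document using the JDK's default builder.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same test as the quoted code: each chunk is checked on its own,
    // so a chunk that consists only of whitespace is silently dropped.
    static String perChunk(String... chunks) {
        Document doc = newDocument();
        Element top = doc.createElement("body");
        for (String str : chunks) {
            if (str.trim().length() > 0) {
                top.appendChild(doc.createTextNode(str));
            }
        }
        return top.getTextContent();
    }

    // Assumed alternative: collect all chunks first, then apply the
    // "not pure whitespace" test once to the combined, untrimmed text.
    static String buffered(String... chunks) {
        Document doc = newDocument();
        Element top = doc.createElement("body");
        StringBuilder text = new StringBuilder();
        for (String str : chunks) {
            text.append(str);
        }
        String combined = text.toString();
        if (combined.trim().length() > 0) {
            top.appendChild(doc.createTextNode(combined));
        }
        return top.getTextContent();
    }

    public static void main(String[] args) {
        System.out.println("[" + perChunk("A", " ", "A") + "]"); // space lost
        System.out.println("[" + perChunk("A A") + "]");         // space kept
        System.out.println("[" + buffered("A", " ", "A") + "]"); // space kept
    }
}
```

A real SAX handler would also need to flush the buffer whenever a child element starts or the current element ends, which this sketch leaves out.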