Posted to dev@mina.apache.org by "Zhang JinYan (JIRA)" <ji...@apache.org> on 2013/01/24 08:19:12 UTC

[jira] [Updated] (VYSPER-338) XMLParser.unescape throw exception

     [ https://issues.apache.org/jira/browse/VYSPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhang JinYan updated VYSPER-338:
--------------------------------

    Description: 
If a message stanza contains the text "&#x8FBD;&#x5B81;" (escaped before sending as "&amp;#x8FBD;&amp;#x5B81;"),
the following exception is thrown:

Caused by: org.xml.sax.SAXParseException: For input string: "8FBD;&#x5B81"
	at org.apache.vysper.xml.sax.impl.XMLParser.fatalError(XMLParser.java:499)
	at org.apache.vysper.xml.sax.impl.XMLParser.parse(XMLParser.java:124)
	at org.apache.vysper.xml.sax.impl.DefaultNonBlockingXMLReader.parse(DefaultNonBlockingXMLReader.java:185)
	at org.apache.vysper.xml.decoder.XMPPDecoder.doDecode(XMPPDecoder.java:117)

The cause is in String org.apache.vysper.xml.sax.impl.XMLParser.unescape(String s):

    private String unescape(String s) {
        s = s.replace("&amp;", "&").replace("&gt;", ">").replace("&lt;", "<").replace("&apos;", "'").replace("&quot;",
                "\"");

        StringBuffer sb = new StringBuffer();

        Matcher matcher = UNESCAPE_UNICODE_PATTERN.matcher(s);
        int end = 0;
        while (matcher.find()) {
            boolean isHex = matcher.group(1).equals("x");
            String unicodeCode = matcher.group(2);

            int base = isHex ? 16 : 10;
            int i = Integer.valueOf(unicodeCode, base).intValue();
            char[] c = Character.toChars(i);
            sb.append(s.substring(end, matcher.start()));
            end = matcher.end();
            sb.append(c);
        }
        sb.append(s.substring(end, s.length()));

        return sb.toString();
    }

Replacing the XML predefined entities before unescaping the numeric character references changes the meaning of the escaped string.
For example:
Input:  "&amp;#x8FBD;&amp;#x5B81;"
After replacement: "&#x8FBD;&#x5B81;"

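unescape then applies the regex Pattern.compile("\\&\\#(x?)(.+);").
Because (.+) is greedy, the match runs to the last ";" in the string:
group(1) = x
group(2) = 8FBD;&#x5B81

Integer.valueOf(unicodeCode, base) then throws a NumberFormatException for that input.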
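
The two steps described above can be reproduced in isolation. The following is a minimal sketch, not the Vysper source: the class name GreedyMatchDemo and the helper firstCode are hypothetical, and only the regex and the "&amp;" replacement are taken from the report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone reproduction; not part of Vysper.
class GreedyMatchDemo {
    // The pattern quoted in the report: (.+) is greedy.
    static final Pattern GREEDY = Pattern.compile("\\&\\#(x?)(.+);");

    // Returns group(2) of the first match, or null if nothing matches.
    static String firstCode(String s) {
        Matcher m = GREEDY.matcher(s);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Step 1: the predefined entity is replaced first, as in unescape().
        String replaced = "&amp;#x8FBD;&amp;#x5B81;".replace("&amp;", "&");
        System.out.println(replaced); // &#x8FBD;&#x5B81;

        // Step 2: the greedy group swallows everything up to the last ';'.
        String code = firstCode(replaced);
        System.out.println(code); // 8FBD;&#x5B81

        // Step 3: parsing that text as a hexadecimal number fails.
        try {
            Integer.valueOf(code, 16);
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

The same failure occurs for any text that contains two or more numeric character references, whether or not "&amp;" was involved, because the greedy group is not limited to a single reference.
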
I have fixed this bug; see the attached patch.
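
The patch itself is not reproduced in this message. As an illustration only, here is a minimal sketch of one possible fix, under the assumption that named entities and numeric character references are decoded in a single left-to-right pass, so that the "&" produced from "&amp;" is never scanned again. The class name SinglePassUnescape, the REFERENCE pattern, and the helper named are hypothetical and are not taken from the Vysper code or from the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a single-pass unescape; not the actual patch.
class SinglePassUnescape {
    // One reference per match: hex digits, decimal digits, or a predefined
    // entity name. The digit classes stop the match at the first ';'.
    private static final Pattern REFERENCE =
            Pattern.compile("&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|(amp|lt|gt|apos|quot));");

    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        Matcher m = REFERENCE.matcher(s);
        int end = 0;
        while (m.find()) {
            sb.append(s, end, m.start()); // copy the text before this reference
            if (m.group(1) != null) {
                // May still throw for out-of-range values; a real parser
                // would report that as a well-formedness error.
                sb.appendCodePoint(Integer.parseInt(m.group(1), 16));
            } else if (m.group(2) != null) {
                sb.appendCodePoint(Integer.parseInt(m.group(2), 10));
            } else {
                sb.append(named(m.group(3)));
            }
            end = m.end();
        }
        sb.append(s, end, s.length()); // copy the remaining tail
        return sb.toString();
    }

    // Maps a predefined entity name to its character.
    private static char named(String name) {
        if (name.equals("amp")) return '&';
        if (name.equals("lt")) return '<';
        if (name.equals("gt")) return '>';
        if (name.equals("apos")) return '\'';
        return '"'; // "quot" is the only remaining alternative
    }

    public static void main(String[] args) {
        // The escaped text from the report decodes to the literal characters
        // the sender typed, with no second round of decoding.
        System.out.println(unescape("&amp;#x8FBD;&amp;#x5B81;")); // &#x8FBD;&#x5B81;
        // Two adjacent references decode to two characters without an exception.
        System.out.println(unescape("&#x8FBD;&#x5B81;").length()); // 2
    }
}
```

In this sketch, text that does not match the pattern is copied through unchanged, which keeps the example short. Whether a stray "&" should instead be rejected is a separate decision for the parser.
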


  was:
If a message stanza contains the text "&#x8FBD;&#x5B81;", the following exception is thrown:

Caused by: org.xml.sax.SAXParseException: For input string: "8FBD;&#x5B81"
	at org.apache.vysper.xml.sax.impl.XMLParser.fatalError(XMLParser.java:499)
	at org.apache.vysper.xml.sax.impl.XMLParser.parse(XMLParser.java:124)
	at org.apache.vysper.xml.sax.impl.DefaultNonBlockingXMLReader.parse(DefaultNonBlockingXMLReader.java:185)
	at org.apache.vysper.xml.decoder.XMPPDecoder.doDecode(XMPPDecoder.java:117)

The cause is in String org.apache.vysper.xml.sax.impl.XMLParser.unescape(String s)

    
> XMLParser.unescape throw exception
> ----------------------------------
>
>                 Key: VYSPER-338
>                 URL: https://issues.apache.org/jira/browse/VYSPER-338
>             Project: VYSPER
>          Issue Type: Bug
>          Components: core protocol
>    Affects Versions: 0.7
>         Environment: java1.6, windows7
>            Reporter: Zhang JinYan
