Posted to dev@mina.apache.org by "Zhang JinYan (JIRA)" <ji...@apache.org> on 2013/01/24 08:19:12 UTC
[jira] [Updated] (VYSPER-338) XMLParser.unescape throw exception
[ https://issues.apache.org/jira/browse/VYSPER-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhang JinYan updated VYSPER-338:
--------------------------------
Description:
If a message stanza contains the text "辽宁" (escaped before sending as "&#x8FBD;&#x5B81;"), an exception is thrown:
Caused by: org.xml.sax.SAXParseException: For input string: "8FBD;&#x5B81"
at org.apache.vysper.xml.sax.impl.XMLParser.fatalError(XMLParser.java:499)
at org.apache.vysper.xml.sax.impl.XMLParser.parse(XMLParser.java:124)
at org.apache.vysper.xml.sax.impl.DefaultNonBlockingXMLReader.parse(DefaultNonBlockingXMLReader.java:185)
at org.apache.vysper.xml.decoder.XMPPDecoder.doDecode(XMPPDecoder.java:117)
The cause is in org.apache.vysper.xml.sax.impl.XMLParser.unescape(String s):
    private String unescape(String s) {
        s = s.replace("&amp;", "&").replace("&gt;", ">").replace("&lt;", "<").replace("&apos;", "'").replace("&quot;", "\"");
        StringBuffer sb = new StringBuffer();
        Matcher matcher = UNESCAPE_UNICODE_PATTERN.matcher(s);
        int end = 0;
        while (matcher.find()) {
            boolean isHex = matcher.group(1).equals("x");
            String unicodeCode = matcher.group(2);
            int base = isHex ? 16 : 10;
            int i = Integer.valueOf(unicodeCode, base).intValue();
            char[] c = Character.toChars(i);
            sb.append(s.substring(end, matcher.start()));
            end = matcher.end();
            sb.append(c);
        }
        sb.append(s.substring(end, s.length()));
        return sb.toString();
    }
Replacing the predefined XML entities before unescaping the character references changes the meaning of the escaped string.
For example:
Input: "&amp;#x8FBD;&amp;#x5B81;"
After replace: "&#x8FBD;&#x5B81;"
unescape then applies the regex Pattern.compile("\\&\\#(x?)(.+);");
Because (.+) is greedy, it runs to the last ";" and the match is:
group(1) = x
group(2) = 8FBD;&#x5B81
Integer.valueOf(unicodeCode, base) then throws a NumberFormatException.
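The match described above can be reproduced in isolation. This is a minimal sketch that assumes the string reaching the regex is "&#x8FBD;&#x5B81;". The class GreedyMatchDemo, the helper firstCode, and the BOUNDED pattern are illustrative names and are not part of Vysper.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyMatchDemo {
    // Pattern quoted in the report: (.+) is greedy and extends to the last ';'.
    static final Pattern GREEDY = Pattern.compile("\\&\\#(x?)(.+);");
    // Hypothetical tighter pattern: the digit group cannot contain ';'.
    static final Pattern BOUNDED = Pattern.compile("&#(x?)([0-9a-fA-F]+);");

    // Returns group(2) of the first match, or null when nothing matches.
    static String firstCode(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String afterReplace = "&#x8FBD;&#x5B81;";
        System.out.println(firstCode(GREEDY, afterReplace));  // prints 8FBD;&#x5B81
        System.out.println(firstCode(BOUNDED, afterReplace)); // prints 8FBD
        try {
            // The greedy capture is not a valid hexadecimal number.
            Integer.valueOf(firstCode(GREEDY, afterReplace), 16);
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

The bounded pattern only shows that the capture can be limited to one reference; it does not by itself address the order in which entities and character references are decoded.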
I fixed this bug, see the patch.
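For illustration only, and not the patch referred to above: one possible approach is to decode predefined entities and numeric character references in a single pass, so that each "&" in the input is interpreted exactly once. The class UnescapeSketch and the REFERENCE pattern are hypothetical names. Handling of malformed or out-of-range references is left out of this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapeSketch {
    // One alternation covers hex references, decimal references and the five
    // predefined entities. Digit groups cannot run past the first ';'.
    private static final Pattern REFERENCE =
            Pattern.compile("&(?:#x([0-9a-fA-F]+)|#([0-9]+)|(amp|lt|gt|apos|quot));");

    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        Matcher m = REFERENCE.matcher(s);
        int end = 0;
        while (m.find()) {
            // Copy the literal text between the previous match and this one.
            sb.append(s, end, m.start());
            if (m.group(1) != null) {
                sb.append(Character.toChars(Integer.parseInt(m.group(1), 16)));
            } else if (m.group(2) != null) {
                sb.append(Character.toChars(Integer.parseInt(m.group(2), 10)));
            } else {
                sb.append(named(m.group(3)));
            }
            end = m.end();
        }
        sb.append(s, end, s.length());
        return sb.toString();
    }

    // Maps a predefined entity name to its character.
    private static char named(String name) {
        if (name.equals("amp")) return '&';
        if (name.equals("lt")) return '<';
        if (name.equals("gt")) return '>';
        if (name.equals("apos")) return '\'';
        return '"'; // "quot" is the only remaining alternative in REFERENCE
    }

    public static void main(String[] args) {
        // Escaped ampersands stay literal text after a single decoding pass.
        System.out.println(unescape("&amp;#x8FBD;&amp;#x5B81;")); // prints &#x8FBD;&#x5B81;
        // Genuine character references are decoded one at a time.
        System.out.println(unescape("&#x8FBD;&#x5B81;").equals("\u8FBD\u5B81")); // prints true
    }
}
```

Because the output of one replacement is never scanned again, text such as "&amp;#x8FBD;" is not reinterpreted as a character reference.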
was:
If a message stanza contains the text "辽宁", an exception is thrown:
Caused by: org.xml.sax.SAXParseException: For input string: "8FBD;&#x5B81"
at org.apache.vysper.xml.sax.impl.XMLParser.fatalError(XMLParser.java:499)
at org.apache.vysper.xml.sax.impl.XMLParser.parse(XMLParser.java:124)
at org.apache.vysper.xml.sax.impl.DefaultNonBlockingXMLReader.parse(DefaultNonBlockingXMLReader.java:185)
at org.apache.vysper.xml.decoder.XMPPDecoder.doDecode(XMPPDecoder.java:117)
The cause is in org.apache.vysper.xml.sax.impl.XMLParser.unescape(String s)
> XMLParser.unescape throw exception
> ----------------------------------
>
> Key: VYSPER-338
> URL: https://issues.apache.org/jira/browse/VYSPER-338
> Project: VYSPER
> Issue Type: Bug
> Components: core protocol
> Affects Versions: 0.7
> Environment: java1.6, windows7
> Reporter: Zhang JinYan
--
This message is automatically generated by JIRA.