Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/12/13 20:43:06 UTC

[GitHub] [commons-text] ifly6 commented on a change in pull request #191: Windows-1252 encoding for HTML numeric entities

ifly6 commented on a change in pull request #191:
URL: https://github.com/apache/commons-text/pull/191#discussion_r542001395



##########
File path: src/main/java/org/apache/commons/text/translate/NumericEntityUnescaper.java
##########
@@ -143,8 +160,33 @@ public int translate(final CharSequence input, final int index, final Writer out
                 final char[] chrs = Character.toChars(entityValue);
                 out.write(chrs[0]);
                 out.write(chrs[1]);
+
+            } else if (128 <= entityValue && entityValue <= 159  // must be within the cp-1252 extension range
+                    && !isHex  // must be a NUMERIC entity, not hex entity (see StringEscapeUtilsTest for hex)
+                    && !INVALID_CP1252_POINTS.contains(entityValue)  // must not be an invalid code point for cp-1252
+            ) {
+                System.err.println(entityValue);

Review comment:
       Oh, whoops, that debug `System.err.println` should have been removed!
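
For illustration only, here is a minimal standalone sketch of the range check from the hunk above with the debug output dropped. It is not the PR's actual code: the class name `Cp1252EntitySketch` and the `decode` helper are hypothetical, and the contents of `INVALID_CP1252_POINTS` are an assumption based on the five byte values that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Cp1252EntitySketch {
    // Assumption: the byte values Windows-1252 leaves undefined
    // (0x81, 0x8D, 0x8F, 0x90, 0x9D), written in decimal.
    static final Set<Integer> INVALID_CP1252_POINTS =
            new HashSet<>(Arrays.asList(129, 141, 143, 144, 157));

    static final Charset CP1252 = Charset.forName("windows-1252");

    // Hypothetical helper: turns a parsed entity value into text.
    static String decode(final int entityValue, final boolean isHex) {
        if (128 <= entityValue && entityValue <= 159   // cp-1252 extension range
                && !isHex                              // decimal entities only
                && !INVALID_CP1252_POINTS.contains(entityValue)) {
            // Reinterpret the value as a single Windows-1252 byte.
            return new String(new byte[] {(byte) entityValue}, CP1252);
        }
        // Otherwise treat the value as a Unicode code point.
        return new String(Character.toChars(entityValue));
    }
}
```

Under these assumptions, `decode(128, false)` yields the euro sign (U+20AC), while `decode(128, true)` and `decode(129, false)` fall through to the plain code-point path. Nothing is written to `System.err`.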




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org