Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/12/13 20:39:30 UTC

[GitHub] [commons-text] kinow commented on a change in pull request #191: Windows-1252 encoding for HTML numeric entities

kinow commented on a change in pull request #191:
URL: https://github.com/apache/commons-text/pull/191#discussion_r542000693



##########
File path: src/main/java/org/apache/commons/text/translate/NumericEntityUnescaper.java
##########
@@ -143,8 +160,33 @@ public int translate(final CharSequence input, final int index, final Writer out
                 final char[] chrs = Character.toChars(entityValue);
                 out.write(chrs[0]);
                 out.write(chrs[1]);
+
+            } else if (128 <= entityValue && entityValue <= 159  // must be within the cp-1252 extension range
+                    && !isHex  // must be a NUMERIC entity, not hex entity (see StringEscapeUtilsTest for hex)
+                    && !INVALID_CP1252_POINTS.contains(entityValue)  // must not be an invalid code point for cp-1252
+            ) {
+                System.err.println(entityValue);

Review comment:
       We probably don't want a `System.err.println` call in a method that users invoke, as it may clutter their logs. This looks like leftover debug output.
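
       For illustration, here is a minimal sketch of how the new branch could resolve the code point without printing anything. The class `Cp1252EntitySketch` and the method `cp1252ToUnicode` are hypothetical names and are not part of this PR. The lookup table is an assumption standing in for whatever mapping and `INVALID_CP1252_POINTS` check the PR actually uses.

```java
/** Hypothetical helper, not part of the PR: maps decimal entities 128..159 via cp-1252. */
final class Cp1252EntitySketch {

    // Unicode code points for cp-1252 bytes 0x80..0x9F.
    // 0 marks the five bytes cp-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    private static final int[] CP1252_EXTENSION = {
        0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
        0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
    };

    private Cp1252EntitySketch() {
    }

    /**
     * Returns the Unicode code point for a decimal entity in the cp-1252
     * extension range, or -1 if the branch does not apply. Writes nothing
     * to System.err or System.out.
     */
    static int cp1252ToUnicode(final int entityValue, final boolean isHex) {
        // Same guard as the diff: decimal entities in 128..159 only.
        if (isHex || entityValue < 128 || entityValue > 159) {
            return -1;
        }
        final int mapped = CP1252_EXTENSION[entityValue - 128];
        // Undefined cp-1252 points fall through to the caller's default handling.
        return mapped == 0 ? -1 : mapped;
    }
}
```

       With this sketch, `cp1252ToUnicode(128, false)` yields 0x20AC (the euro sign), while 129 or any hex entity yields -1, so the caller can fall back to its existing behaviour silently.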



