Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2022/04/27 11:52:54 UTC

[GitHub] [commons-text] kinow commented on pull request #310: TEXT-215: Prevent decimal numeric entities from wrongly including hexadecimal characters

kinow commented on PR #310:
URL: https://github.com/apache/commons-text/pull/310#issuecomment-1110909559

   > https://www.w3.org/TR/REC-xml/#dt-charref
   > 
   > Why are illegal entities allowed in the first place? Am I reading the specification incorrectly? The ';' character should be required. IMO this feature creep on our end feels improper and should not be allowed or at the very least deprecated.
   
   Good point. I haven't checked any specification yet, but this:
   
   ```
   # File: test.html
   <iframe src="&#106avascript:alert(1)">
   ```
   
   Or this:
   
   ```
   # File: test.html
   <iframe src="&#106;avascript:alert(1)">
   ```
   
   Both trigger an alert (tested by serving the file with `python3 -m http.server` and visiting <http://localhost:8000/test.html>). I think the JIRA issue mentions how browsers handle this payload, so I suspect users might expect Commons Text to translate it the same way (I'm not saying whether that behaviour is correct, or whether we should do it :+1:, just FWIW).
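   
   To make the lenient behaviour concrete, here is a minimal sketch of browser-like decoding for decimal references only. It is not the Commons Text implementation: the class `DecimalEntitySketch` and the method `unescapeDecimal` are hypothetical names made up for illustration. It assumes the `;` is optional and that only the digits `0`-`9` belong to a decimal reference, so the `a` in `&#106avascript` is not swallowed.
   
   ```java
   /** Hypothetical sketch for illustration; NOT the Commons Text implementation. */
   final class DecimalEntitySketch {

       /** Replaces decimal references such as "&#106;" or "&#106" with the character they name. */
       static String unescapeDecimal(final String input) {
           final StringBuilder out = new StringBuilder(input.length());
           int i = 0;
           while (i < input.length()) {
               // A decimal reference starts with "&#" followed by at least one digit.
               if (input.startsWith("&#", i)) {
                   int end = i + 2;
                   // Consume only 0-9, so a following hex letter (a-f) is never included.
                   while (end < input.length()
                           && input.charAt(end) >= '0' && input.charAt(end) <= '9') {
                       end++;
                   }
                   if (end > i + 2) {
                       try {
                           final int codePoint = Integer.parseInt(input.substring(i + 2, end));
                           if (Character.isValidCodePoint(codePoint)) {
                               out.appendCodePoint(codePoint);
                               // Assumption: the terminating ';' is optional (lenient mode).
                               if (end < input.length() && input.charAt(end) == ';') {
                                   end++;
                               }
                               i = end;
                               continue;
                           }
                       } catch (final NumberFormatException e) {
                           // Too many digits for an int: fall through and copy the text literally.
                       }
                   }
               }
               // Not a decodable decimal reference: copy the character unchanged.
               out.append(input.charAt(i));
               i++;
           }
           return out.toString();
       }

       public static void main(final String[] args) {
           System.out.println(unescapeDecimal("<iframe src=\"&#106avascript:alert(1)\">"));
           System.out.println(unescapeDecimal("<iframe src=\"&#106;avascript:alert(1)\">"));
       }
   }
   ```
   
   With this sketch both payloads above decode to the same `javascript:` URL, while hexadecimal references such as `&#x6a;` are left untouched. For comparison, `NumericEntityUnescaper` in Commons Text lets the caller choose how a missing `;` is handled through its `OPTION` enum (`semiColonRequired`, `semiColonOptional`, `errorIfNoSemiColon`), so whether to mirror browsers is a matter of configuration.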


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
