Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/04/27 08:32:00 UTC

[jira] [Work logged] (TEXT-215) NumericEntityUnescaper may miss decimal entity

     [ https://issues.apache.org/jira/browse/TEXT-215?focusedWorklogId=762752&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762752 ]

ASF GitHub Bot logged work on TEXT-215:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Apr/22 08:31
            Start Date: 27/Apr/22 08:31
    Worklog Time Spent: 10m 
      Work Description: rbunel35 commented on PR #310:
URL: https://github.com/apache/commons-text/pull/310#issuecomment-1110710442

   Hello!
   Is there any news on this fix?
   Thanks in advance :)




Issue Time Tracking
-------------------

    Worklog Id:     (was: 762752)
    Time Spent: 40m  (was: 0.5h)

> NumericEntityUnescaper may miss decimal entity
> ----------------------------------------------
>
>                 Key: TEXT-215
>                 URL: https://issues.apache.org/jira/browse/TEXT-215
>             Project: Commons Text
>          Issue Type: Bug
>    Affects Versions: 1.0
>            Reporter: Richard Bunel
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> *Description:*
> A security vulnerability in the NumericEntityUnescaper can be exploited through decimal character entities.
> At [line|https://github.com/apache/commons-text/blob/master/src/main/java/org/apache/commons/text/translate/NumericEntityUnescaper.java#L117] 117, a run of hexadecimal characters is scanned for, whether or not the entity is a hexadecimal one.
> Therefore, if the "semiColonOptional" option is enabled and a decimal entity without a semicolon is immediately followed by one or more letters from A to F, these letters are captured as part of the entity. Parsing the captured text as an Integer with radix 10 then fails, and the whole entity is ignored.
> *Example:*
> If one uses the following string: 
> {code:java}
> <iframe src=\"&#106avascript:alert(1)\">{code}
> The sequence identifying the entity will wrongly be "&#106a" instead of "&#106".
> As "&#106a" is not a valid decimal entity, its Integer parsing fails and the whole entity remains escaped.
> Browsers, however, still decode "&#106" as "j", so such code would trigger the alert in all modern browsers.
> *Solution:*
> The fix is to accept hexadecimal characters only when scanning hexadecimal entities, and only decimal digits when scanning decimal entities.
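
The proposed restriction can be illustrated with a minimal, self-contained sketch. This is not the Commons Text source and not the change made in PR #310: the class name `EntityScanSketch`, the `strict` flag, and the helper methods are hypothetical, and the code assumes the semicolon is always optional. With `strict` set to false the digit scan accepts hexadecimal characters for every entity, as the report describes; with `strict` set to true it accepts only characters valid for the entity's own radix.

```java
// Hypothetical illustration only; not the NumericEntityUnescaper implementation.
final class EntityScanSketch {

    /** Returns true if c is an ASCII hexadecimal digit. */
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /** Returns true if c is an ASCII decimal digit. */
    static boolean isDecimalDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /**
     * Unescapes numeric entities, treating the trailing semicolon as optional.
     * strict == false: hex characters are accepted while scanning any entity.
     * strict == true:  decimal entities accept decimal digits only.
     */
    static String unescape(String input, boolean strict) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // An entity needs "&#" followed by at least one more character.
            if (input.charAt(i) != '&' || i + 2 >= input.length() || input.charAt(i + 1) != '#') {
                out.append(input.charAt(i++));
                continue;
            }
            int start = i + 2;
            boolean isHex = input.charAt(start) == 'x' || input.charAt(start) == 'X';
            if (isHex) {
                start++;
            }
            // Scan the run of characters that will be parsed as the number.
            int end = start;
            while (end < input.length()) {
                char c = input.charAt(end);
                boolean accepted = (strict && !isHex) ? isDecimalDigit(c) : isHexDigit(c);
                if (!accepted) {
                    break;
                }
                end++;
            }
            int codePoint = -1;
            try {
                codePoint = Integer.parseInt(input.substring(start, end), isHex ? 16 : 10);
            } catch (NumberFormatException e) {
                // Leave codePoint at -1 so the entity is copied through untouched.
            }
            if (!Character.isValidCodePoint(codePoint)) {
                out.append(input.charAt(i++));
                continue;
            }
            out.appendCodePoint(codePoint);
            // Skip the semicolon when present.
            i = (end < input.length() && input.charAt(end) == ';') ? end + 1 : end;
        }
        return out.toString();
    }
}
```

Calling `EntityScanSketch.unescape` on the string from the example with `strict` set to false returns the input unchanged, because "106a" is not a valid radix-10 number. With `strict` set to true the scan stops after "106", so the entity is decoded and the output contains `javascript:alert(1)` in clear text, where a downstream filter can see it.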


