Posted to issues@commons.apache.org by "Duncan Jones (JIRA)" <ji...@apache.org> on 2014/01/28 14:52:39 UTC

[jira] [Updated] (LANG-806) RandomStringUtils can enter infinite loop if chosen char does not meet letter/digit requirements

     [ https://issues.apache.org/jira/browse/LANG-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duncan Jones updated LANG-806:
------------------------------

    Description: 
An infinite loop can result if the selection process never returns a char that passes the validation test.

This can occur if the subset specified by the start and end characters does not contain any valid characters.

For example:

{code:java}
RandomStringUtils.random(3, 5, 10, true, true); // 1

RandomStringUtils.random(3, 56192, 56319, false, false); // 2
{code}

There's also the case where only surrogates are allowed but the buffer length is odd, so the last slot can never hold a complete surrogate pair, for example:

{code:java}
RandomStringUtils.random(3, 56320, 57343, false, false); // 3
{code}

The second example is easy to detect, but in general it does not seem easy to determine in advance if the subset contains any valid characters - except by evaluating all the possible char values. This would be expensive if the subset range is large.
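
To give a feel for what "evaluating all the possible char values" would involve, here is a minimal sketch of such a pre-check. The class and method names are hypothetical and not part of RandomStringUtils, and the rule that chars 56192..56319 (U+DB80..U+DBFF) are always rejected is an assumption inferred from example 2. The scan is linear in the size of the range and does not catch the odd-length surrogate case of example 3.

```java
// Hypothetical helper, not part of Commons Lang.
final class RangePrecheck {

    // Assumption (inferred from example 2): chars in this range are always rejected.
    static final int PRIVATE_HIGH_SURROGATE_MIN = 56192; // U+DB80
    static final int PRIVATE_HIGH_SURROGATE_MAX = 56319; // U+DBFF

    private RangePrecheck() {
    }

    /**
     * Returns true if at least one char in [start, end) could pass the
     * letters/numbers filter. Cost is O(end - start).
     */
    public static boolean hasValidChar(int start, int end, boolean letters, boolean numbers) {
        int from = Math.max(start, 0);
        int limit = Math.min(end, Character.MAX_VALUE + 1);
        for (int i = from; i < limit; i++) {
            if (i >= PRIVATE_HIGH_SURROGATE_MIN && i <= PRIVATE_HIGH_SURROGATE_MAX) {
                continue; // assumed to be always skipped
            }
            char ch = (char) i;
            if ((letters && Character.isLetter(ch))
                    || (numbers && Character.isDigit(ch))
                    || (!letters && !numbers)) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, hasValidChar(5, 10, true, true) and hasValidChar(56192, 56319, false, false) both report that no valid char exists, matching examples 1 and 2.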

One possibility is to count the total number of loops (or retries) and throw an error if it exceeds a given value. Alternatively, count only the number of consecutive retries.
In both cases the threshold value must be set high enough to allow for the cases where the allowable char range contains only a small proportion of valid characters.

In the case of digits only, the default allowable range currently runs from ' ' to 'z' (digits, letters and the punctuation between them), so the proportion of valid chars is 10/91, i.e. approx 11%.

The threshold would need to assume a minimum valid proportion of 1% or even 0.1% in order to keep the number of false positives low.
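
The consecutive-retry idea could look roughly like the sketch below. It is a simplified stand-in for the real selection loop (no surrogate handling), and the class name, method names and the maxConsecutiveRetries parameter are all hypothetical. The second method illustrates the threshold trade-off: if a fraction p of the range is valid, the chance that n draws in a row all miss is (1 - p)^n.

```java
import java.util.Random;

// Hypothetical sketch, not the Commons Lang implementation.
final class BoundedRandomSketch {

    private BoundedRandomSketch() {
    }

    /**
     * Picks count chars from [start, end), giving up once more than
     * maxConsecutiveRetries draws in a row fail the letters/numbers filter.
     */
    public static String random(int count, int start, int end, boolean letters,
            boolean numbers, int maxConsecutiveRetries, Random rnd) {
        if (count < 0 || end <= start) {
            throw new IllegalArgumentException("invalid count or range");
        }
        char[] buffer = new char[count];
        int filled = 0;
        int retries = 0;
        while (filled < count) {
            char ch = (char) (start + rnd.nextInt(end - start));
            boolean accepted = (letters && Character.isLetter(ch))
                    || (numbers && Character.isDigit(ch))
                    || (!letters && !numbers);
            if (accepted) {
                buffer[filled++] = ch;
                retries = 0; // only consecutive failures count
            } else if (++retries > maxConsecutiveRetries) {
                throw new IllegalStateException(
                        "no valid char after " + maxConsecutiveRetries + " consecutive retries");
            }
        }
        return new String(buffer);
    }

    /** Chance that n consecutive draws all miss when a fraction p of the range is valid. */
    public static double falsePositiveProbability(double p, int n) {
        return Math.pow(1.0 - p, n);
    }
}
```

With this sketch, example 1 fails fast with an exception instead of looping forever. For a valid proportion of 0.1%, a limit of 10,000 consecutive retries keeps the chance of a spurious failure per character below 1 in 10,000.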

  was:
An infinite loop can result if the selection process never returns a char that passes the validation test.

This can occur if the subset specified by the start and end characters does not contain any valid characters.

For example:

RandomStringUtils.random(3, 5, 10, true, true); // 1

RandomStringUtils.random(3, 56192, 56319, false, false); // 2

There's also the case where only surrogates are allowed, but the buffer is not an even number of characters, for example:

RandomStringUtils.random(3, 56320, 57343, false, false); // 3

The second example is easy to detect, but in general it does not seem easy to determine in advance if the subset contains any valid characters - except by evaluating all the possible char values. This would be expensive if the subset range is large.

One possibility is to count the total number of loops (or retries), and throw an error if it exceeds a given value. Or count the number of consecutive retries.
In both cases the threshold value must be set high enough to allow for the cases where the allowable char range contains only a small proportion of valid characters. 

In the case of digits only, the default allowable range is currently set to digits + letters, so the proportion of valid chars is 10/90 i.e. approx 11%.

A minimum proportion of 1% or 0.1% would be necessary to reduce the number of false positives.


> RandomStringUtils can enter infinite loop if chosen char does not meet letter/digit requirements
> ------------------------------------------------------------------------------------------------
>
>                 Key: LANG-806
>                 URL: https://issues.apache.org/jira/browse/LANG-806
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 2.6, 3.1
>            Reporter: Sebb
>             Fix For: Review Patch
>
>         Attachments: LANG-806.patch, RandomStringException.java



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)