You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Namgyu Kim (JIRA)" <ji...@apache.org> on 2019/02/18 17:00:02 UTC
[jira] [Updated] (LUCENE-8698) Fix replaceIgnoreCase method bug in EscapeQuerySyntaxImpl

     [ https://issues.apache.org/jira/browse/LUCENE-8698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namgyu Kim updated LUCENE-8698:
-------------------------------
    Attachment: LUCENE-8698.patch

> Fix replaceIgnoreCase method bug in EscapeQuerySyntaxImpl
> ---------------------------------------------------------
>
>                 Key: LUCENE-8698
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8698
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>            Reporter: Namgyu Kim
>            Priority: Major
>         Attachments: LUCENE-8698.patch
>
>
> It is a patch of LUCENE-8572 issue from [~tonicava].
>  
> There is a serious bug in the replaceIgnoreCase method of the EscapeQuerySyntaxImpl class.
> This issue can affect QueryNode. (StringIndexOutOfBoundsException)
> As I mentioned in comment of the issue, the String#toLowerCase() causes the array to grow in size.
> {code:java}
> private static CharSequence replaceIgnoreCase(CharSequence string,
>     CharSequence sequence1, CharSequence escapeChar, Locale locale) {
>   // string = "İpone " [304, 112, 111, 110, 101, 32],  size = 6
>   ...
>   while (start < count) {
>     // Convert by toLowerCase as follows.
>     // string = "i'̇pone " [105, 775, 112, 111, 110, 101, 32], size = 7
>     // firstIndex will be set 6.
>     if ((firstIndex = string.toString().toLowerCase(locale).indexOf(first,
>         start)) == -1)
>       break;
>     boolean found = true;
>     ...
>     if (found) {
>       // In this line, String.toString() will only have a range of 0 to 5.
>       // So here we get a StringIndexOutOfBoundsException.
>       result.append(string.toString().substring(copyStart, firstIndex));
>       ...
>     } else {
>       start = firstIndex + 1;
>     }
>   }
>   ...
> }{code}
> Maintaining the overall structure and fixing bug is very simple.
> If we change to the following code, the method works fine.
>  
> {code:java}
> // Line 135 ~ 136
> // BEFORE
> if ((firstIndex = string.toString().toLowerCase(locale).indexOf(first, start)) == -1)
> // AFTER
> if ((firstIndex = string.toString().indexOf(first, start)) == -1)
> {code}
>  
>  
> But I wonder if this is the best way.
> How do you think about using String#replace() instead?
>  
> {code:java}
> // SAMPLE : escapeWhiteChar (escapeChar and escapeQuoted are same)
> // BEFORE
> private static final CharSequence escapeWhiteChar(CharSequence str,
>     Locale locale) {
>   ...
>   for (int i = 0; i < escapableWhiteChars.length; i++) {
>     buffer = replaceIgnoreCase(buffer, escapableWhiteChars[i].toLowerCase(locale),
>         "\\", locale);
>   }
>   ...
> }
> // AFTER
> private static final CharSequence escapeWhiteChar(CharSequence str,
>     Locale locale) {
>   ...
>   for (int i = 0; i < escapableWhiteChars.length; i++) {
>     buffer = buffer.toString().replace(escapableWhiteChars[i], "\\" + escapableWhiteChars[i]);
>   }
>   ...
> }
> {code}
>  
> First, I upload the patch using String#replace().
> If you give me some feedback, I will check it :D
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org