Posted to commits@pinot.apache.org by "Aravind-Suresh (via GitHub)" <gi...@apache.org> on 2023/06/14 04:17:50 UTC

[GitHub] [pinot] Aravind-Suresh commented on a diff in pull request #10897: Fixes SQL wildcard escaping in LIKE queries

Aravind-Suresh commented on code in PR #10897:
URL: https://github.com/apache/pinot/pull/10897#discussion_r1228963312


##########
pinot-common/src/main/java/org/apache/pinot/common/utils/RegexpPatternConverterUtils.java:
##########
@@ -64,24 +71,42 @@ public static String likeToRegexpLike(String likePattern) {
         break;
     }
 
-    String escaped = escapeMetaCharacters(likePattern.substring(start, end));
-    StringBuilder sb = new StringBuilder(escaped.length() + 2);
+    likePattern = likePattern.substring(start, end);
+    StringBuilder sb = new StringBuilder();
     sb.append(prefix);
-    sb.append(escaped);
-    sb.append(suffix);
 
+    // handling SQL wildcards by replacing them with corresponding regex equivalents
+    // we ignore them if the SQL wildcards are escaped
     int i = 0;
-    while (i < sb.length()) {
-      char c = sb.charAt(i);
+    boolean isPrevCharBackSlash = false;
+    while (i < likePattern.length()) {
+      char c = likePattern.charAt(i);
       if (c == '_') {
-        sb.replace(i, i + 1, ".");
+        sb.append(isPrevCharBackSlash ? c : ".");
       } else if (c == '%') {
-        sb.replace(i, i + 1, ".*");
-        i++;
+        sb.append(isPrevCharBackSlash ? c : ".*");

Review Comment:
   Previously, escapeMetaCharacters escaped every occurrence of a meta character. "\" is also a meta character, but it should not be escaped when it is itself escaping the following SQL wildcard, as in \_ or \%. In other words, "\" comes in two flavours: a literal meta character, and an escape character for the character that follows it. Only the first flavour should be escaped in the output regex. Since we can't blindly replace every occurrence of "\" with "\\", I incorporated that logic here: a "\" is escaped only when it does not precede a SQL wildcard.
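
   To make the two flavours of "\" concrete, here is a minimal standalone sketch. It is not the code in this PR: the class name LikeEscapeSketch, the method likeToRegex, and the META character set are hypothetical and chosen only for illustration. It also assumes that an escaped wildcard is emitted as a plain literal character.

```java
// Hypothetical sketch, not Pinot's RegexpPatternConverterUtils.
final class LikeEscapeSketch {
  // Assumed set of regex meta characters that need escaping when used literally.
  private static final String META = "\\.[]{}()*+?^$|";

  static String likeToRegex(String like) {
    StringBuilder sb = new StringBuilder("^");
    int i = 0;
    while (i < like.length()) {
      char c = like.charAt(i);
      boolean nextIsWildcard = i + 1 < like.length()
          && (like.charAt(i + 1) == '_' || like.charAt(i + 1) == '%');
      if (c == '\\' && nextIsWildcard) {
        // Escape flavour: drop the backslash and keep the wildcard as a literal.
        // '_' and '%' have no special meaning in a Java regex.
        sb.append(like.charAt(i + 1));
        i += 2;
        continue;
      }
      if (c == '_') {
        sb.append('.');              // unescaped SQL wildcard: exactly one character
      } else if (c == '%') {
        sb.append(".*");             // unescaped SQL wildcard: any run of characters
      } else if (META.indexOf(c) >= 0) {
        sb.append('\\').append(c);   // meta flavour, including a lone "\": escape it
      } else {
        sb.append(c);
      }
      i++;
    }
    return sb.append('$').toString();
  }
}
```

   Under these assumptions, the LIKE pattern a\_b only matches the literal string a_b, while a_b matches any three-character string that starts with a and ends with b. A backslash that is not followed by a wildcard is treated as a literal and escaped in the output.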



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

