You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/01 11:51:26 UTC

[GitHub] [arrow-rs] daniel-martinez-maqueda-sap opened a new pull request, #2258: Fix escaped like wildcards

daniel-martinez-maqueda-sap opened a new pull request, #2258:
URL: https://github.com/apache/arrow-rs/pull/2258

   # Which issue does this PR close?
   
   Closes #415 and #702
   
   # Rationale for this change
    
   It is explained in the Issues linked above.
   
   # What changes are included in this PR?
   
   Added a new function that replaces the like wildcards '%' and '_' for the regex counterparts before executing them. It also takes into account that the wildcards can be escaped, in that case, it does remove the escape characters and leaves the wildcards so that they are matched against the raw character.
   
   This is implemented iterating over all the characters of the pattern to figure out when it needs to be transformed or not.
   
   # Are there any user-facing changes?
   N/A


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] jhorstmann commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r934486413


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -296,6 +298,41 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
+fn replace_like_wildcards(text: &str) -> Result<String> {
+    let text = escape(text);

Review Comment:
   Maybe this function could also already return whether we have one of the special cases "starts with", "ends with" or "contains". But I think the regex engine already contains some of those optimizations so maybe we don't really need to special case them in the kernels. Might be out of scope for this ticket.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936501651


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -298,10 +298,15 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
-fn replace_like_wildcards(text: &str) -> Result<String> {
+/// Transforms a like `pattern` to a regex compatible pattern. To achieve that, it does:

Review Comment:
   ❤️ 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] codecov-commenter commented on pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#issuecomment-1202313780

   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/2258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#2258](https://codecov.io/gh/apache/arrow-rs/pull/2258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (46f80b5) into [master](https://codecov.io/gh/apache/arrow-rs/commit/3032a521c9691d4569a9d277046304bd4e4098fb?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (3032a52) will **increase** coverage by `0.00%`.
   > The diff coverage is `94.54%`.
   
   ```diff
   @@           Coverage Diff            @@
   ##           master    #2258    +/-   ##
   ========================================
     Coverage   82.29%   82.29%            
   ========================================
     Files         243      245     +2     
     Lines       62443    62737   +294     
   ========================================
   + Hits        51387    51629   +242     
   - Misses      11056    11108    +52     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/2258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/compute/kernels/comparison.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2NvbXB1dGUva2VybmVscy9jb21wYXJpc29uLnJz) | `93.89% <94.54%> (-0.01%)` | :arrow_down: |
   | [...row/src/array/builder/string\_dictionary\_builder.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2J1aWxkZXIvc3RyaW5nX2RpY3Rpb25hcnlfYnVpbGRlci5ycw==) | `90.64% <0.00%> (-0.72%)` | :arrow_down: |
   | [parquet/src/encodings/encoding/dict\_encoder.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nL2RpY3RfZW5jb2Rlci5ycw==) | `90.74% <0.00%> (-0.49%)` | :arrow_down: |
   | [...rquet/src/arrow/record\_reader/definition\_levels.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvcmVjb3JkX3JlYWRlci9kZWZpbml0aW9uX2xldmVscy5ycw==) | `88.60% <0.00%> (-0.43%)` | :arrow_down: |
   | [parquet/src/column/writer/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci9tb2QucnM=) | `92.85% <0.00%> (-0.15%)` | :arrow_down: |
   | [arrow/src/array/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L21vZC5ycw==) | `100.00% <0.00%> (ø)` | |
   | [arrow/src/array/array\_fixed\_size\_list.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2ZpeGVkX3NpemVfbGlzdC5ycw==) | `92.91% <0.00%> (ø)` | |
   | [parquet/src/arrow/arrow\_writer/byte\_array.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyL2J5dGVfYXJyYXkucnM=) | `76.72% <0.00%> (ø)` | |
   | [parquet/src/arrow/arrow\_writer/mod.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyL21vZC5ycw==) | `97.66% <0.00%> (+0.01%)` | :arrow_up: |
   | [parquet/src/arrow/arrow\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfcmVhZGVyLnJz) | `95.12% <0.00%> (+0.12%)` | :arrow_up: |
   | ... and [9 more](https://codecov.io/gh/apache/arrow-rs/pull/2258/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936298773


##########
arrow/Cargo.toml:
##########
@@ -49,6 +49,7 @@ half = { version = "2.0", default-features = false }
 hashbrown = { version = "0.12", default-features = false }
 csv_crate = { version = "1.1", default-features = false, optional = true, package="csv" }
 regex = { version = "1.5.6", default-features = false, features = ["std", "unicode"] }
+regex-syntax = { version = "0.6.27", default-features = false, features = ["unicode"] }

Review Comment:
   Yep, this is a dependency of `regex` and the function used, was already indirectly called. :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936320964


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(
+        test_utf8_scalar_like_escape,
+        vec!["a%", "a\\x"],
+        "a\\%",
+        like_utf8_scalar,
+        vec![true, false]
+    );
+
+    test_utf8!(
+        test_utf8_scalar_ilike_regex,
+        vec!["%%%"],
+        vec![r#"\%_\%"#],
+        ilike_utf8,
+        vec![true]
+    );
+
+    #[test]
+    fn test_replace_like_wildcards() {
+        let a_eq = "_%";
+        let expected = String::from("..*");
+        assert_eq!(replace_like_wildcards(a_eq).unwrap(), expected);
+    }
+
+    #[test]
+    fn test_replace_like_wildcards_leave_like_meta_chars() {
+        let a_eq = "\\%\\_";
+        let expected = String::from("%_");
+        assert_eq!(replace_like_wildcards(a_eq).unwrap(), expected);
+    }
+
+    #[test]
+    fn test_replace_like_wildcards_with_multiple_escape_chars() {
+        let a_eq = "\\\\%";

Review Comment:
   Rust complains with: `Syntax error: invalid escape`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards in `like_utf8` / `nlike_utf8` kernels

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936661999


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(

Review Comment:
   Thanks! :1st_place_medal: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] jhorstmann commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
jhorstmann commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r934472553


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -296,6 +298,41 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
+fn replace_like_wildcards(text: &str) -> Result<String> {
+    let text = escape(text);

Review Comment:
   There is the `regex_syntax::is_meta_character` function, which is used by `escape`, using that we could handle both regex and like pattern characters in one loop. I think that could simplify the logic quite a bit.
   
   Pseudocode, not yet handling that `next` could be `None`:
   
   ```rust
   let mut chars_iter = text.chars().peekable()
   for c in chars_iter {
       if c == '\\' {
           append('\\'); 
           append('\\'); 
           let next = chars_iter.peek();
           if next == '%' || next == '_' {
              chars_iter.next(); // consume next, like character does not need to be transformed
               append(next);
           }
       } else if regex_syntax::is_meta_character(c) {
           append('\\');
           append(c);
       } else if c == '%' {
           append(".*");
       } else if c == '_' {
           append(".");
       } else {
           append(c);
       }
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r935892671


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -296,6 +298,39 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
+fn replace_like_wildcards(text: &str) -> Result<String> {

Review Comment:
   It would help me review this code if we can include a docstring on this function explaining what replacements it was doing.
   
   Like explaining it was trying to escape `%` with `\\%` perhaps



##########
arrow/Cargo.toml:
##########
@@ -49,6 +49,7 @@ half = { version = "2.0", default-features = false }
 hashbrown = { version = "0.12", default-features = false }
 csv_crate = { version = "1.1", default-features = false, optional = true, package="csv" }
 regex = { version = "1.5.6", default-features = false, features = ["std", "unicode"] }
+regex-syntax = { version = "0.6.27", default-features = false, features = ["unicode"] }

Review Comment:
   I am always worried about new dependencies in arrow but I double checked and this is already a dependency of `regex` so I don't think it is "net-new" to arrow. 👍 
   
   https://crates.io/crates/regex-syntax/reverse_dependencies



##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(
+        test_utf8_scalar_like_escape,
+        vec!["a%", "a\\x"],
+        "a\\%",
+        like_utf8_scalar,
+        vec![true, false]
+    );
+
+    test_utf8!(
+        test_utf8_scalar_ilike_regex,
+        vec!["%%%"],
+        vec![r#"\%_\%"#],
+        ilike_utf8,
+        vec![true]
+    );
+
+    #[test]
+    fn test_replace_like_wildcards() {
+        let a_eq = "_%";
+        let expected = String::from("..*");

Review Comment:
   You can probably do like this for some brevity:
   
   ```suggestion
           let expected = "..*";
   ```
   
   And if the assert below gives you issues, maybe it needs something like an extra `&`:
   
   ```rust
           assert_eq!(&replace_like_wildcards(a_eq).unwrap(), expected);
   ```



##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -342,7 +377,7 @@ pub fn nlike_utf8_scalar<OffsetSize: OffsetSizeTrait>(
             result.append(!left.value(i).ends_with(&right[1..]));
         }
     } else {
-        let re_pattern = escape(right).replace('%', ".*").replace('_', ".");
+        let re_pattern = replace_like_wildcards(right)?;

Review Comment:
   I really like how you have refactored this code 👍 



##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(

Review Comment:
   ❤️ these test are very nice



##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(
+        test_utf8_scalar_like_escape,
+        vec!["a%", "a\\x"],
+        "a\\%",
+        like_utf8_scalar,
+        vec![true, false]
+    );
+
+    test_utf8!(
+        test_utf8_scalar_ilike_regex,
+        vec!["%%%"],
+        vec![r#"\%_\%"#],
+        ilike_utf8,
+        vec![true]
+    );
+
+    #[test]
+    fn test_replace_like_wildcards() {
+        let a_eq = "_%";
+        let expected = String::from("..*");
+        assert_eq!(replace_like_wildcards(a_eq).unwrap(), expected);
+    }
+
+    #[test]
+    fn test_replace_like_wildcards_leave_like_meta_chars() {
+        let a_eq = "\\%\\_";
+        let expected = String::from("%_");
+        assert_eq!(replace_like_wildcards(a_eq).unwrap(), expected);
+    }
+
+    #[test]
+    fn test_replace_like_wildcards_with_multiple_escape_chars() {
+        let a_eq = "\\\\%";

Review Comment:
   I think it may be worth testing `\\\%` (aka unbalanced, three slashes)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936299497


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -296,6 +298,39 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
+fn replace_like_wildcards(text: &str) -> Result<String> {

Review Comment:
   Yes, that makes sense. I usually follow the rule of documenting all the public functions but that doesn't mean that the rest should not be documented. Adding it in the next commit!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #2258: Fix escaped like wildcards in `like_utf8` / `nlike_utf8` kernels

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r936323592


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(
+        test_utf8_scalar_like_escape,
+        vec!["a%", "a\\x"],
+        "a\\%",
+        like_utf8_scalar,
+        vec![true, false]
+    );
+
+    test_utf8!(
+        test_utf8_scalar_ilike_regex,
+        vec!["%%%"],
+        vec![r#"\%_\%"#],
+        ilike_utf8,
+        vec![true]
+    );
+
+    #[test]
+    fn test_replace_like_wildcards() {
+        let a_eq = "_%";
+        let expected = String::from("..*");

Review Comment:
   It works without adding `&`. :)



##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -3740,6 +3775,50 @@ mod tests {
         vec![false, true, false, false]
     );
 
+    test_utf8_scalar!(
+        test_utf8_scalar_like_escape,
+        vec!["a%", "a\\x"],
+        "a\\%",
+        like_utf8_scalar,
+        vec![true, false]
+    );
+
+    test_utf8!(
+        test_utf8_scalar_ilike_regex,
+        vec!["%%%"],
+        vec![r#"\%_\%"#],
+        ilike_utf8,
+        vec![true]
+    );
+
+    #[test]
+    fn test_replace_like_wildcards() {
+        let a_eq = "_%";
+        let expected = String::from("..*");

Review Comment:
   It works without adding `&` :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] ursabot commented on pull request #2258: Fix escaped like wildcards in `like_utf8` / `nlike_utf8` kernels

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#issuecomment-1204200098

   Benchmark runs are scheduled for baseline = 22185fd597334c1d3a2e36a2bfba09be02466df9 and contender = f78d2e6572c5b60d00a15084875d90d3e31c9b74. f78d2e6572c5b60d00a15084875d90d3e31c9b74 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/cf13ac0a01e1451a8d35340296641568...a521e0cd5004441fafd12c54b9187eb8/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/e2015713dc5b4b95b83080e97f5b502f...428ac2dcc34f4f39af688ef05af65f1c/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/8ee81a71077044ed8367265f4890e3ab...8f810ae157ea4b8eae79e54709578e79/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/40de878bf36d42be8dc8ca9a8173eb04...0de4338c0f7444a59c408e443366b434/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] daniel-martinez-maqueda-sap commented on a diff in pull request #2258: Fix escaped like wildcards

Posted by GitBox <gi...@apache.org>.
daniel-martinez-maqueda-sap commented on code in PR #2258:
URL: https://github.com/apache/arrow-rs/pull/2258#discussion_r934629886


##########
arrow/src/compute/kernels/comparison.rs:
##########
@@ -296,6 +298,41 @@ pub fn like_utf8_scalar<OffsetSize: OffsetSizeTrait>(
     Ok(BooleanArray::from(data))
 }
 
+fn replace_like_wildcards(text: &str) -> Result<String> {
+    let text = escape(text);

Review Comment:
   Thanks for the comments.
   
   This approach is much more readable. Already applied in the next commit.
   
   Regarding the second comment. I also think it is outside of the scope of this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org