You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "heronshoes (via GitHub)" <gi...@apache.org> on 2023/04/07 02:33:25 UTC

[GitHub] [arrow] heronshoes opened a new pull request, #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

heronshoes opened a new pull request, #34952:
URL: https://github.com/apache/arrow/pull/34952

   ### Rationale for this change
   
   Ruby binding supported `Arrow::Slicer::ColumnCondition#match_substring` in #34902.
   This PR will use same strategy as this to its family methods.
   
   ### What changes are included in this PR?
   
   - [ ] Add `Arrow::Slicer::ColumnCondition#starts_with`
   - [ ] Add `Arrow::Slicer::ColumnCondition#match_like`
   - [ ] Add `Arrow::Slicer::ColumnCondition#match_substring_regex`
   - [ ] Add `Arrow::Slicer::ColumnCondition#ends_with`
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   Yes.
   It will add new methods for `Arrow::Slicer`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161177811


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Hmm.
   I don't think that it's usable.
   Do you have any use case for it rather than `match_substring?(/[dr]ow/)` style?
   
   If you don't have any use case for it, how about not providing `match_substring_regexp?` for now? If we get a request for it from an user, we can reconsider this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1501211184

   Benchmark runs are scheduled for baseline = 8d6b17a8c71990523d10af5608618d15864382bc and contender = 311cc52d208034214c325983f416f2f847a2a015. 311cc52d208034214c325983f416f2f847a2a015 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/3bdd0255bbaa443395d120a19df9f7fc...d1efa0ece53e4651ba2a6949fcc5db45/)
   [Failed :arrow_down:0.5% :arrow_up:0.09%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/36398e52f25e45c684dc5ca9d81e6373...5001d46d72484211ac5429497f1752e8/)
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/53333671ad4c427f99108094036510f0...80854e40008e41ee8586abc477deb4d9/)
   [Finished :arrow_down:0.52% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/5cf9e1be595844d1838a84cec9ab820e...019ee3c0a2a5482fad7a0c77735775b8/)
   Buildkite builds:
   [Finished] [`311cc52d` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2661)
   [Failed] [`311cc52d` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2694)
   [Finished] [`311cc52d` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2659)
   [Finished] [`311cc52d` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2685)
   [Finished] [`8d6b17a8` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2660)
   [Finished] [`8d6b17a8` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2693)
   [Finished] [`8d6b17a8` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2658)
   [Finished] [`8d6b17a8` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2684)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161043630


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Yes we can do that!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161344582


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   I agree. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082714


##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
-      def match_substring(substring, ignore_case: false)
-        MatchSubstringFamilyCondition.new("match_substring",
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like?(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
+      def match_substring?(pattern, ignore_case: false)
+        case pattern
+        when String
+          MatchSubstringFamilyCondition.new("match_substring",
+                                            @column, pattern, ignore_case)
+        when Regexp
+          MatchSubstringFamilyCondition.new("match_substring_regex",
+                                            @column,
+                                            pattern.source,
+                                            ignore_case)

Review Comment:
   Thanks! It was a good lesson.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1499866583

   :warning: GitHub issue #34951 **has been automatically assigned in GitHub** to PR creator.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Is it usable when we pass a string regular expression directly?
   
   `match_substring_regex?("[dr]ow")`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161205613


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   No I don't have any specific use case.
   But I feel it is wasteful to 'construct Regexp => pick up source => apply it as string to match_substring_regex()`.
   Anyway I will follow you. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1160398717


##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
+      def ends_with(substring, ignore_case: false)

Review Comment:
   Other method including existing `match_substring` should have `?` suffix because this is Ruby API. :-)
   
   ```suggestion
         def end_with?(substring, ignore_case: false)
   ```



##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
+      def ends_with(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
       def match_substring(substring, ignore_case: false)
         MatchSubstringFamilyCondition.new("match_substring",
                                           @column, substring, ignore_case)
       end
+
+      def match_substring_regex(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_substring_regex",
+                                          @column, pattern, ignore_case)
+      end
+
+      def starts_with(substring, ignore_case: false)

Review Comment:
   ```suggestion
         def start_with?(substring, ignore_case: false)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161040527


##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like(pattern, ignore_case: false)

Review Comment:
   ```suggestion
         def match_like?(pattern, ignore_case: false)
   ```



##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
       def match_substring(substring, ignore_case: false)

Review Comment:
   ```suggestion
         def match_substring?(substring, ignore_case: false)
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Can we accept `/[dr]ow/` too?
   
   If `slicer.string.match_substring_regex(/[dr]ow/)` works, how about adding support for `slicer.string.match_substring(/[dr]ow/)` (== `slicer.string.amtch_substring_regex(/[dr]ow/)`)`?
   



##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
       def match_substring(substring, ignore_case: false)
         MatchSubstringFamilyCondition.new("match_substring",
                                           @column, substring, ignore_case)
       end
+
+      def match_substring_regex(pattern, ignore_case: false)

Review Comment:
   ```suggestion
         def match_substring_regex?(pattern, ignore_case: false)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1500757680

   Is it better to rename `#match_substring` to `#match_substring?` ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Is it usable when we pass a regular expression as a string directly?
   
   `match_substring_regex?("[dr]ow")`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161281080


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   OK.
   Then we can improve the current APIs later.
   
   I think that `match_substring_regexp?` isn't a good API because existing built-in Ruby objects don't provide `match_substring_regexp?` (`String` provides `start_with?`. So `start_with?` is a good API for Rubyists.) And compute function name isn't `match_substring_regexp` (`..._regex`). So `match_substring_regexp?` confuses us.
   
   I think that we can find better names for this. For example:
   
   * `column.match?(string_pattern)` -> `match_substring_regex(column, string_pattern)` (`String#match?` exists.)
   * `column.match?(regexp)` -> `match_substring_regex(column, regexp.source)` (`String#match?` exists.)
   * `column.include?(string)` -> `match_substring(column, string)` (`String#include?` exists.)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161051567


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Implemented above.
   
   And renamed `match_substring_regex?` to `match_substring_regexp?`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161073990


##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
-      def match_substring(substring, ignore_case: false)
-        MatchSubstringFamilyCondition.new("match_substring",
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like?(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
+      def match_substring?(pattern, ignore_case: false)
+        case pattern
+        when String
+          MatchSubstringFamilyCondition.new("match_substring",
+                                            @column, pattern, ignore_case)
+        when Regexp
+          MatchSubstringFamilyCondition.new("match_substring_regex",
+                                            @column,
+                                            pattern.source,
+                                            ignore_case)
+        else
+          message =
+             "pattern must be either String or Regexp not #{pattern}.class"

Review Comment:
   ```suggestion
             message =
                "pattern must be either String or Regexp: #{pattern.inspect}"
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(invalid_pattern)") do
+      assert_raise(ArgumentError) do

Review Comment:
   Could you also check error message?
   
   ```suggestion
         assert_raise(ArgumentError.new("...")) do
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -517,9 +541,21 @@ def setup
       TABLE
     end
 
-    test("!match_substring") do
+    test("!match_substring?") do
+      sliced_table = @table.slice do |slicer|
+        !slicer.string.match_substring?("arr")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	Arrow 
+1	(null)
+2	window
+      TABLE
+    end
+
+    test("match_substring?(regexp)") do

Review Comment:
   ```suggestion
       test("match_substring?(Regexp)") do
   ```



##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
-      def match_substring(substring, ignore_case: false)
-        MatchSubstringFamilyCondition.new("match_substring",
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like?(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
+      def match_substring?(pattern, ignore_case: false)
+        case pattern
+        when String
+          MatchSubstringFamilyCondition.new("match_substring",
+                                            @column, pattern, ignore_case)
+        when Regexp
+          MatchSubstringFamilyCondition.new("match_substring_regex",
+                                            @column,
+                                            pattern.source,
+                                            ignore_case)

Review Comment:
   Can we use `pattern.casefold?` as the default value of `ignore_case`?



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(invalid_pattern)") do
+      assert_raise(ArgumentError) do
+        sliced_table = @table.slice do |slicer|

Review Comment:
   ```suggestion
           @table.slice do |slicer|
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161074590


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Ah, we may not need to provide `match_substring_regex*`.
   How about providing only `match_substring?`?



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Can we accept `/[dr]ow/` too?
   
   If `slicer.string.match_substring_regex(/[dr]ow/)` works, how about adding support for `slicer.string.match_substring(/[dr]ow/)` (== `slicer.string.match_substring_regex(/[dr]ow/)`)`?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_like") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_like("_rr%")
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	(null)
+      TABLE
+    end
+
+    test("match_substring_regex") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring_regex("[dr]ow")

Review Comment:
   Is it usable when we pass a regular expression as a string directly?
   
   `match_substring_regexp?("[dr]ow")`
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082740


##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(invalid_pattern)") do
+      assert_raise(ArgumentError) do

Review Comment:
   Done!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou merged pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou merged PR #34952:
URL: https://github.com/apache/arrow/pull/34952


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1499866561

   * Closes: #34951


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161177920


##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,44 @@ def reject(&block)
         RejectCondition.new(@column, block)
       end
 
-      def match_substring(substring, ignore_case: false)
-        MatchSubstringFamilyCondition.new("match_substring",
+      def end_with?(substring, ignore_case: false)
+        MatchSubstringFamilyCondition.new("ends_with",
+                                          @column, substring, ignore_case)
+      end
+
+      def match_like?(pattern, ignore_case: false)
+        MatchSubstringFamilyCondition.new("match_like",
+                                          @column, pattern, ignore_case)
+      end
+
+      def match_substring?(pattern, ignore_case: false)
+        case pattern
+        when String
+          MatchSubstringFamilyCondition.new("match_substring",
+                                            @column, pattern, ignore_case)
+        when Regexp
+          MatchSubstringFamilyCondition.new("match_substring_regex",
+                                            @column,
+                                            pattern.source,
+                                            pattern.casefold?)

Review Comment:
   ```suggestion
         def match_substring?(pattern, ignore_case: nil)
           case pattern
           when String
             ignore_case = false if ignore_case.nil?
             MatchSubstringFamilyCondition.new("match_substring",
                                               @column, pattern, ignore_case)
           when Regexp
             ignore_case = pattern.casefold? if ignore_case.nil?
             MatchSubstringFamilyCondition.new("match_substring_regex",
                                               @column,
                                               pattern.source,
                                               ignore_case)
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(/String/i") do

Review Comment:
   ```suggestion
       test("match_substring?(/String/i)") do
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(/String/i") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring?(/arr/i)
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	carrot
+3	(null)
+      TABLE
+    end
+
+    test("match_substring?(invalid_pattern)") do

Review Comment:
   ```suggestion
       test("match_substring?(Array)") do
   ```



##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
 2	window
       TABLE
     end
+
+    test("match_substring?(/String/i") do
+      sliced_table = @table.slice do |slicer|
+        slicer.string.match_substring?(/arr/i)
+      end
+      assert_equal(<<~TABLE, sliced_table.to_s)
+	string
+0	array 
+1	Arrow 
+2	carrot
+3	(null)
+      TABLE
+    end
+
+    test("match_substring?(invalid_pattern)") do

Review Comment:
   Or
   
   ```suggestion
       test("match_substring? - invalid") do
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org