You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "heronshoes (via GitHub)" <gi...@apache.org> on 2023/04/07 02:33:25 UTC
[GitHub] [arrow] heronshoes opened a new pull request, #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
heronshoes opened a new pull request, #34952:
URL: https://github.com/apache/arrow/pull/34952
### Rationale for this change
Ruby binding supported `Arrow::Slicer::ColumnCondition#match_substring` in #34902.
This PR will use same strategy as this to its family methods.
### What changes are included in this PR?
- [ ] Add `Arrow::Slicer::ColumnCondition#starts_with`
- [ ] Add `Arrow::Slicer::ColumnCondition#match_like`
- [ ] Add `Arrow::Slicer::ColumnCondition#match_substring_regex`
- [ ] Add `Arrow::Slicer::ColumnCondition#ends_with`
### Are these changes tested?
Yes.
### Are there any user-facing changes?
Yes.
It will add new methods for `Arrow::Slicer`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161177811
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Hmm.
I don't think that it's usable.
Do you have any use case for it rather than `match_substring?(/[dr]ow/)` style?
If you don't have any use case for it, how about not providing `match_substring_regexp?` for now? If we get a request for it from an user, we can reconsider this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] ursabot commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1501211184
Benchmark runs are scheduled for baseline = 8d6b17a8c71990523d10af5608618d15864382bc and contender = 311cc52d208034214c325983f416f2f847a2a015. 311cc52d208034214c325983f416f2f847a2a015 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/3bdd0255bbaa443395d120a19df9f7fc...d1efa0ece53e4651ba2a6949fcc5db45/)
[Failed :arrow_down:0.5% :arrow_up:0.09%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/36398e52f25e45c684dc5ca9d81e6373...5001d46d72484211ac5429497f1752e8/)
[Finished :arrow_down:0.0% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/53333671ad4c427f99108094036510f0...80854e40008e41ee8586abc477deb4d9/)
[Finished :arrow_down:0.52% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/5cf9e1be595844d1838a84cec9ab820e...019ee3c0a2a5482fad7a0c77735775b8/)
Buildkite builds:
[Finished] [`311cc52d` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2661)
[Failed] [`311cc52d` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2694)
[Finished] [`311cc52d` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2659)
[Finished] [`311cc52d` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2685)
[Finished] [`8d6b17a8` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2660)
[Finished] [`8d6b17a8` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2693)
[Finished] [`8d6b17a8` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2658)
[Finished] [`8d6b17a8` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2684)
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161043630
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Yes we can do that!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161344582
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
I agree. Thanks!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082714
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
RejectCondition.new(@column, block)
end
- def match_substring(substring, ignore_case: false)
- MatchSubstringFamilyCondition.new("match_substring",
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like?(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
+ def match_substring?(pattern, ignore_case: false)
+ case pattern
+ when String
+ MatchSubstringFamilyCondition.new("match_substring",
+ @column, pattern, ignore_case)
+ when Regexp
+ MatchSubstringFamilyCondition.new("match_substring_regex",
+ @column,
+ pattern.source,
+ ignore_case)
Review Comment:
Thanks! It was a good lesson.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1499866583
:warning: GitHub issue #34951 **has been automatically assigned in GitHub** to PR creator.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Is it usable when we pass a string regular expression directly?
`match_substring_regex?("[dr]ow")`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161205613
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
No I don't have any specific use case.
But I feel it is wasteful to 'construct Regexp => pick up source => apply it as string to match_substring_regex()`.
Anyway I will follow you. Thanks.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1160398717
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
RejectCondition.new(@column, block)
end
+ def ends_with(substring, ignore_case: false)
Review Comment:
Other method including existing `match_substring` should have `?` suffix because this is Ruby API. :-)
```suggestion
def end_with?(substring, ignore_case: false)
```
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
RejectCondition.new(@column, block)
end
+ def ends_with(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
def match_substring(substring, ignore_case: false)
MatchSubstringFamilyCondition.new("match_substring",
@column, substring, ignore_case)
end
+
+ def match_substring_regex(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_substring_regex",
+ @column, pattern, ignore_case)
+ end
+
+ def starts_with(substring, ignore_case: false)
Review Comment:
```suggestion
def start_with?(substring, ignore_case: false)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161040527
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
RejectCondition.new(@column, block)
end
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like(pattern, ignore_case: false)
Review Comment:
```suggestion
def match_like?(pattern, ignore_case: false)
```
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
RejectCondition.new(@column, block)
end
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
def match_substring(substring, ignore_case: false)
Review Comment:
```suggestion
def match_substring?(substring, ignore_case: false)
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Can we accept `/[dr]ow/` too?
If `slicer.string.match_substring_regex(/[dr]ow/)` works, how about adding support for `slicer.string.match_substring(/[dr]ow/)` (== `slicer.string.amtch_substring_regex(/[dr]ow/)`)`?
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,10 +163,30 @@ def reject(&block)
RejectCondition.new(@column, block)
end
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
def match_substring(substring, ignore_case: false)
MatchSubstringFamilyCondition.new("match_substring",
@column, substring, ignore_case)
end
+
+ def match_substring_regex(pattern, ignore_case: false)
Review Comment:
```suggestion
def match_substring_regex?(pattern, ignore_case: false)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1500757680
Is it better to rename `#match_substring` to `#match_substring?` ?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Is it usable when we pass a regular expression as a string directly?
`match_substring_regex?("[dr]ow")`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161281080
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
OK.
Then we can improve the current APIs later.
I think that `match_substring_regexp?` isn't a good API because existing built-in Ruby objects don't provide `match_substring_regexp?` (`String` provides `start_with?`. So `start_with?` is a good API for Rubyists.) And compute function name isn't `match_substring_regexp` (`..._regex`). So `match_substring_regexp?` confuses us.
I think that we can find better names for this. For example:
* `column.match?(string_pattern)` -> `match_substring_regex(column, string_pattern)` (`String#match?` exists.)
* `column.match?(regexp)` -> `match_substring_regex(column, regexp.source)` (`String#match?` exists.)
* `column.include?(string)` -> `match_substring(column, string)` (`String#include?` exists.)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161051567
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Implemented above.
And renamed `match_substring_regex?` to `match_substring_regexp?`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161073990
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
RejectCondition.new(@column, block)
end
- def match_substring(substring, ignore_case: false)
- MatchSubstringFamilyCondition.new("match_substring",
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like?(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
+ def match_substring?(pattern, ignore_case: false)
+ case pattern
+ when String
+ MatchSubstringFamilyCondition.new("match_substring",
+ @column, pattern, ignore_case)
+ when Regexp
+ MatchSubstringFamilyCondition.new("match_substring_regex",
+ @column,
+ pattern.source,
+ ignore_case)
+ else
+ message =
+ "pattern must be either String or Regexp not #{pattern}.class"
Review Comment:
```suggestion
message =
"pattern must be either String or Regexp: #{pattern.inspect}"
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(invalid_pattern)") do
+ assert_raise(ArgumentError) do
Review Comment:
Could you also check error message?
```suggestion
assert_raise(ArgumentError.new("...")) do
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -517,9 +541,21 @@ def setup
TABLE
end
- test("!match_substring") do
+ test("!match_substring?") do
+ sliced_table = @table.slice do |slicer|
+ !slicer.string.match_substring?("arr")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 Arrow
+1 (null)
+2 window
+ TABLE
+ end
+
+ test("match_substring?(regexp)") do
Review Comment:
```suggestion
test("match_substring?(Regexp)") do
```
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,41 @@ def reject(&block)
RejectCondition.new(@column, block)
end
- def match_substring(substring, ignore_case: false)
- MatchSubstringFamilyCondition.new("match_substring",
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like?(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
+ def match_substring?(pattern, ignore_case: false)
+ case pattern
+ when String
+ MatchSubstringFamilyCondition.new("match_substring",
+ @column, pattern, ignore_case)
+ when Regexp
+ MatchSubstringFamilyCondition.new("match_substring_regex",
+ @column,
+ pattern.source,
+ ignore_case)
Review Comment:
Can we use `pattern.casefold?` as the default value of `ignore_case`?
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(invalid_pattern)") do
+ assert_raise(ArgumentError) do
+ sliced_table = @table.slice do |slicer|
Review Comment:
```suggestion
@table.slice do |slicer|
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161074590
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Ah, we may not need to provide `match_substring_regex*`.
How about providing only `match_substring?`?
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Can we accept `/[dr]ow/` too?
If `slicer.string.match_substring_regex(/[dr]ow/)` works, how about adding support for `slicer.string.match_substring(/[dr]ow/)` (== `slicer.string.match_substring_regex(/[dr]ow/)`)`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082542
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +540,40 @@ def setup
2 window
TABLE
end
+
+ test("match_like") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_like("_rr%")
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 (null)
+ TABLE
+ end
+
+ test("match_substring_regex") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring_regex("[dr]ow")
Review Comment:
Is it usable when we pass a regular expression as a string directly?
`match_substring_regexp?("[dr]ow")`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] heronshoes commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "heronshoes (via GitHub)" <gi...@apache.org>.
heronshoes commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161082740
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,48 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(invalid_pattern)") do
+ assert_raise(ArgumentError) do
Review Comment:
Done!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou merged pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou merged PR #34952:
URL: https://github.com/apache/arrow/pull/34952
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34952:
URL: https://github.com/apache/arrow/pull/34952#issuecomment-1499866561
* Closes: #34951
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kou commented on a diff in pull request #34952: GH-34951: [Ruby] Add methods using MatchSubStringFamilyCondition
Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34952:
URL: https://github.com/apache/arrow/pull/34952#discussion_r1161177920
##########
ruby/red-arrow/lib/arrow/slicer.rb:
##########
@@ -163,8 +163,44 @@ def reject(&block)
RejectCondition.new(@column, block)
end
- def match_substring(substring, ignore_case: false)
- MatchSubstringFamilyCondition.new("match_substring",
+ def end_with?(substring, ignore_case: false)
+ MatchSubstringFamilyCondition.new("ends_with",
+ @column, substring, ignore_case)
+ end
+
+ def match_like?(pattern, ignore_case: false)
+ MatchSubstringFamilyCondition.new("match_like",
+ @column, pattern, ignore_case)
+ end
+
+ def match_substring?(pattern, ignore_case: false)
+ case pattern
+ when String
+ MatchSubstringFamilyCondition.new("match_substring",
+ @column, pattern, ignore_case)
+ when Regexp
+ MatchSubstringFamilyCondition.new("match_substring_regex",
+ @column,
+ pattern.source,
+ pattern.casefold?)
Review Comment:
```suggestion
def match_substring?(pattern, ignore_case: nil)
case pattern
when String
ignore_case = false if ignore_case.nil?
MatchSubstringFamilyCondition.new("match_substring",
@column, pattern, ignore_case)
when Regexp
ignore_case = pattern.casefold? if ignore_case.nil?
MatchSubstringFamilyCondition.new("match_substring_regex",
@column,
pattern.source,
ignore_case)
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(/String/i") do
Review Comment:
```suggestion
test("match_substring?(/String/i)") do
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(/String/i") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring?(/arr/i)
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 carrot
+3 (null)
+ TABLE
+ end
+
+ test("match_substring?(invalid_pattern)") do
Review Comment:
```suggestion
test("match_substring?(Array)") do
```
##########
ruby/red-arrow/test/test-slicer.rb:
##########
@@ -528,5 +564,63 @@ def setup
2 window
TABLE
end
+
+ test("match_substring?(/String/i") do
+ sliced_table = @table.slice do |slicer|
+ slicer.string.match_substring?(/arr/i)
+ end
+ assert_equal(<<~TABLE, sliced_table.to_s)
+ string
+0 array
+1 Arrow
+2 carrot
+3 (null)
+ TABLE
+ end
+
+ test("match_substring?(invalid_pattern)") do
Review Comment:
Or
```suggestion
test("match_substring? - invalid") do
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org