You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "askoa (via GitHub)" <gi...@apache.org> on 2023/02/09 10:25:29 UTC
[GitHub] [arrow-rs] askoa opened a new pull request, #3681: fix: Handle sliced array in run array iterator
askoa opened a new pull request, #3681:
URL: https://github.com/apache/arrow-rs/pull/3681
# Which issue does this PR close?
# Rationale for this change
The users might want to iterate on sliced run array.
# What changes are included in this PR?
Built on top of yet to be merged PR #3662
The current version of run array iterator `run_iterator.rs` does not work for sliced array. This PR makes run array iterator work for sliced array.
# Are there any user-facing changes?
Run encoded arrays are still under development and most likely not yet used.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator
Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104471075
##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
{
/// create a new iterator
pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
- let logical_len = array.len();
- let physical_len: usize = array.values().len();
+ let current_front_physical: usize = if array.offset() > 0 {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(array.offset())
+ .unwrap()
+ } else {
+ 0
+ };
+ let sliced_end: usize = array.offset() + array.len();
+ let current_back_physical: usize =
+ if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+ array.run_ends().len()
+ } else {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(sliced_end - 1)
+ .unwrap()
+ + 1
+ };
Review Comment:
```suggestion
let current_front_physical = array.run_array().get_physical_index(0).unwrap();
let current_back_physical = array.run_array().get_physical_index(array.len()).unwrap();
```
Perhaps?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] askoa commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator
Posted by "askoa (via GitHub)" <gi...@apache.org>.
askoa commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104613424
##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
{
/// create a new iterator
pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
- let logical_len = array.len();
- let physical_len: usize = array.values().len();
+ let current_front_physical: usize = if array.offset() > 0 {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(array.offset())
+ .unwrap()
+ } else {
+ 0
+ };
+ let sliced_end: usize = array.offset() + array.len();
+ let current_back_physical: usize =
+ if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+ array.run_ends().len()
+ } else {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(sliced_end - 1)
+ .unwrap()
+ + 1
+ };
Review Comment:
It's a good suggestion. I fixed it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator
Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104471075
##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
{
/// create a new iterator
pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
- let logical_len = array.len();
- let physical_len: usize = array.values().len();
+ let current_front_physical: usize = if array.offset() > 0 {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(array.offset())
+ .unwrap()
+ } else {
+ 0
+ };
+ let sliced_end: usize = array.offset() + array.len();
+ let current_back_physical: usize =
+ if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+ array.run_ends().len()
+ } else {
+ array
+ .run_array()
+ .get_zero_offset_physical_index(sliced_end - 1)
+ .unwrap()
+ + 1
+ };
Review Comment:
```suggestion
let current_front_physical = array.run_array().get_physical_index(0).unwrap();
let current_back_physical = array.run_array().get_physical_index(array.len()).unwrap();
```
Perhaps? Just thinking about how we can DRY up this logic, as I remember similar logic appearing elsewhere before. If there is some performance concern, perhaps we can address that?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] tustvold merged pull request #3681: fix: Handle sliced array in run array iterator
Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold merged PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] ursabot commented on pull request #3681: fix: Handle sliced array in run array iterator
Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#issuecomment-1428200956
Benchmark runs are scheduled for baseline = d011e6adc6c7ff0ae72784e89cf736112016c9c5 and contender = 5ffc0a87dd5abb4f7db1172bac6ed93da95827f7. 5ffc0a87dd5abb4f7db1172bac6ed93da95827f7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/64eec01adb334d458ae3b99308e2a81c...aed04c9ec5c444449422ffdf1e829094/)
[Skipped :warning: Benchmarking of arrow-rs-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/3dfc8018a8924c248e793cdcc19a315a...22885bc735494e31a2c442370dd2eb20/)
[Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/d26cf514779545049c475e15993f8d52...a44d208c993747a190711d26d9ec49d9/)
[Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0e6442473f294c00ac6b68202dad39d0...44718096d5064a4b9c2a4b2f406d3087/)
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org