You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "askoa (via GitHub)" <gi...@apache.org> on 2023/02/09 10:25:29 UTC

[GitHub] [arrow-rs] askoa opened a new pull request, #3681: fix: Handle sliced array in run array iterator

askoa opened a new pull request, #3681:
URL: https://github.com/apache/arrow-rs/pull/3681

   # Which issue does this PR close?
   
   # Rationale for this change
   The users might want to iterate on sliced run array.
   
   # What changes are included in this PR?
   
   Built on top of yet to be merged PR #3662
   
   The current version of run array iterator `run_iterator.rs` does not work for sliced array. This PR makes run array iterator work for sliced array.
   
   # Are there any user-facing changes?
   Run encoded arrays are still under development and most likely not yet used.  
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104471075


##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
 {
     /// create a new iterator
     pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
-        let logical_len = array.len();
-        let physical_len: usize = array.values().len();
+        let current_front_physical: usize = if array.offset() > 0 {
+            array
+                .run_array()
+                .get_zero_offset_physical_index(array.offset())
+                .unwrap()
+        } else {
+            0
+        };
+        let sliced_end: usize = array.offset() + array.len();
+        let current_back_physical: usize =
+            if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+                array.run_ends().len()
+            } else {
+                array
+                    .run_array()
+                    .get_zero_offset_physical_index(sliced_end - 1)
+                    .unwrap()
+                    + 1
+            };

Review Comment:
   ```suggestion
           let current_front_physical = array.run_array().get_physical_index(0).unwrap();
           let current_back_physical = array.run_array().get_physical_index(array.len()).unwrap();
   ```
   
   Perhaps?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] askoa commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator

Posted by "askoa (via GitHub)" <gi...@apache.org>.
askoa commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104613424


##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
 {
     /// create a new iterator
     pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
-        let logical_len = array.len();
-        let physical_len: usize = array.values().len();
+        let current_front_physical: usize = if array.offset() > 0 {
+            array
+                .run_array()
+                .get_zero_offset_physical_index(array.offset())
+                .unwrap()
+        } else {
+            0
+        };
+        let sliced_end: usize = array.offset() + array.len();
+        let current_back_physical: usize =
+            if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+                array.run_ends().len()
+            } else {
+                array
+                    .run_array()
+                    .get_zero_offset_physical_index(sliced_end - 1)
+                    .unwrap()
+                    + 1
+            };

Review Comment:
   It's a good suggestion. I fixed it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3681: fix: Handle sliced array in run array iterator

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on code in PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#discussion_r1104471075


##########
arrow-array/src/run_iterator.rs:
##########
@@ -57,14 +59,31 @@ where
 {
     /// create a new iterator
     pub fn new(array: TypedRunArray<'a, R, V>) -> Self {
-        let logical_len = array.len();
-        let physical_len: usize = array.values().len();
+        let current_front_physical: usize = if array.offset() > 0 {
+            array
+                .run_array()
+                .get_zero_offset_physical_index(array.offset())
+                .unwrap()
+        } else {
+            0
+        };
+        let sliced_end: usize = array.offset() + array.len();
+        let current_back_physical: usize =
+            if sliced_end == RunArray::<R>::logical_len(array.run_ends()) {
+                array.run_ends().len()
+            } else {
+                array
+                    .run_array()
+                    .get_zero_offset_physical_index(sliced_end - 1)
+                    .unwrap()
+                    + 1
+            };

Review Comment:
   ```suggestion
           let current_front_physical = array.run_array().get_physical_index(0).unwrap();
           let current_back_physical = array.run_array().get_physical_index(array.len()).unwrap();
   ```
   
   Perhaps? Just thinking about how we can DRY up this logic, as I remember similar logic appearing elsewhere before. If there is some performance concern, perhaps we can address that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #3681: fix: Handle sliced array in run array iterator

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold merged PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] ursabot commented on pull request #3681: fix: Handle sliced array in run array iterator

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #3681:
URL: https://github.com/apache/arrow-rs/pull/3681#issuecomment-1428200956

   Benchmark runs are scheduled for baseline = d011e6adc6c7ff0ae72784e89cf736112016c9c5 and contender = 5ffc0a87dd5abb4f7db1172bac6ed93da95827f7. 5ffc0a87dd5abb4f7db1172bac6ed93da95827f7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/64eec01adb334d458ae3b99308e2a81c...aed04c9ec5c444449422ffdf1e829094/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on test-mac-arm] [test-mac-arm](https://conbench.ursa.dev/compare/runs/3dfc8018a8924c248e793cdcc19a315a...22885bc735494e31a2c442370dd2eb20/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/d26cf514779545049c475e15993f8d52...a44d208c993747a190711d26d9ec49d9/)
   [Skipped :warning: Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/0e6442473f294c00ac6b68202dad39d0...44718096d5064a4b9c2a4b2f406d3087/)
   Buildkite builds:
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org