Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/07/21 14:18:53 UTC

[GitHub] [arrow-rs] Jimexist commented on a change in pull request #585: use exponential search in lexico partition to speed up

Jimexist commented on a change in pull request #585:
URL: https://github.com/apache/arrow-rs/pull/585#discussion_r674016802



##########
File path: arrow/src/compute/kernels/partition.rs
##########
@@ -73,6 +73,25 @@ impl<'a> LexicographicalPartitionIterator<'a> {
     }
 }
 
+/// Exponential search is to remedy for the case when array size and cardinality are both large
+/// see <https://en.wikipedia.org/wiki/Exponential_search>
+#[inline]
+fn exponential_search(
+    indices: &[usize],
+    target: &usize,
+    comparator: &LexicographicalComparator<'_>,
+) -> usize {
+    let mut bound = 1;
+    while bound < indices.len()
+        && comparator.compare(&indices[bound], target) != Ordering::Greater
+    {
+        bound *= 2;
+    }
+    (bound / 2)

Review comment:
Invariant after the `while` loop:

`indices[bound / 2] <= target < indices[min(indices.len(), bound + 1)]`, where `<=` and `<` are defined by the `comparator`. Note the `bound + 1`: the `while` loop might exit when `target == indices[bound]`.
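To make the bracketing concrete, here is a minimal standalone sketch of exponential search followed by a binary search inside the bracket. It is not the PR's code: `exponential_partition_point`, `sorted`, and `compare` are hypothetical names, and the closure compares values directly instead of going through `LexicographicalComparator`. The sketch assumes the slice is sorted under `compare`. With a loop that continues on `Equal`, as in the hunk above, the exit happens either because `bound` ran past the slice or because `sorted[bound]` compares greater than `target`; the exclusive end `bound + 1` simply keeps `sorted[bound]` inside the searched range.

```rust
use std::cmp::Ordering;

/// Returns the index of the first element that compares greater than `target`,
/// assuming `sorted` is ordered under `compare` (hypothetical helper, not the PR's API).
fn exponential_partition_point<T, F>(sorted: &[T], target: &T, compare: F) -> usize
where
    F: Fn(&T, &T) -> Ordering,
{
    // Double `bound` until it leaves the slice or lands on an element > target.
    let mut bound = 1;
    while bound < sorted.len() && compare(&sorted[bound], target) != Ordering::Greater {
        bound *= 2;
    }
    // The last successful probe was at `bound / 2`, so the answer lies in
    // [bound / 2, min(len, bound + 1)); the end is exclusive, so `sorted[bound]`
    // is still examined when it exists.
    let lo = bound / 2;
    let hi = sorted.len().min(bound + 1);
    // Binary search within the bracket for the first element > target.
    lo + sorted[lo..hi].partition_point(|x| compare(x, target) != Ordering::Greater)
}
```

For example, on `[1, 1, 1, 2, 2, 3, 3, 3, 3, 5]` with target `1`, the probes hit indices 1, 2, and 4, the bracket becomes `[2, 5)`, and the call returns `3`, the end of the run of `1`s.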



