Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/19 07:40:49 UTC

[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1905: Refine the `bit_util` of Parquet.

tustvold commented on code in PR #1905:
URL: https://github.com/apache/arrow-rs/pull/1905#discussion_r901061978


##########
parquet/src/util/bit_util.rs:
##########
@@ -133,38 +133,23 @@ where
     memcpy(&source.as_bytes()[..num_bytes], target)
 }
 
-/// Returns the ceil of value/divisor
+/// Returns the ceil of value/divisor.
+///
+/// This function should be removed after
+/// [`int_roundings`](https://github.com/rust-lang/rust/issues/88581) is stable.
 #[inline]
 pub fn ceil(value: i64, divisor: i64) -> i64 {
-    value / divisor + ((value % divisor != 0) as i64)
-}
-
-/// Returns ceil(log2(x))
-#[inline]
-pub fn log2(mut x: u64) -> i32 {
-    if x == 1 {
-        return 0;
-    }
-    x -= 1;
-    let mut result = 0;
-    while x > 0 {
-        x >>= 1;
-        result += 1;
-    }
-    result
+    num::Integer::div_ceil(&value, &divisor)
 }
 
 /// Returns the `num_bits` least-significant bits of `v`
 #[inline]
 pub fn trailing_bits(v: u64, num_bits: usize) -> u64 {
-    if num_bits == 0 {
-        return 0;
-    }
     if num_bits >= 64 {
-        return v;
+        v
+    } else {

Review Comment:
   I'm not sure this `if` block is actually necessary.
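
One way to check whether the guard is needed is to compare a guarded mask against a variant that handles the wide case through `checked_shl`. This is only a sketch: the hunk above is truncated at the `else`, so the masking expression `v & ((1 << num_bits) - 1)` is an assumption, and both function names below are hypothetical rather than taken from the PR.

```rust
/// Hypothetical guarded version. The body of the `else` branch is an
/// assumption, since the diff above is cut off before it.
pub fn trailing_bits_branch(v: u64, num_bits: usize) -> u64 {
    if num_bits >= 64 {
        v
    } else {
        // Safe: num_bits < 64, so the shift cannot overflow.
        // For num_bits == 0 the mask is 0, so no separate zero check is needed.
        v & ((1u64 << num_bits) - 1)
    }
}

/// Hypothetical variant without an explicit comparison against 64.
/// `checked_shl` returns `None` when the shift amount is >= 64,
/// which stands in for the guard.
pub fn trailing_bits_checked(v: u64, num_bits: usize) -> u64 {
    // Clamp first so the cast to u32 cannot truncate a large usize.
    let n = num_bits.min(64) as u32;
    match 1u64.checked_shl(n) {
        Some(bit) => v & (bit - 1),
        None => v,
    }
}
```

Under that assumption, some handling of `num_bits >= 64` still seems to be required, because a plain `1u64 << 64` panics in debug builds. The `num_bits == 0` case, by contrast, falls out of the mask naturally.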



##########
parquet/src/encodings/encoding.rs:
##########
@@ -307,12 +307,10 @@ impl<T: DataType> DictEncoder<T> {
     #[inline]
     fn bit_width(&self) -> u8 {
         let num_entries = self.uniques.len();
-        if num_entries == 0 {
-            0
-        } else if num_entries == 1 {
-            1
+        if num_entries <= 1 {
+            num_entries as u8
         } else {
-            log2(num_entries as u64) as u8
+            num_required_bits(num_entries as u64 - 1)

Review Comment:
   Is this actually correct, or was this a pre-existing bug? Why is the bit width here 1 less?
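
A quick way to answer this is to compare the removed `log2` helper against the new expression over a range of dictionary sizes. The sketch below copies `log2` from the hunk above. The body of `num_required_bits` is not shown in this diff, so the `leading_zeros` version here is an assumption (the minimum number of bits needed to represent `x`).

```rust
/// The removed helper, copied from the diff above: returns ceil(log2(x)).
pub fn log2(mut x: u64) -> i32 {
    if x == 1 {
        return 0;
    }
    x -= 1;
    let mut result = 0;
    while x > 0 {
        x >>= 1;
        result += 1;
    }
    result
}

/// Assumed definition (not shown in this diff): the number of bits
/// needed to represent `x`, with 0 needing 0 bits.
pub fn num_required_bits(x: u64) -> u8 {
    (64 - x.leading_zeros()) as u8
}

/// Returns true if the old and new bit-width expressions agree for
/// every dictionary size in 2..=max_entries.
pub fn widths_agree(max_entries: u64) -> bool {
    (2..=max_entries).all(|n| log2(n) as u8 == num_required_bits(n - 1))
}
```

Under that assumption the two expressions agree for every `num_entries >= 2`: a dictionary with `n` entries has indices `0..=n-1`, so the width is the number of bits needed for the largest index `n - 1`, which equals `ceil(log2(n))`. The width itself is unchanged. The `- 1` only converts an entry count into a maximum index.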


