You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/05 13:15:07 UTC

[GitHub] [arrow-rs] crepererum opened a new pull request #258: support full u64 roundtrip through parquet

crepererum opened a new pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258


   # Which issue does this PR close?
   
   Closes #254 .
   
   # Rationale for this change
   Up until now u64 values larger than `i64::MAX` were silently truncated to "invalid" when going through parquet. With this change we follow the C++ stack and 
   
   # What changes are included in this PR?
   
   - a bit of prep (first two commits)
   - map all u64 values to i64 via overflow/reinterpret cast
   - account of logical type `is_signed` flag in statistics calculation (as per spec)
   
   # Are there any user-facing changes?
   Yes, users can now expect to have full u64 storage support. Old files should still work since we previously marked values larger than `i64::MAX` as invalid.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-833456232


   The test failure
   https://github.com/apache/arrow-rs/pull/258/checks?check_run_id=2516139774 seems to be due to a network problem. Retriggering.
   
   ```
   
     nightly-2021-03-24-x86_64-unknown-linux-gnu installed - rustc 1.53.0-nightly (673d0db5e 2021-03-23)
   
   info: checking for self-updates
   error: could not download file from 'https://static.rust-lang.org/rustup/release-stable.toml' to '/tmp/rustup-updatemwNOD1/release-stable.toml'
   error: caused by: failed to make network request
   error: caused by: error sending request for url (https://static.rust-lang.org/rustup/release-stable.toml): connection error: Connection reset by peer (os error 104)
   error: caused by: connection error: Connection reset by peer (os error 104)
   error: caused by: Connection reset by peer (os error 104)
   Error: Process completed with exit code 1.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] crepererum commented on a change in pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
crepererum commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r627158342



##########
File path: parquet/src/arrow/array_reader.rs
##########
@@ -380,6 +380,18 @@ impl<T: DataType> ArrayReader for PrimitiveArrayReader<T> {
                 }
                 Arc::new(builder.finish()) as ArrayRef
             }
+            ArrowType::UInt64 => match array.data_type() {

Review comment:
       done.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-832688195


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (550a48f) into [master](https://codecov.io/gh/apache/arrow-rs/commit/508f25c10032857da34ea88cc8166f0741616a32?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (508f25c) will **increase** coverage by `0.00%`.
   > The diff coverage is `89.47%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master     #258   +/-   ##
   =======================================
     Coverage   82.53%   82.54%           
   =======================================
     Files         162      162           
     Lines       43786    43824   +38     
   =======================================
   + Hits        36140    36174   +34     
   - Misses       7646     7650    +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.64% <84.00%> (-0.13%)` | :arrow_down: |
   | [parquet/src/arrow/array\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyYXlfcmVhZGVyLnJz) | `77.13% <88.88%> (-0.01%)` | :arrow_down: |
   | [parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyLnJz) | `98.30% <95.65%> (-0.12%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [508f25c...550a48f](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] crepererum commented on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
crepererum commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-834132371


   @nevi-me good catch. Indeed there was the same bug. Added a test and fixed as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me commented on a change in pull request #258: support full u32 and u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r629513251



##########
File path: parquet/src/column/writer.rs
##########
@@ -925,31 +934,51 @@ impl<T: DataType> ColumnWriterImpl<T> {
     fn update_page_min_max(&mut self, val: &T::T) {
         // simple "isNaN" check that works for all types
         if val == val {
-            if self.min_page_value.as_ref().map_or(true, |min| min > val) {
+            if self
+                .min_page_value
+                .as_ref()
+                .map_or(true, |min| self.compare_greater(min, val))
+            {
                 self.min_page_value = Some(val.clone());
             }
-            if self.max_page_value.as_ref().map_or(true, |max| max < val) {
+            if self
+                .max_page_value
+                .as_ref()
+                .map_or(true, |max| self.compare_greater(val, max))
+            {
                 self.max_page_value = Some(val.clone());
             }
         }
     }
 
     fn update_column_min_max(&mut self) {
-        if self
-            .min_column_value
-            .as_ref()
-            .map_or(true, |min| min > self.min_page_value.as_ref().unwrap())
-        {
+        let update_min = self.min_column_value.as_ref().map_or(true, |min| {
+            let page_value = self.min_page_value.as_ref().unwrap();
+            self.compare_greater(min, page_value)
+        });
+        if update_min {
             self.min_column_value = self.min_page_value.clone();
         }
-        if self
-            .max_column_value
-            .as_ref()
-            .map_or(true, |max| max < self.max_page_value.as_ref().unwrap())
-        {
+
+        let update_max = self.max_column_value.as_ref().map_or(true, |max| {
+            let page_value = self.max_page_value.as_ref().unwrap();
+            self.compare_greater(page_value, max)
+        });
+        if update_max {
             self.max_column_value = self.max_page_value.clone();
         }
     }
+
+    /// Evaluate `a > b` according to underlying logical type.
+    fn compare_greater(&self, a: &T::T, b: &T::T) -> bool {
+        if let Some(LogicalType::INTEGER(int_type)) = self.descr.logical_type() {

Review comment:
       Note: not all implementations might write LogicalType, even though ConvertedType is deprecated. I had to think a bit about this, but it's fine because it's on the write-side, where we will always write LogicalType going forward.
   
   If this was on the read side, we'd have to also check `ConvertedType` as a fallback.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on a change in pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r626732147



##########
File path: parquet/src/arrow/array_reader.rs
##########
@@ -380,6 +380,18 @@ impl<T: DataType> ArrayReader for PrimitiveArrayReader<T> {
                 }
                 Arc::new(builder.finish()) as ArrayRef
             }
+            ArrowType::UInt64 => match array.data_type() {

Review comment:
       I was thinking of the additional copy that comes from the `arity` call -- specifically https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/arity.rs#L64
   
   The `arity` call will be in addition to any copying that `cast` does, I think.
   
   Also, when the types are the same, then cast is a [noop](https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/cast.rs#L300)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] crepererum commented on pull request #258: support full u32 and u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
crepererum commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-836321236


   ready :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me commented on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-834080661


   @crepererum thanks for picking this up, and working on it. Do we have the same risk for `u32` <> `i32`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] crepererum commented on a change in pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
crepererum commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r626724693



##########
File path: parquet/src/arrow/array_reader.rs
##########
@@ -380,6 +380,18 @@ impl<T: DataType> ArrayReader for PrimitiveArrayReader<T> {
                 }
                 Arc::new(builder.finish()) as ArrayRef
             }
+            ArrowType::UInt64 => match array.data_type() {

Review comment:
       As far as I can see we always cast the array, see default case in line 383 (pre-patch) / 395 (post-patch). So we would always copy, no?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter commented on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-832688195


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (53c1ece) into [master](https://codecov.io/gh/apache/arrow-rs/commit/508f25c10032857da34ea88cc8166f0741616a32?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (508f25c) will **increase** coverage by `0.00%`.
   > The diff coverage is `87.27%`.
   
   > :exclamation: Current head 53c1ece differs from pull request most recent head b9479b4. Consider uploading reports for the commit b9479b4 to get more accurate results
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master     #258   +/-   ##
   =======================================
     Coverage   82.53%   82.54%           
   =======================================
     Files         162      162           
     Lines       43786    43827   +41     
   =======================================
   + Hits        36140    36176   +36     
   - Misses       7646     7651    +5     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [parquet/src/arrow/array\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyYXlfcmVhZGVyLnJz) | `77.10% <71.42%> (-0.04%)` | :arrow_down: |
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.64% <84.00%> (-0.13%)` | :arrow_down: |
   | [parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyLnJz) | `98.30% <95.65%> (-0.12%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [508f25c...b9479b4](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #258: support full u32 and u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-834687708


   #267 has been merged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #258: support full u32 and u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-832688195


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (2710f31) into [master](https://codecov.io/gh/apache/arrow-rs/commit/8bd769bc118f617580cfaa7b1e654159c0fb696b?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (8bd769b) will **increase** coverage by `0.01%`.
   > The diff coverage is `90.21%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #258      +/-   ##
   ==========================================
   + Coverage   82.53%   82.54%   +0.01%     
   ==========================================
     Files         162      162              
     Lines       43796    43862      +66     
   ==========================================
   + Hits        36149    36208      +59     
   - Misses       7647     7654       +7     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.64% <84.00%> (-0.13%)` | :arrow_down: |
   | [parquet/src/arrow/array\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyYXlfcmVhZGVyLnJz) | `77.12% <85.71%> (-0.02%)` | :arrow_down: |
   | [parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyLnJz) | `98.04% <94.33%> (-0.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [8bd769b...2710f31](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] nevi-me merged pull request #258: support full u32 and u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
nevi-me merged pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-832688195


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (550a48f) into [master](https://codecov.io/gh/apache/arrow-rs/commit/508f25c10032857da34ea88cc8166f0741616a32?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (508f25c) will **increase** coverage by `0.00%`.
   > The diff coverage is `89.47%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master     #258   +/-   ##
   =======================================
     Coverage   82.53%   82.54%           
   =======================================
     Files         162      162           
     Lines       43786    43824   +38     
   =======================================
   + Hits        36140    36174   +34     
   - Misses       7646     7650    +4     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.64% <84.00%> (-0.13%)` | :arrow_down: |
   | [parquet/src/arrow/array\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyYXlfcmVhZGVyLnJz) | `77.13% <88.88%> (-0.01%)` | :arrow_down: |
   | [parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyLnJz) | `98.30% <95.65%> (-0.12%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [508f25c...550a48f](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on a change in pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#discussion_r626663562



##########
File path: parquet/src/arrow/array_reader.rs
##########
@@ -380,6 +380,18 @@ impl<T: DataType> ArrayReader for PrimitiveArrayReader<T> {
                 }
                 Arc::new(builder.finish()) as ArrayRef
             }
+            ArrowType::UInt64 => match array.data_type() {

Review comment:
       This code was confusing to me (as it is manipulating the Arrow array after it has been created rather than manipulating the data prior to creating the array). 
   
   It also seems like it will result in a copy of all Int64 columns which may not be idea.
   
   I wonder if you considered creating the `Unit64Array` directly from the parquet data up here: https://github.com/apache/arrow-rs/pull/258/files#diff-0d6bed48d78c5a2472b7680a8185cabdc0bd259d6484e184439ed7830060661fR316-R319
    instead?
   
   That may be clearer to understand as well as more performant
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter edited a comment on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-832688195


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#258](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0a83a42) into [master](https://codecov.io/gh/apache/arrow-rs/commit/a870b24bd4eb76d3e0e5c718c9956a7dcdee52fd?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (a870b24) will **increase** coverage by `0.01%`.
   > The diff coverage is `90.21%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/258/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #258      +/-   ##
   ==========================================
   + Coverage   82.53%   82.54%   +0.01%     
   ==========================================
     Files         162      162              
     Lines       43799    43865      +66     
   ==========================================
   + Hits        36151    36210      +59     
   - Misses       7648     7655       +7     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvY29sdW1uL3dyaXRlci5ycw==) | `93.64% <84.00%> (-0.13%)` | :arrow_down: |
   | [parquet/src/arrow/array\_reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyYXlfcmVhZGVyLnJz) | `77.12% <85.71%> (-0.02%)` | :arrow_down: |
   | [parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow-rs/pull/258/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvYXJyb3cvYXJyb3dfd3JpdGVyLnJz) | `98.04% <94.33%> (-0.38%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [a870b24...0a83a42](https://codecov.io/gh/apache/arrow-rs/pull/258?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] crepererum commented on pull request #258: support full u64 roundtrip through parquet

Posted by GitBox <gi...@apache.org>.
crepererum commented on pull request #258:
URL: https://github.com/apache/arrow-rs/pull/258#issuecomment-834148722


   Will rebase once #267 is merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org