You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/04 07:00:24 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

jorgecarleitao opened a new pull request #8829:
URL: https://github.com/apache/arrow/pull/8829


   Creating this PR associated with the issue. More investigations are required here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-741922959


   @Dandandan , your tip helped. Thanks a lot! It compiles now. The semantics are still wrong and I need to figure out why.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-740440493


   @jorgecarleitao  Did not find an easy fix yet. What works, though ugly, is using the `self.def_levels` again in the two `if` branches. 
   
   ![image](https://user-images.githubusercontent.com/163737/101453525-c4077000-392f-11eb-9786-647a7ee19ad1.png)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #8829: ARROW-10804: Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-744002322


   FWIW I am not sure but this PR may conflict with https://github.com/apache/arrow/pull/8698


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-738614616


   https://issues.apache.org/jira/browse/ARROW-10804


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743721311


   > @nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.
   
   Hey Jorge, I've been out of town, got back this morning. I'm going to catch up on PRs this weekend. I'll have a look at this first.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-741929506


   @jorgecarleitao great to hear! I saw it depends at least on changing the value of `values_written` but hope it helps in finding a solution :+1: 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#discussion_r537871324



##########
File path: rust/parquet/src/arrow/record_reader.rs
##########
@@ -340,42 +297,41 @@ impl<T: DataType> RecordReader<T> {
         }
 
         // Convert mutable buffer spaces to mutable slices
-        let mut values_buf = FatPtr::<T::T>::with_offset_and_size(
-            &mut self.records,
-            self.values_written,
-            T::get_type_size(),
-        );
+        let (prefix, values, suffix) = unsafe { self.records.data_mut().align_to_mut::<T::T>() };
+        assert!(prefix.is_empty() && suffix.is_empty());
+        let values = &mut values[self.values_written..];
 
         let values_written = self.values_written;
-        let mut def_levels_buf = self
+        let def_levels = self
             .def_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
-        let mut rep_levels_buf = self
+        let rep_levels = self
             .rep_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
         let (values_read, levels_read) =
             self.column_reader.as_mut().unwrap().read_batch(
                 batch_size,
-                def_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                rep_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                values_buf.to_slice_mut(),
+                def_levels,

Review comment:
       I will have some time tomorrow to check out the code to see what can be done about it otherwise




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743716160


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=h1) Report
   > Merging [#8829](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=desc) (1b4bf9f) into [master](https://codecov.io/gh/apache/arrow/commit/2816f37ff01cfd31101b3ee6cc8574cc9246dd1b?el=desc) (2816f37) will **decrease** coverage by `23.01%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8829/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #8829       +/-   ##
   ===========================================
   - Coverage   76.99%   53.97%   -23.02%     
   ===========================================
     Files         174      170        -4     
     Lines       40392    30699     -9693     
   ===========================================
   - Hits        31099    16571    -14528     
   - Misses       9293    14128     +4835     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
   | [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
   | [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
   | [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
   | [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
   | [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
   | [rust/parquet/src/file/statistics.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3N0YXRpc3RpY3MucnM=) | `0.00% <0.00%> (-94.41%)` | :arrow_down: |
   | [rust/parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vd3JpdGVyLnJz) | `0.00% <0.00%> (-93.90%)` | :arrow_down: |
   | ... and [52 more](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=footer). Last update [2816f37...1b4bf9f](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#discussion_r537870768



##########
File path: rust/parquet/src/arrow/record_reader.rs
##########
@@ -340,42 +297,41 @@ impl<T: DataType> RecordReader<T> {
         }
 
         // Convert mutable buffer spaces to mutable slices
-        let mut values_buf = FatPtr::<T::T>::with_offset_and_size(
-            &mut self.records,
-            self.values_written,
-            T::get_type_size(),
-        );
+        let (prefix, values, suffix) = unsafe { self.records.data_mut().align_to_mut::<T::T>() };
+        assert!(prefix.is_empty() && suffix.is_empty());
+        let values = &mut values[self.values_written..];
 
         let values_written = self.values_written;
-        let mut def_levels_buf = self
+        let def_levels = self
             .def_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
-        let mut rep_levels_buf = self
+        let rep_levels = self
             .rep_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
         let (values_read, levels_read) =
             self.column_reader.as_mut().unwrap().read_batch(
                 batch_size,
-                def_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                rep_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                values_buf.to_slice_mut(),
+                def_levels,

Review comment:
       The error originates from here, the error shows that it is moved here. A clone likely "solves" the issue? Usually taking the value by `&` in the function also solves this as it will stop it from taking ownership?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] Dandandan commented on a change in pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#discussion_r537870942



##########
File path: rust/parquet/src/arrow/record_reader.rs
##########
@@ -340,42 +297,41 @@ impl<T: DataType> RecordReader<T> {
         }
 
         // Convert mutable buffer spaces to mutable slices
-        let mut values_buf = FatPtr::<T::T>::with_offset_and_size(
-            &mut self.records,
-            self.values_written,
-            T::get_type_size(),
-        );
+        let (prefix, values, suffix) = unsafe { self.records.data_mut().align_to_mut::<T::T>() };
+        assert!(prefix.is_empty() && suffix.is_empty());
+        let values = &mut values[self.values_written..];
 
         let values_written = self.values_written;
-        let mut def_levels_buf = self
+        let def_levels = self
             .def_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
-        let mut rep_levels_buf = self
+        let rep_levels = self
             .rep_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, values_written));
+            .map(|buf| {
+                let (prefix, values, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };
+                assert!(prefix.is_empty() && suffix.is_empty());
+                &mut values[values_written..]
+            });
 
         let (values_read, levels_read) =
             self.column_reader.as_mut().unwrap().read_batch(
                 batch_size,
-                def_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                rep_levels_buf.as_mut().map(|ptr| ptr.to_slice_mut()),
-                values_buf.to_slice_mut(),
+                def_levels,

Review comment:
       @jorgecarleitao 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743720998


   @nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.
   
   The background of this change is that `FatPtr` is using `Buffer::capacity` and `Buffer::raw_data()` which is brittle and difficult to reason because a pointer is only valid within a container's `len`, not `capacity`. This is also causing the error that rust compiler safeguard us against when borrowing a mutable and immutable reference at the same time.
   
   This is blocking #8796 
   
   I verified that in all 4 tests the content passed to `ColumnReaderImpl::read_batch` is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other assumption on that call beyond what we are telling the compiler about.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on a change in pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
nevi-me commented on a change in pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#discussion_r541782467



##########
File path: rust/parquet/src/arrow/record_reader.rs
##########
@@ -58,40 +55,6 @@ struct FatPtr<'a, T> {
     ptr: &'a mut [T],

Review comment:
       We can also remove this

##########
File path: rust/parquet/src/arrow/record_reader.rs
##########
@@ -424,13 +389,16 @@ impl<T: DataType> RecordReader<T> {
     /// Split values into records according repetition definition and returns number of
     /// records read.
     fn split_records(&mut self, records_to_read: usize) -> Result<usize> {
-        let rep_levels_buf = self
+        let rep_levels = self
             .rep_levels
             .as_mut()
-            .map(|buf| FatPtr::<i16>::with_offset(buf, 0));
-        let rep_levels_buf = rep_levels_buf.as_ref().map(|x| x.to_slice());
+            .map(|buf| {
+                let (prefix, rep_levels, suffix) = unsafe { buf.data_mut().align_to_mut::<i16>() };

Review comment:
       It looks like this doesn't need to be mutable. Compiles fine if I change to the below (and changing the `as_mut` to `as_ref`
   ```suggestion
                   let (prefix, rep_levels, suffix) = unsafe { buf.data().align_to::<i16>() };
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8829: ARROW-10804: Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-744040215


   @alamb they fortunately don't seem to conflict


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #8829:
URL: https://github.com/apache/arrow/pull/8829


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-741907889


   > Did not find an easy fix yet. What works, though ugly, is using the `self.def_levels` again in the two `if` branches.
   > 
   > ![image](https://user-images.githubusercontent.com/163737/101453525-c4077000-392f-11eb-9786-647a7ee19ad1.png)
   
   Thanks for the help. I tried that, but it is causing a double mutable borrow on my code. Maybe I am doing something wrong. Could you push to the change to a branch in your repo and share it with me?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
jorgecarleitao edited a comment on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743720998


   @nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.
   
   The background of this change is that `FatPtr` is using `Buffer::capacity` and `Buffer::raw_data()` which is brittle and difficult to reason because a pointer is only valid within a container's `len`, not `capacity`. This is also leading to the error that the rust compiler safeguard us against when borrowing a mutable and immutable reference at the same time.
   
   This is blocking #8796 
   
   I verified that in all 4 tests the content passed to `ColumnReaderImpl::read_batch` is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other invariant beyond the ones that we are telling the compiler about through the typing and lifetime system.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao edited a comment on pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
jorgecarleitao edited a comment on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743720998


   @nevi-me , I would really appreciate your help here: there are major problems emerging in the parquet crate when I try to remove an unsafe struct.
   
   The background of this change is that `FatPtr` is using `Buffer::capacity` and `Buffer::raw_data()` which is brittle and difficult to reason because a pointer is only valid within a container's `len`, not `capacity`. This is also causing the error that rust compiler safeguard us against when borrowing a mutable and immutable reference at the same time.
   
   This is blocking #8796 
   
   I verified that in all 4 tests the content passed to `ColumnReaderImpl::read_batch` is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other invariant beyond the ones that we are telling the compiler about through the typing and lifetime system.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] alamb commented on pull request #8829: ARROW-10804: [Rust] Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-745591301


   I just merged this code to master locally to test the effects of https://github.com/apache/arrow/pull/8698 -- I am happy to report, as @nevi-me  said, the tests still pass just fine


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nevi-me commented on pull request #8829: ARROW-10804: Removed undefined behavior from parquet crate, causing tests to fail

Posted by GitBox <gi...@apache.org>.
nevi-me commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743885020


   Hi @jorgecarleitao, I also spent most of today looking into this :( 
   
   The problem was a bit vs byte issue on `consume_record_data` and `consume_rep_levels`, so what was mostly needed was a fresh pair of eyes. I opened https://github.com/jorgecarleitao/arrow/pull/22. I didn't address the `rustfmt` lint, so you could see what I changed.
   
   > I verified that in all 4 tests the content passed to `ColumnReaderImpl::read_batch` is the same in master and after this PR, but for some reason the outcome of that call is different, which hints that there is some other invariant beyond the ones that we are telling the compiler about through the typing and lifetime system.
   
   This was hidden away by the way that we compare arrays. Because we only print the first 10 & last 10 values of arrays, the issue wasn't visible, but comparing `ArrayData` surfaced the issue with the failing tests. I had to change the test utilities that generate data to return sequential values instead of randomly generated that. Then I was able to see why `consume_record_data` was filling the first 12 and last 12 slots with data (then the part that we don't print was just 0s).
   
   I also changed the code slightly to rather initialise the buffers with `let new_buffer = MutableBuffer::new(0)` because we always perform an allocation on `new_buffer.resize(num_bytes)` as we never get the initial allocation right anyways.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #8829: ARROW-10804: Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743716160


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=h1) Report
   > Merging [#8829](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=desc) (66cfc83) into [master](https://codecov.io/gh/apache/arrow/commit/2816f37ff01cfd31101b3ee6cc8574cc9246dd1b?el=desc) (2816f37) will **decrease** coverage by `0.53%`.
   > The diff coverage is `88.88%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8829/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8829      +/-   ##
   ==========================================
   - Coverage   76.99%   76.45%   -0.54%     
   ==========================================
     Files         174      177       +3     
     Lines       40392    40800     +408     
   ==========================================
   + Hits        31099    31194      +95     
   - Misses       9293     9606     +313     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `96.25% <88.88%> (+1.71%)` | :arrow_up: |
   | [rust/arrow/src/ipc/writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvaXBjL3dyaXRlci5ycw==) | `83.65% <0.00%> (-3.44%)` | :arrow_down: |
   | [rust/arrow/src/json/reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvanNvbi9yZWFkZXIucnM=) | `83.17% <0.00%> (-0.09%)` | :arrow_down: |
   | [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `97.96% <0.00%> (ø)` | |
   | [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `76.85% <0.00%> (ø)` | |
   | [rust/arrow-flight/src/utils.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy1mbGlnaHQvc3JjL3V0aWxzLnJz) | `0.00% <0.00%> (ø)` | |
   | [rust/datafusion/src/optimizer/utils.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9vcHRpbWl6ZXIvdXRpbHMucnM=) | `0.00% <0.00%> (ø)` | |
   | [rust/datafusion/src/datasource/memory.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9kYXRhc291cmNlL21lbW9yeS5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/datafusion/src/execution/context.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9leGVjdXRpb24vY29udGV4dC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/datafusion/src/logical\_plan/plan.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9sb2dpY2FsX3BsYW4vcGxhbi5ycw==) | `0.00% <0.00%> (ø)` | |
   | ... and [13 more](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=footer). Last update [2816f37...66cfc83](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #8829: ARROW-10804: [Rust] Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #8829:
URL: https://github.com/apache/arrow/pull/8829


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8829: ARROW-10804: Removed unsafe from parquet, causing a compilation error.

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-740192100


   @jhorstmann and @Dandandan , I notice that you are quite experienced with these types of problems by your PRs.
   
   I am trying to remove this `FatPtr` with a safe counter part, which is blocking the build of #8796 by making this code `safe` and not use pointers from the buffers, but I am struggling a lot here (as you can see by the red on the CI 😭), and I am a bit blocked. I would be grateful for your help here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #8829: ARROW-10804: Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743716160


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=h1) Report
   > Merging [#8829](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=desc) (e4a727f) into [master](https://codecov.io/gh/apache/arrow/commit/2816f37ff01cfd31101b3ee6cc8574cc9246dd1b?el=desc) (2816f37) will **decrease** coverage by `23.03%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8829/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #8829       +/-   ##
   ===========================================
   - Coverage   76.99%   53.96%   -23.04%     
   ===========================================
     Files         174      170        -4     
     Lines       40392    30710     -9682     
   ===========================================
   - Hits        31099    16572    -14527     
   - Misses       9293    14138     +4845     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
   | [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
   | [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
   | [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
   | [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
   | [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
   | [rust/parquet/src/file/statistics.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3N0YXRpc3RpY3MucnM=) | `0.00% <0.00%> (-94.41%)` | :arrow_down: |
   | [rust/parquet/src/column/writer.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vd3JpdGVyLnJz) | `0.00% <0.00%> (-93.90%)` | :arrow_down: |
   | ... and [60 more](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=footer). Last update [2816f37...66cfc83](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #8829: ARROW-10804: Removed some unsafe code from the parquet crate

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #8829:
URL: https://github.com/apache/arrow/pull/8829#issuecomment-743716160


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=h1) Report
   > Merging [#8829](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=desc) (55974f6) into [master](https://codecov.io/gh/apache/arrow/commit/0c8b9903602e1cde0a20b825abf92d361af3c315?el=desc) (0c8b990) will **increase** coverage by `0.00%`.
   > The diff coverage is `88.88%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8829/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #8829   +/-   ##
   =======================================
     Coverage   76.77%   76.77%           
   =======================================
     Files         181      181           
     Lines       41009    40990   -19     
   =======================================
   - Hits        31485    31472   -13     
   + Misses       9524     9518    -6     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `96.25% <88.88%> (+1.71%)` | :arrow_up: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/8829/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `92.99% <0.00%> (-0.18%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=footer). Last update [0c8b990...55974f6](https://codecov.io/gh/apache/arrow/pull/8829?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org