You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/07 10:21:13 UTC

[GitHub] [arrow-rs] tustvold opened a new pull request, #1807: Don't trample existing data on snappy decompress (#1806)

tustvold opened a new pull request, #1807:
URL: https://github.com/apache/arrow-rs/pull/1807

   # Which issue does this PR close?
   
   Closes #1806
   
   # Rationale for this change
    
   Fixes a bug
   
   # What changes are included in this PR?
   
   Makes it so SnappyCodec doesn't trample existing data in the passed output buffer
   
   # Are there any user-facing changes?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on a diff in pull request #1807: Don't trample existing data on snappy decompress (#1806)

Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #1807:
URL: https://github.com/apache/arrow-rs/pull/1807#discussion_r891533251


##########
parquet/src/compression.rs:
##########
@@ -340,26 +341,46 @@ mod tests {
             .expect("Error when compressing");
 
         // Decompress with c2
-        let mut decompressed_size = c2
+        let decompressed_size = c2
             .decompress(compressed.as_slice(), &mut decompressed)
             .expect("Error when decompressing");
         assert_eq!(data.len(), decompressed_size);
-        decompressed.truncate(decompressed_size);
         assert_eq!(data, decompressed.as_slice());
 
+        decompressed.clear();
         compressed.clear();
 
         // Compress with c2
         c2.compress(data, &mut compressed)
             .expect("Error when compressing");
 
         // Decompress with c1
-        decompressed_size = c1
+        let decompressed_size = c1
             .decompress(compressed.as_slice(), &mut decompressed)
             .expect("Error when decompressing");
         assert_eq!(data.len(), decompressed_size);
-        decompressed.truncate(decompressed_size);
         assert_eq!(data, decompressed.as_slice());
+

Review Comment:
   I went through the changes in this test carefully and they look good to me 👍 



##########
parquet/src/compression.rs:
##########
@@ -111,9 +111,10 @@ mod snappy_codec {
             output_buf: &mut Vec<u8>,
         ) -> Result<usize> {
             let len = decompress_len(input_buf)?;
-            output_buf.resize(len, 0);
+            let offset = output_buf.len();

Review Comment:
   I reviewed the other `impl Codec` in this file and they appear to be appending to `output_buf` as well 👍 
   
   https://sourcegraph.com/github.com/apache/arrow-rs/-/blob/parquet/src/compression.rs?subtree=true



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1807: Don't trample existing data on snappy decompress (#1806)

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1807:
URL: https://github.com/apache/arrow-rs/pull/1807#discussion_r891044017


##########
parquet/src/compression.rs:
##########
@@ -340,26 +341,46 @@ mod tests {
             .expect("Error when compressing");
 
         // Decompress with c2
-        let mut decompressed_size = c2
+        let decompressed_size = c2
             .decompress(compressed.as_slice(), &mut decompressed)
             .expect("Error when decompressing");
         assert_eq!(data.len(), decompressed_size);
-        decompressed.truncate(decompressed_size);
         assert_eq!(data, decompressed.as_slice());
 
+        decompressed.clear();
         compressed.clear();
 
         // Compress with c2
         c2.compress(data, &mut compressed)
             .expect("Error when compressing");
 
         // Decompress with c1
-        decompressed_size = c1
+        let decompressed_size = c1
             .decompress(compressed.as_slice(), &mut decompressed)
             .expect("Error when decompressing");
         assert_eq!(data.len(), decompressed_size);
-        decompressed.truncate(decompressed_size);

Review Comment:
   This test was actually incorrect, it never cleared `decompressed` and so this was effectively just checking the same bytes it checked in the test above (except for Snappy which would truncate and trample)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] alamb commented on pull request #1807: Don't trample existing data on snappy decompress (#1806)

Posted by GitBox <gi...@apache.org>.
alamb commented on PR #1807:
URL: https://github.com/apache/arrow-rs/pull/1807#issuecomment-1148974440

   Nice work @tustvold 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold merged pull request #1807: Don't overwrite existing data on snappy decompress (#1806)

Posted by GitBox <gi...@apache.org>.
tustvold merged PR #1807:
URL: https://github.com/apache/arrow-rs/pull/1807


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org