Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/06 11:21:19 UTC

[GitHub] [arrow-rs] wolfv opened a new pull request, #3030: do not read 1-past-size

wolfv opened a new pull request, #3030:
URL: https://github.com/apache/arrow-rs/pull/3030

   # Which issue does this PR close?
   
   Closes #3029.
   
   # Rationale for this change
    
   Fixes a bug where parquet attempts to read past the buffer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
tustvold commented on PR #3030:
URL: https://github.com/apache/arrow-rs/pull/3030#issuecomment-1304865839

   Thank you, I will take a look 👍




[GitHub] [arrow-rs] tustvold commented on pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
tustvold commented on PR #3030:
URL: https://github.com/apache/arrow-rs/pull/3030#issuecomment-1304917579

   OK, I have found two issues related to this file; I will get a PR up later today.




[GitHub] [arrow-rs] wolfv commented on pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
wolfv commented on PR #3030:
URL: https://github.com/apache/arrow-rs/pull/3030#issuecomment-1304865556

   My Rust is not very good and I know next to nothing about Parquet.
   
   It was triggered when iterating over these parquet files: `https://s3.amazonaws.com/anaconda-package-data/conda/monthly/2022/2022-05.parquet`
   
   I don't know what condition triggers this code path.
   
   However, this is _clearly_ a logic error: the buffer holds 1024 elements and Rust uses 0-based indexing, so the largest valid index is 1023.




[GitHub] [arrow-rs] tustvold commented on pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
tustvold commented on PR #3030:
URL: https://github.com/apache/arrow-rs/pull/3030#issuecomment-1305083336

   Closing in favor of #3036 
   
   Thank you for the report, this file was a wonderful source of bugs :tada:




[GitHub] [arrow-rs] tustvold commented on a diff in pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #3030:
URL: https://github.com/apache/arrow-rs/pull/3030#discussion_r1014882986


##########
parquet/src/encodings/rle.rs:
##########
@@ -476,7 +476,7 @@ impl RleDecoder {
                 let mut num_values =
                     cmp::min(max_values - values_read, self.bit_packed_left as usize);
 
-                num_values = cmp::min(num_values, index_buf.len());
+                num_values = cmp::min(num_values, index_buf.len() - 1);

Review Comment:
   So this is not correct, as `num_values` is the number of values to read and not the maximum numeric value.
   
   I suspect this is a similar issue to #1458
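
To illustrate the count-versus-index distinction in the comment above, here is a minimal standalone sketch. It is not the actual `RleDecoder` code: `clamp_batch` and its parameters are hypothetical names chosen for illustration, and the 1024-element buffer size is taken from the discussion in this thread.

```rust
use std::cmp;

/// Hypothetical helper (not part of the parquet crate): clamp a batch to the
/// number of values still wanted, the values left in the current run, and the
/// capacity of the scratch buffer.
fn clamp_batch(remaining: usize, bit_packed_left: usize, buf_len: usize) -> usize {
    let num_values = cmp::min(remaining, bit_packed_left);
    // `num_values` is a count, so the buffer length itself is the right bound.
    cmp::min(num_values, buf_len)
}

fn main() {
    let index_buf = [0i32; 1024];

    let n = clamp_batch(5000, 4096, index_buf.len());
    // Slicing with an exclusive upper bound of `n` touches indices 0..=n-1,
    // so a count of 1024 never reads index 1024.
    let batch = &index_buf[..n];
    assert_eq!(batch.len(), 1024);

    // Clamping to `len() - 1` also stays in bounds, but leaves the last slot
    // of the buffer unused on every batch.
    let short = cmp::min(5000, index_buf.len() - 1);
    assert_eq!(index_buf[..short].len(), 1023);
}
```

Under these assumptions, bounding the count by `index_buf.len()` cannot by itself cause a read past the buffer, which is consistent with the suggestion that the root cause lies elsewhere.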





[GitHub] [arrow-rs] tustvold closed pull request #3030: do not read 1-past-size

Posted by GitBox <gi...@apache.org>.
tustvold closed pull request #3030: do not read 1-past-size
URL: https://github.com/apache/arrow-rs/pull/3030

