You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@arrow.apache.org by "Jorge Leitão (Jira)" <ji...@apache.org> on 2020/11/14 07:28:00 UTC

[jira] [Assigned] (ARROW-10562) [Rust] Potential UB on unsafe code

     [ https://issues.apache.org/jira/browse/ARROW-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jorge Leitão reassigned ARROW-10562:
------------------------------------

    Assignee: Jorge Leitão

> [Rust] Potential UB on unsafe code
> ----------------------------------
>
>                 Key: ARROW-10562
>                 URL: https://issues.apache.org/jira/browse/ARROW-10562
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust
>            Reporter: Jorge Leitão
>            Assignee: Jorge Leitão
>            Priority: Major
>
> The function {{set_bits_raw}} may be (unsafely) accessing memory out of bounds.
> Specifically, if I re-write it in a safe manner, it panics (instead of UB)
> {code:java}
> pub fn set_bits_raw(data: &mut [u8], start: usize, end: usize) {
>     let start_byte = start >> 3;
>     let end_byte = end >> 3;
>     let start_offset = (start & 7) as u8;
>     let end_offset = (end & 7) as u8;
>     // All set apart from lowest `start_offset` bits
>     let start_mask = !((1 << start_offset) - 1);
>     // All clear apart from lowest `end_offset` bits
>     let end_mask = (1 << end_offset) - 1;
>     if start_byte == end_byte {
>         data[start_byte] |= start_mask & end_mask;
>     } else {
>         data[start_byte] |= start_mask;
>         for i in (start_byte + 1)..end_byte {
>             data[i] = 0xFF;
>         }
>         data[end_byte] |= end_mask;
>     }
> }
> {code}
> A test that panics (and that represents an operation that we peform in `BufferBuilder` would be)
> {code:java}
>     #[test]
>     fn test_set_bits_raw_from_start() {
>         let mut buf = vec![0; 1];
>         unsafe {set_bits_raw(buf.as_mut(), 0, 0)};
>         // 00000000
>         assert_eq!(buf[0], 0);
>         unsafe {set_bits_raw(buf.as_mut(), 0, 1)};
>         // 00000001
>         assert_eq!(buf[0], 1);
>         unsafe {set_bits_raw(buf.as_mut(), 0, 2)};
>         // 00000001
>         assert_eq!(buf[0], 3);
>         unsafe {set_bits_raw(buf.as_mut(), 0, 7)};
>         // 01111111
>         assert_eq!(buf[0], 127);
>         unsafe {set_bits_raw(buf.as_mut(), 0, 8)};
>         // 11111111
>         assert_eq!(buf[0], 255);
>     }
> {code}
> I think that it is related to the fact that the documentation states that the bounds are non-inclusive, but the `end` seems to be inclusive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)