You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@arrow.apache.org by "Jorge Leitão (Jira)" <ji...@apache.org> on 2020/11/12 08:01:00 UTC
[jira] [Updated] (ARROW-10562) [Rust] Potential UB on unsafe code
[ https://issues.apache.org/jira/browse/ARROW-10562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jorge Leitão updated ARROW-10562:
---------------------------------
Description:
The function {{set_bits_raw}} may be (unsafely) accessing memory out of bounds.
Specifically, if I re-write it in a safe manner, it panics (instead of UB)
{code:java}
pub fn set_bits_raw(data: &mut [u8], start: usize, end: usize) {
let start_byte = start >> 3;
let end_byte = end >> 3;
let start_offset = (start & 7) as u8;
let end_offset = (end & 7) as u8;
// All set apart from lowest `start_offset` bits
let start_mask = !((1 << start_offset) - 1);
// All clear apart from lowest `end_offset` bits
let end_mask = (1 << end_offset) - 1;
if start_byte == end_byte {
data[start_byte] |= start_mask & end_mask;
} else {
data[start_byte] |= start_mask;
for i in (start_byte + 1)..end_byte {
data[i] = 0xFF;
}
data[end_byte] |= end_mask;
}
}
{code}
A test that panics (and that represents an operation that we peform in `BufferBuilder` would be)
{code:java}
#[test]
fn test_set_bits_raw_from_start() {
let mut buf = vec![0; 1];
unsafe {set_bits_raw(buf.as_mut(), 0, 0)};
// 00000000
assert_eq!(buf[0], 0);
unsafe {set_bits_raw(buf.as_mut(), 0, 1)};
// 00000001
assert_eq!(buf[0], 1);
unsafe {set_bits_raw(buf.as_mut(), 0, 2)};
// 00000001
assert_eq!(buf[0], 3);
unsafe {set_bits_raw(buf.as_mut(), 0, 7)};
// 01111111
assert_eq!(buf[0], 127);
unsafe {set_bits_raw(buf.as_mut(), 0, 8)};
// 11111111
assert_eq!(buf[0], 255);
}
{code}
I think that it is related to the fact that the documentation states that the bounds are non-inclusive, but the `end` seems to be inclusive.
was:
The function {{set_bits_raw}} may be (unsafely) accessing memory out of bounds.
Specifically, if I re-write it in a safe manner, it panics (instead of UB)
{code:java}
pub fn set_bits_raw(data: &mut [u8], start: usize, end: usize) {
let start_byte = start >> 3;
let end_byte = end >> 3;
let start_offset = (start & 7) as u8;
let end_offset = (end & 7) as u8;
// All set apart from lowest `start_offset` bits
let start_mask = !((1 << start_offset) - 1);
// All clear apart from lowest `end_offset` bits
let end_mask = (1 << end_offset) - 1;
if start_byte == end_byte {
data[start_byte] |= start_mask & end_mask;
} else {
data[start_byte] |= start_mask;
for i in (start_byte + 1)..end_byte {
data[i] = 0xFF;
}
data[end_byte] |= end_mask;
}
}
{code}
A test that panics (and that represents an operation that we peform in `BufferBuilder` would be)
```
#[test]
fn test_set_bits_raw_from_start() {
let mut buf = vec![0; 1];
unsafe {set_bits_raw(buf.as_mut(), 0, 0)};
// 00000000
assert_eq!(buf[0], 0);
unsafe {set_bits_raw(buf.as_mut(), 0, 1)};
// 00000001
assert_eq!(buf[0], 1);
unsafe {set_bits_raw(buf.as_mut(), 0, 2)};
// 00000001
assert_eq!(buf[0], 3);
unsafe {set_bits_raw(buf.as_mut(), 0, 7)};
// 01111111
assert_eq!(buf[0], 127);
unsafe {set_bits_raw(buf.as_mut(), 0, 8)};
// 11111111
assert_eq!(buf[0], 255);
}
```
I think that it is related to the fact that the documentation states that the bounds are non-inclusive, but the `end` seems to be inclusive.
> [Rust] Potential UB on unsafe code
> ----------------------------------
>
> Key: ARROW-10562
> URL: https://issues.apache.org/jira/browse/ARROW-10562
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Rust
> Reporter: Jorge Leitão
> Priority: Major
>
> The function {{set_bits_raw}} may be (unsafely) accessing memory out of bounds.
> Specifically, if I re-write it in a safe manner, it panics (instead of UB)
> {code:java}
> pub fn set_bits_raw(data: &mut [u8], start: usize, end: usize) {
> let start_byte = start >> 3;
> let end_byte = end >> 3;
> let start_offset = (start & 7) as u8;
> let end_offset = (end & 7) as u8;
> // All set apart from lowest `start_offset` bits
> let start_mask = !((1 << start_offset) - 1);
> // All clear apart from lowest `end_offset` bits
> let end_mask = (1 << end_offset) - 1;
> if start_byte == end_byte {
> data[start_byte] |= start_mask & end_mask;
> } else {
> data[start_byte] |= start_mask;
> for i in (start_byte + 1)..end_byte {
> data[i] = 0xFF;
> }
> data[end_byte] |= end_mask;
> }
> }
> {code}
> A test that panics (and that represents an operation that we peform in `BufferBuilder` would be)
> {code:java}
> #[test]
> fn test_set_bits_raw_from_start() {
> let mut buf = vec![0; 1];
> unsafe {set_bits_raw(buf.as_mut(), 0, 0)};
> // 00000000
> assert_eq!(buf[0], 0);
> unsafe {set_bits_raw(buf.as_mut(), 0, 1)};
> // 00000001
> assert_eq!(buf[0], 1);
> unsafe {set_bits_raw(buf.as_mut(), 0, 2)};
> // 00000001
> assert_eq!(buf[0], 3);
> unsafe {set_bits_raw(buf.as_mut(), 0, 7)};
> // 01111111
> assert_eq!(buf[0], 127);
> unsafe {set_bits_raw(buf.as_mut(), 0, 8)};
> // 11111111
> assert_eq!(buf[0], 255);
> }
> {code}
> I think that it is related to the fact that the documentation states that the bounds are non-inclusive, but the `end` seems to be inclusive.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)