You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/14 05:32:40 UTC
[GitHub] [arrow-rs] JasonLi-cn commented on a diff in pull request #2040: Truncate IPC record batch
JasonLi-cn commented on code in PR #2040:
URL: https://github.com/apache/arrow-rs/pull/2040#discussion_r920765737
##########
arrow/src/ipc/writer.rs:
##########
@@ -894,12 +1004,88 @@ fn write_array_data(
Some(buffer) => buffer.clone(),
};
- offset = write_buffer(&null_buffer, buffers, arrow_data, offset);
+ let sliced_null_buffer = if write_options.enable_truncate_array {
+ null_buffer.bit_slice(array_data.offset(), array_data.len())
+ } else {
+ null_buffer
+ };
+ offset = write_buffer(sliced_null_buffer.as_slice(), buffers, arrow_data, offset);
Review Comment:
I think there should be processed like this:
```rust
if has_validity_bitmap(array_data.data_type(), write_options) {
// write null buffer if exists
let null_buffer = match array_data.null_buffer() {
None => {
// create a buffer and fill it with valid bits
let num_bytes = bit_util::ceil(num_rows, 8);
let buffer = MutableBuffer::new(num_bytes);
let buffer = buffer.with_bitset(num_bytes, true);
buffer.into()
}
Some(buffer) => {
if write_options.enable_truncate_array {
buffer.bit_slice(array_data.offset(), array_data.len())
} else {
buffer.clone()
}
}
};
offset = write_buffer(null_buffer.as_slice(), buffers, arrow_data, offset);
}
```
Slice of RecordBatch which has no null_buffer may case Panic, for example: record_batch.num_rows() = 100, let rb_slice = record_batch.slice(64, 36), process rb_slice will case Panic("the offset of the new Buffer cannot exceed the existing length") when call null_buffer.bit_slice(array_data.offset(), array_data.len()).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org