You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/02 09:49:46 UTC
[GitHub] [arrow] jorgecarleitao opened a new pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
jorgecarleitao opened a new pull request #9076:
URL: https://github.com/apache/arrow/pull/9076
This PR refactors `MutableBuffer::extend_from_slice` to remove the need to use `to_byte_slice` on every call, thereby removing its level of indirection, that does not allow the compiler to optimize out some code.
This is the second performance improvement originally presented in #8796 and, together with #9027 , brings the performance of "MutableBuffer" to the same level as `Vec<u8>`, in particular to building buffers on the fly.
Basically, when converting to a byte slice `&[u8]`, the compiler lost the type size information, and thus needs to perform extra checks and can't just optimize out the code.
This PR adopts the same API as `Vec<T>::extend_from_slice`, but since our buffers are in `u8` (i.e. a la `Vec<u8>`), I made the signature
```
pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T])
pub fn push<T: ToByteSlice>(&mut self, item: &T)
```
i.e. it consumes something that can be converted to a byte slice, but internally makes the conversion to bytes (as `to_byte_slice` was doing).
Credits for the root cause analysis that lead to this PR go to @Dandandan, [originally fielded here](https://github.com/apache/arrow/pull/9016#discussion_r549110164).
> [...] current conversion to a byte slice may add some overhead? - @Dandandan
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832
Great! Besides being a speed up, being able to `push`.
I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983
FWIW given the "touch most of the codepaths" property of this PR, it might be good to wait to merge this PR until after 3.0 is shipped (in the next few days) in case it introduces any subtle new bugs we would have longer to shake them out
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550932922
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -863,44 +866,49 @@ impl MutableBuffer {
/// View buffer as typed slice.
pub fn typed_data_mut<T: ArrowNativeType>(&mut self) -> &mut [T] {
- assert_eq!(self.len() % mem::size_of::<T>(), 0);
- assert!(memory::is_ptr_aligned::<T>(self.data.cast()));
- // JUSTIFICATION
- // Benefit
- // Many of the buffers represent specific types, and consumers of `Buffer` often need to re-interpret them.
- // Soundness
- // * The pointer is non-null by construction
- // * alignment asserted above
unsafe {
- std::slice::from_raw_parts_mut(
- self.as_ptr() as *mut T,
- self.len() / mem::size_of::<T>(),
- )
+ let (prefix, offsets, suffix) = self.as_slice_mut().align_to_mut::<T>();
+ assert!(prefix.is_empty() && suffix.is_empty());
+ offsets
}
}
- /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+ /// Extends the buffer from a slice, increasing its capacity if needed.
#[inline]
- pub fn extend_from_slice(&mut self, bytes: &[u8]) {
- let new_len = self.len + bytes.len();
+ pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+ let additional = items.len() * std::mem::size_of::<T>();
+ let new_len = self.len + additional;
if new_len > self.capacity {
- self.reserve(new_len);
+ self.reserve(additional);
}
unsafe {
- let dst = NonNull::new_unchecked(self.data.as_ptr().add(self.len));
- let src = NonNull::new_unchecked(bytes.as_ptr() as *mut u8);
- memory::memcpy(dst, src, bytes.len());
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = items.as_ptr() as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, items.len())
}
self.len = new_len;
}
- /// Extends the buffer by `len` with all bytes equal to `0u8`, incrementing its capacity if needed.
- pub fn extend(&mut self, len: usize) {
- let remaining_capacity = self.capacity - self.len;
- if len > remaining_capacity {
- self.reserve(self.len + len);
+ /// Extends the buffer with a new item, increasing its capacity if needed.
+ #[inline]
+ pub fn push<T: ToByteSlice>(&mut self, item: T) {
+ let additional = std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
+ self.reserve(additional);
+ }
+ unsafe {
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = (&item) as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, 1)
Review comment:
Is this the same as `ptr::write(dst, item)`?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763032677
🎉
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (1548011) into [master](https://codecov.io/gh/apache/arrow/commit/9fe375434ca4f90ea7cbafcbadd0e0250270105d?el=desc) (9fe3754) will **decrease** coverage by `0.05%`.
> The diff coverage is `94.48%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.56% -0.06%
==========================================
Files 203 203
Lines 49942 49872 -70
==========================================
- Hits 41259 41175 -84
- Misses 8683 8697 +14
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.75% <91.52%> (-1.46%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [9fe3754...1548011](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763033846
Well, it broke master... 🤣 @Dandandan already has a fix https://github.com/apache/arrow/pull/9269 💯
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550910490
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
}
}
- /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+ /// Extends the buffer from a slice, increasing its capacity if needed.
+ #[inline]
+ pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+ let additional = items.len() * std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
+ self.reserve(additional);
+ }
+ unsafe {
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = items.as_ptr() as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, items.len())
+ }
+ self.len = new_len;
+ }
+
+ /// Extends the buffer with a new item, increasing its capacity if needed.
#[inline]
- pub fn extend_from_slice(&mut self, bytes: &[u8]) {
- let additional = bytes.len();
- if self.len + additional > self.capacity() {
+ pub fn push<T: ToByteSlice>(&mut self, item: &T) {
+ let additional = std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
Review comment:
I though about that also, but it requires more investigations. We would need to make `MutableBuffer` a generic over `T` and that has implications in our ability to build generic structures.
For example, `builders::FieldData` relies on a generic `MutableBuffer`, and `MutableArrayData` also.
Note that growing and freezing a `MutableBuffer` is now faster than `Vec<u32>` to store `u32`. We do still have a performance problem on which growing a `MutableBuffer` is faster than pre-reserving it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (fb52aa2) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.05%`.
> The diff coverage is `94.48%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.56% -0.06%
==========================================
Files 203 203
Lines 50140 50070 -70
==========================================
- Hits 41422 41338 -84
- Misses 8718 8732 +14
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.29% <91.52%> (-1.44%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...fb52aa2](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753465093
@Dandandan , could you run this again? I rebased it against the PR with the one that removes the zero alloc. I see a 20% improv on the buffer_create bench, but the datafusion is the real deal :)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983
FWIW it might be good to wait to merge this PR until after 3.0 is shipped in case it introduces any subtle new bugs we would have longer to shake the out
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (b9cd8fb) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.06%`.
> The diff coverage is `94.35%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.55% -0.07%
==========================================
Files 203 204 +1
Lines 50140 50099 -41
==========================================
- Hits 41422 41357 -65
- Misses 8718 8742 +24
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.98% <89.09%> (-1.74%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `89.94% <100.00%> (-0.22%)` | :arrow_down: |
| ... and [33 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...b9cd8fb](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881085
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -44,18 +44,35 @@ use crate::util::bit_util::ceil;
#[cfg(any(feature = "simd", feature = "avx512"))]
use std::borrow::BorrowMut;
-/// Buffer is a contiguous memory region of fixed size and is aligned at a 64-byte
-/// boundary. Buffer is immutable.
+/// Buffer represents a contiguous memory region that can be shared with other buffers and across
+/// thread boundaries.
#[derive(Clone, PartialEq, Debug)]
pub struct Buffer {
- /// Reference-counted pointer to the internal byte buffer.
+ /// the internal byte buffer.
data: Arc<Bytes>,
/// The offset into the buffer.
offset: usize,
}
impl Buffer {
+ /// Initializes a [Buffer] from a slice of items.
+ pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {
+ // allocate aligned memory buffer
+ let slice = items.as_ref();
+ let len = slice.len() * std::mem::size_of::<U>();
+ let capacity = bit_util::round_upto_multiple_of_64(len);
+ let buffer = memory::allocate_aligned(capacity);
+ unsafe {
Review comment:
Maybe document a justification for the unsafe here? :)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753468173
This PR:
```
Query 1 iteration 0 took 608.4 ms
Query 1 iteration 1 took 602.8 ms
Query 1 iteration 2 took 601.4 ms
Query 1 iteration 3 took 604.6 ms
Query 1 iteration 4 took 604.0 ms
Query 1 iteration 5 took 603.6 ms
Query 1 iteration 6 took 605.5 ms
Query 1 iteration 7 took 610.7 ms
Query 1 iteration 8 took 607.5 ms
Query 1 iteration 9 took 607.1 ms
Query 1 avg time: 605.55 ms
```
Master:
```
Query 1 iteration 0 took 642.9 ms
Query 1 iteration 1 took 640.5 ms
Query 1 iteration 2 took 638.1 ms
Query 1 iteration 3 took 639.8 ms
Query 1 iteration 4 took 641.9 ms
Query 1 iteration 5 took 646.4 ms
Query 1 iteration 6 took 638.6 ms
Query 1 iteration 7 took 640.6 ms
Query 1 iteration 8 took 653.0 ms
Query 1 iteration 9 took 648.6 ms
Query 1 avg time: 643.04 ms
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao closed pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #9076:
URL: https://github.com/apache/arrow/pull/9076
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (823f15d) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.05%`.
> The diff coverage is `94.48%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.55% -0.06%
==========================================
Files 203 203
Lines 50140 50070 -70
==========================================
- Hits 41422 41337 -85
- Misses 8718 8733 +15
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.29% <91.52%> (-1.44%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| ... and [25 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...823f15d](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (eeb55d6) into [master](https://codecov.io/gh/apache/arrow/commit/cd22be6efedbf9832b5ea875ca59bb42de7b6c28?el=desc) (cd22be6) will **decrease** coverage by `0.00%`.
> The diff coverage is `96.05%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.55% 82.55% -0.01%
==========================================
Files 203 203
Lines 50043 50030 -13
==========================================
- Hits 41315 41303 -12
+ Misses 8728 8727 -1
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.66% <50.00%> (+2.79%)` | :arrow_up: |
| [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `83.81% <88.23%> (-0.19%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `98.01% <92.30%> (-0.20%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.61% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `93.04% <100.00%> (-0.06%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (+0.64%)` | :arrow_up: |
| [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `90.27% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_union.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfdW5pb24ucnM=) | `89.53% <100.00%> (-0.05%)` | :arrow_down: |
| [rust/arrow/src/array/data.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZGF0YS5ycw==) | `97.31% <100.00%> (ø)` | |
| ... and [17 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [cd22be6...eeb55d6](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881141
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -44,18 +44,35 @@ use crate::util::bit_util::ceil;
#[cfg(any(feature = "simd", feature = "avx512"))]
use std::borrow::BorrowMut;
-/// Buffer is a contiguous memory region of fixed size and is aligned at a 64-byte
-/// boundary. Buffer is immutable.
+/// Buffer represents a contiguous memory region that can be shared with other buffers and across
+/// thread boundaries.
#[derive(Clone, PartialEq, Debug)]
pub struct Buffer {
- /// Reference-counted pointer to the internal byte buffer.
+ /// the internal byte buffer.
data: Arc<Bytes>,
/// The offset into the buffer.
offset: usize,
}
impl Buffer {
+ /// Initializes a [Buffer] from a slice of items.
+ pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {
+ // allocate aligned memory buffer
+ let slice = items.as_ref();
+ let len = slice.len() * std::mem::size_of::<U>();
+ let capacity = bit_util::round_upto_multiple_of_64(len);
+ let buffer = memory::allocate_aligned(capacity);
+ unsafe {
Review comment:
I think it was there in the previous version of the code
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-757492622
Ok, I have the benchmarks of this PR, which IMO is ready to review. FYI @alamb @nevi-me
```bash
critcmp master-08cccd68802c9ddc3ca0a5d4bad6e4ba382d74b4 perf_buffer-2dca3c85920c951d92992d343b69a5384178c6fd -t 10
```
```
group master-08cccd68802c9ddc3ca0a5d4bad6e4ba382d74b4 perf_buffer-2dca3c85920c951d92992d343b69a5384178c6fd
----- ----------------------------------------------- ----------------------------------------------------
array_slice 512 1.17 453.4±37.97ns ? B/sec 1.00 387.0±8.80ns ? B/sec
array_string_from_vec 128 1.28 4.2±0.22µs ? B/sec 1.00 3.3±0.09µs ? B/sec
array_string_from_vec 256 1.45 5.9±0.14µs ? B/sec 1.00 4.1±0.12µs ? B/sec
array_string_from_vec 512 1.66 9.6±0.28µs ? B/sec 1.00 5.8±0.18µs ? B/sec
buffer_bit_ops and 1.23 715.9±278.78ns ? B/sec 1.00 581.4±16.32ns ? B/sec
cast date64 to date32 512 1.11 7.6±0.40µs ? B/sec 1.00 6.9±0.32µs ? B/sec
cast float64 to float32 512 1.21 3.2±0.10µs ? B/sec 1.00 2.7±0.06µs ? B/sec
cast float64 to uint64 512 1.00 3.9±0.41µs ? B/sec 1.11 4.4±0.12µs ? B/sec
cast int32 to float32 512 1.12 3.2±0.11µs ? B/sec 1.00 2.8±0.08µs ? B/sec
cast int32 to int64 512 1.19 3.2±0.08µs ? B/sec 1.00 2.7±0.07µs ? B/sec
cast int32 to uint32 512 1.42 3.9±0.10µs ? B/sec 1.00 2.8±0.15µs ? B/sec
cast int64 to int32 512 1.96 5.1±0.27µs ? B/sec 1.00 2.6±0.07µs ? B/sec
cast time32s to time64us 512 1.11 5.4±0.23µs ? B/sec 1.00 4.9±0.15µs ? B/sec
concat i32 1024 1.00 4.2±0.09µs ? B/sec 1.12 4.7±0.17µs ? B/sec
concat str nulls 1024 1.00 21.3±0.74µs ? B/sec 1.14 24.3±0.73µs ? B/sec
eq scalar Float32 1.26 76.4±5.24µs ? B/sec 1.00 60.6±3.56µs ? B/sec
filter context f32 high selectivity 1.00 309.7±1.96µs ? B/sec 1.12 346.4±10.64µs ? B/sec
filter context u8 high selectivity 1.27 4.8±0.11µs ? B/sec 1.00 3.8±0.14µs ? B/sec
filter context u8 w NULLs high selectivity 1.00 299.2±2.82µs ? B/sec 1.12 336.0±7.21µs ? B/sec
filter u8 high selectivity 1.11 12.2±0.18µs ? B/sec 1.00 11.0±0.13µs ? B/sec
from_slice prepared 1.00 834.5±26.98µs ? B/sec 1.15 963.4±36.11µs ? B/sec
gt scalar Float32 1.13 42.9±1.52µs ? B/sec 1.00 37.9±1.83µs ? B/sec
length 1.58 57.8±0.58µs ? B/sec 1.00 36.5±9.01µs ? B/sec
like_utf8 scalar complex 1.00 1064.0±91.20µs ? B/sec 1.17 1241.6±320.92µs ? B/sec
like_utf8 scalar ends with 1.14 317.0±8.58µs ? B/sec 1.00 277.5±5.80µs ? B/sec
like_utf8 scalar equals 2.43 142.6±6.62µs ? B/sec 1.00 58.6±2.81µs ? B/sec
like_utf8 scalar starts with 1.41 294.5±16.24µs ? B/sec 1.00 208.9±7.83µs ? B/sec
min string 512 1.16 6.2±0.14µs ? B/sec 1.00 5.3±0.12µs ? B/sec
mutable 1.19 587.7±15.96µs ? B/sec 1.00 492.5±17.55µs ? B/sec
mutable prepared 1.17 640.6±26.75µs ? B/sec 1.00 549.0±21.62µs ? B/sec
mutable str nulls 1024 1.00 4.0±0.06ms ? B/sec 1.34 5.3±0.05ms ? B/sec
neq scalar Float32 1.23 76.4±3.41µs ? B/sec 1.00 61.9±6.27µs ? B/sec
nlike_utf8 scalar starts with 1.00 321.1±12.95µs ? B/sec 1.27 407.8±9.19µs ? B/sec
struct_array_from_vec 1024 1.53 20.8±0.55µs ? B/sec 1.00 13.6±0.66µs ? B/sec
struct_array_from_vec 128 1.16 7.2±0.21µs ? B/sec 1.00 6.2±0.23µs ? B/sec
struct_array_from_vec 256 1.27 9.1±0.29µs ? B/sec 1.00 7.1±0.20µs ? B/sec
struct_array_from_vec 512 1.40 13.0±0.27µs ? B/sec 1.00 9.3±0.32µs ? B/sec
take i32 1024 1.26 2.4±0.96µs ? B/sec 1.00 1883.9±35.60ns ? B/sec
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-761773033
I apologize for the delay in merging Rust PRs -- the 3.0 release is being finalized now and are planning to minimize entropy by postponing merging changes not critical for the release until the process was complete. I hope the process is complete in the next few days. There is more [discussion](https://lists.apache.org/list.html?dev@arrow.apache.org) in the mailing list
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (ecd2769) into [master](https://codecov.io/gh/apache/arrow/commit/fdf5e88a67f33c0a76673a32938274f063c9cb93?el=desc) (fdf5e88) will **decrease** coverage by `0.05%`.
> The diff coverage is `94.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.57% 82.52% -0.06%
==========================================
Files 204 204
Lines 50327 50247 -80
==========================================
- Hits 41560 41467 -93
- Misses 8767 8780 +13
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/array/equal/utils.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvZXF1YWwvdXRpbHMucnM=) | `74.50% <0.00%> (+0.72%)` | :arrow_up: |
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `77.66% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.95% <88.67%> (-1.77%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.98% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [fdf5e88...ecd2769](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (c8cd289) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.04%`.
> The diff coverage is `94.48%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.56% -0.05%
==========================================
Files 203 203
Lines 50140 50070 -70
==========================================
- Hits 41422 41340 -82
- Misses 8718 8730 +12
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/transform/mod.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL21vZC5ycw==) | `88.29% <83.33%> (ø)` | |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `96.79% <91.52%> (-0.94%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| ... and [24 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...c8cd289](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r555376474
##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -184,23 +180,33 @@ fn is_like_pattern(c: char) -> bool {
pub fn like_utf8_scalar(left: &StringArray, right: &str) -> Result<BooleanArray> {
let null_bit_buffer = left.data().null_buffer().cloned();
- let mut result = BooleanBufferBuilder::new(left.len());
+ let bytes = bit_util::ceil(left.len(), 8);
Review comment:
this looks like it was an additional performance improvement?
##########
File path: rust/arrow/src/array/transform/utils.rs
##########
@@ -15,16 +15,14 @@
// specific language governing permissions and limitations
// under the License.
-use crate::{
- array::OffsetSizeTrait, buffer::MutableBuffer, datatypes::ToByteSlice, util::bit_util,
-};
+use crate::{array::OffsetSizeTrait, buffer::MutableBuffer, util::bit_util};
/// extends the `buffer` to be able to hold `len` bits, setting all bits of the new size to zero.
#[inline]
-pub(super) fn reserve_for_bits(buffer: &mut MutableBuffer, len: usize) {
+pub(super) fn resize_for_bits(buffer: &mut MutableBuffer, len: usize) {
Review comment:
this is just a rename to make the names match more closely what they do, right?
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -758,39 +793,51 @@ impl MutableBuffer {
}
}
- /// Ensures that this buffer has at least `capacity` slots in this buffer. This will
- /// also ensure the new capacity will be a multiple of 64 bytes.
- ///
- /// Returns the new capacity for this buffer.
- pub fn reserve(&mut self, capacity: usize) -> usize {
- if capacity > self.capacity {
- let new_capacity = bit_util::round_upto_multiple_of_64(capacity);
- let new_capacity = cmp::max(new_capacity, self.capacity * 2);
+ /// Ensures that this buffer has at least `self.len + additional` bytes. This re-allocates iff
+ /// `self.len + additional > capacity`.
+ /// # Example
+ /// ```
+ /// # use arrow::buffer::{Buffer, MutableBuffer};
+ /// let mut buffer = MutableBuffer::new(0);
+ /// buffer.reserve(253); // allocates for the first time
+ /// (0..253u8).for_each(|i| buffer.push(i)); // no reallocation
+ /// let buffer: Buffer = buffer.into();
+ /// assert_eq!(buffer.len(), 253);
+ /// ```
+ // For performance reasons, this must be inlined so that the `if` is executed inside the caller, and not as an extra call that just
+ // exits.
+ #[inline(always)]
Review comment:
I would be that the use of `inline(always)` is what causes this PR to slow down some kernels (as the extra code either disables some additional optimization or blows some cache level)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] codecov-io edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753454845
# [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=h1) Report
> Merging [#9076](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=desc) (f63e28e) into [master](https://codecov.io/gh/apache/arrow/commit/4b7cdcb9220b6d94b251aef32c21ef9b4097ecfa?el=desc) (4b7cdcb) will **decrease** coverage by `0.06%`.
> The diff coverage is `94.30%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9076/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #9076 +/- ##
==========================================
- Coverage 82.61% 82.54% -0.07%
==========================================
Files 203 204 +1
Lines 50140 50096 -44
==========================================
- Hits 41422 41353 -69
- Misses 8718 8743 +25
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/bytes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnl0ZXMucnM=) | `53.12% <ø> (-5.21%)` | :arrow_down: |
| [rust/arrow/src/compute/kernels/aggregate.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2FnZ3JlZ2F0ZS5ycw==) | `74.93% <ø> (-0.07%)` | :arrow_down: |
| [rust/arrow/src/datatypes.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvZGF0YXR5cGVzLnJz) | `75.02% <ø> (ø)` | |
| [rust/arrow/src/array/transform/list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2xpc3QucnM=) | `86.20% <66.66%> (+2.33%)` | :arrow_up: |
| [rust/arrow/src/array/builder.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYnVpbGRlci5ycw==) | `85.23% <86.79%> (-0.67%)` | :arrow_down: |
| [rust/arrow/src/buffer.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYnVmZmVyLnJz) | `95.95% <88.67%> (-1.77%)` | :arrow_down: |
| [rust/arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfYmluYXJ5LnJz) | `90.54% <100.00%> (ø)` | |
| [rust/arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfbGlzdC5ycw==) | `92.66% <100.00%> (-0.45%)` | :arrow_down: |
| [rust/arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfcHJpbWl0aXZlLnJz) | `92.28% <100.00%> (-0.04%)` | :arrow_down: |
| [rust/arrow/src/array/array\_string.rs](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvYXJyYXlfc3RyaW5nLnJz) | `89.94% <100.00%> (-0.22%)` | :arrow_down: |
| ... and [34 more](https://codecov.io/gh/apache/arrow/pull/9076/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=footer). Last update [4b7cdcb...f63e28e](https://codecov.io/gh/apache/arrow/pull/9076?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753453873
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
Thanks for opening a pull request!
Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW
Then could you also rename pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb edited a comment on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758263983
FWIW given the "touch most of the codepaths" property of this PR, it might be good to wait to merge this PR until after 3.0 is shipped in case it introduces any subtle new bugs we would have longer to shake the out
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832
Great! Besides being a speed up, being able to `push` something of a certain type is nice.
I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550909022
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
}
}
- /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+ /// Extends the buffer from a slice, increasing its capacity if needed.
+ #[inline]
+ pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+ let additional = items.len() * std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
+ self.reserve(additional);
+ }
+ unsafe {
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = items.as_ptr() as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, items.len())
+ }
+ self.len = new_len;
+ }
+
+ /// Extends the buffer with a new item, increasing its capacity if needed.
#[inline]
- pub fn extend_from_slice(&mut self, bytes: &[u8]) {
- let additional = bytes.len();
- if self.len + additional > self.capacity() {
+ pub fn push<T: ToByteSlice>(&mut self, item: &T) {
Review comment:
Could this be `item: T` too?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorgecarleitao commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550870170
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -56,6 +57,23 @@ pub struct Buffer {
}
impl Buffer {
+ /// Initializes a [Buffer] from a slice of items.
+ pub fn from_slice_ref<U: ArrowNativeType, T: AsRef<[U]>>(items: &T) -> Self {
Review comment:
This was necessary because it is not possible to have this generic on `impl`, see https://users.rust-lang.org/t/how-to-implement-from-asref-u-for-generic-u/53533/5
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550881402
##########
File path: rust/arrow/src/compute/kernels/comparison.rs
##########
@@ -48,9 +48,8 @@ macro_rules! compare_op {
combine_option_bitmap($left.data_ref(), $right.data_ref(), $left.len())?;
let byte_capacity = bit_util::ceil($left.len(), 8);
- let actual_capacity = bit_util::round_upto_multiple_of_64(byte_capacity);
Review comment:
:+1: was just looking at this, was added without any need by me
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] maxburke commented on a change in pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
maxburke commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r560497145
##########
File path: rust/arrow/src/lib.rs
##########
@@ -139,15 +139,15 @@ mod arch;
pub mod array;
pub mod bitmap;
pub mod buffer;
-pub mod bytes;
+mod bytes;
pub mod compute;
pub mod csv;
pub mod datatypes;
pub mod error;
pub mod ffi;
pub mod ipc;
pub mod json;
-pub mod memory;
+mod memory;
Review comment:
@jorgecarleitao
This change has broken a bunch of our code downstream as we build buffers directly. Can you make the memory module public again?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-763050303
The fix is merged, so hopefully master is back 🟢 ✅
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan edited a comment on pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan edited a comment on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753458832
Great! Besides being a speed up, being able to `push` something of a certain type.
I did some quick benchmarking of your branch on DataFusion - I see a speed up of ~5% on q1 / q12.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] alamb commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-758262159
I would also like to note that this is EPIC work @jorgecarleitao -- nicely done.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] Dandandan commented on a change in pull request #9076: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#discussion_r550909002
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
}
}
- /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+ /// Extends the buffer from a slice, increasing its capacity if needed.
+ #[inline]
+ pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+ let additional = items.len() * std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
+ self.reserve(additional);
+ }
+ unsafe {
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = items.as_ptr() as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, items.len())
+ }
+ self.len = new_len;
+ }
+
+ /// Extends the buffer with a new item, increasing its capacity if needed.
#[inline]
- pub fn extend_from_slice(&mut self, bytes: &[u8]) {
- let additional = bytes.len();
- if self.len + additional > self.capacity() {
+ pub fn push<T: ToByteSlice>(&mut self, item: &T) {
+ let additional = std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
Review comment:
I am wondering if we in a followup can make `len` / `capacity` similar as `Vec` to be in number of items instead of number of bytes?
This makes some implementations easier, e.g. looking at Vec https://doc.rust-lang.org/src/alloc/vec.rs.html#1206
##########
File path: rust/arrow/src/buffer.rs
##########
@@ -852,17 +870,34 @@ impl MutableBuffer {
}
}
- /// Extends the buffer from a byte slice, incrementing its capacity if needed.
+ /// Extends the buffer from a slice, increasing its capacity if needed.
+ #[inline]
+ pub fn extend_from_slice<T: ToByteSlice>(&mut self, items: &[T]) {
+ let additional = items.len() * std::mem::size_of::<T>();
+ let new_len = self.len + additional;
+ if new_len > self.capacity {
+ self.reserve(additional);
+ }
+ unsafe {
+ let dst = self.data.as_ptr().add(self.len) as *mut T;
+ let src = items.as_ptr() as *const T;
+ std::ptr::copy_nonoverlapping(src, dst, items.len())
+ }
+ self.len = new_len;
+ }
+
+ /// Extends the buffer with a new item, increasing its capacity if needed.
#[inline]
- pub fn extend_from_slice(&mut self, bytes: &[u8]) {
- let additional = bytes.len();
- if self.len + additional > self.capacity() {
+ pub fn push<T: ToByteSlice>(&mut self, item: &T) {
Review comment:
Could this be `T` too?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] github-actions[bot] commented on pull request #9076: ARROW-11108: [Rust] Fixed performance issue in mutableBuffer.
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9076:
URL: https://github.com/apache/arrow/pull/9076#issuecomment-753522240
https://issues.apache.org/jira/browse/ARROW-11108
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org