You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/07/28 04:34:55 UTC

[GitHub] [arrow-datafusion] liukun4515 commented on pull request #2968: fix `RowWriter` index out of bounds error

liukun4515 commented on PR #2968:
URL: https://github.com/apache/arrow-datafusion/pull/2968#issuecomment-1197648845

   > Thanks @comphead!
   > 
   > I propose we fix the followings:
   > 
   > * The `new_width > to.data.len()` change makes sense to me, but on the resize part:
   > 
   > ```rust
   >     if new_width > to.data.len() {
   >         to.data.resize(max(to.data.capacity(), new_width), 0);
   >     }
   > ```
   > 
   > I think we could just resize to capacity to avoid reallocating, instead of the previous double capacity way.
   > 
   > * `write_field_binary` should be adapted accordingly as well.
   > * The capacity() bug also exists in the `end_padding` logic, it should be:
   > 
   > ```rust
   >     /// End each row at 8-byte word boundary.
   >     pub(crate) fn end_padding(&mut self) {
   >         let payload_width = self.current_width();
   >         self.row_width = round_upto_power_of_2(payload_width, 8);
   >         if self.data.len() < self.row_width {
   >             self.data.resize(self.row_width, 0);
   >         }
   >     }
   > ```
   > 
   > @liukun4515 I cannot quite get your `+8` logic while calculating size in the comments above, I didn't get panic with your 110 empty string case either. Please correct me if I have missed something important.
   
   Sorry for the error comments with my wrong understanding for row layout.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org