Posted to github@arrow.apache.org by "pitrou (via GitHub)" <gi...@apache.org> on 2023/05/18 14:34:14 UTC

[GitHub] [arrow] pitrou commented on a diff in pull request #35670: GH-35665: [C++][Parquet] DeltaLengthByteArrayEncoder::Put reserve too much space

pitrou commented on code in PR #35670:
URL: https://github.com/apache/arrow/pull/35670#discussion_r1197905549


##########
cpp/src/parquet/encoding.cc:
##########
@@ -2656,19 +2656,23 @@ void DeltaLengthByteArrayEncoder<DType>::Put(const T* src, int num_values) {
 
   constexpr int kBatchSize = 256;
   std::array<int32_t, kBatchSize> lengths;
+  uint32_t total_increment_size = 0;
   for (int idx = 0; idx < num_values; idx += kBatchSize) {
     const int batch_size = std::min(kBatchSize, num_values - idx);
     for (int j = 0; j < batch_size; ++j) {
       const int32_t len = src[idx + j].len;
-      if (AddWithOverflow(encoded_size_, len, &encoded_size_)) {
+      if (AddWithOverflow(total_increment_size, len, &total_increment_size)) {
         throw ParquetException("excess expansion in DELTA_LENGTH_BYTE_ARRAY");
       }
       lengths[j] = len;
     }
     length_encoder_.Put(lengths.data(), batch_size);
   }
 
-  PARQUET_THROW_NOT_OK(sink_.Reserve(encoded_size_));
+  PARQUET_THROW_NOT_OK(sink_.Reserve(total_increment_size));
+  if (AddWithOverflow(encoded_size_, total_increment_size, &encoded_size_)) {
+    throw ParquetException("excess expansion in DELTA_LENGTH_BYTE_ARRAY");
+  }

Review Comment:
   Let's check for overflow before reallocating?
   ```suggestion
     if (AddWithOverflow(encoded_size_, total_increment_size, &encoded_size_)) {
       throw ParquetException("excess expansion in DELTA_LENGTH_BYTE_ARRAY");
     }
     PARQUET_THROW_NOT_OK(sink_.Reserve(total_increment_size));
   ```
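
A minimal, self-contained sketch of the ordering suggested above: validate the new total size first, and only then reserve. `AddWithOverflowSketch` and `SketchEncoder` are hypothetical stand-ins written for illustration. They are not Arrow's `AddWithOverflow` or the real `DeltaLengthByteArrayEncoder`, and `Reserve` here only counts calls instead of allocating.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an overflow-checked add: returns true on overflow
// and leaves *out untouched in that case.
inline bool AddWithOverflowSketch(uint32_t a, uint32_t b, uint32_t* out) {
  if (a > std::numeric_limits<uint32_t>::max() - b) return true;
  *out = a + b;
  return false;
}

// Hypothetical encoder state, used only to illustrate the ordering.
struct SketchEncoder {
  uint32_t encoded_size = 0;  // running total of payload bytes
  int reserve_calls = 0;      // how often the (fake) sink was asked to grow

  // Stand-in for sink_.Reserve(); a real sink would allocate here.
  void Reserve(uint32_t /*additional_bytes*/) { ++reserve_calls; }

  void Put(const std::vector<uint32_t>& lengths) {
    // Sum only the bytes added by this call.
    uint32_t increment = 0;
    for (uint32_t len : lengths) {
      if (AddWithOverflowSketch(increment, len, &increment)) {
        throw std::overflow_error("excess expansion");
      }
    }
    // Check the new total before touching the sink, so an overflowing
    // request never triggers a reallocation.
    uint32_t new_size = 0;
    if (AddWithOverflowSketch(encoded_size, increment, &new_size)) {
      throw std::overflow_error("excess expansion");
    }
    assert(new_size >= encoded_size);
    Reserve(increment);
    encoded_size = new_size;
  }
};
```

In this sketch the checked sum goes into a temporary, so a failed `Put` leaves both `encoded_size` and the sink unchanged. That is a choice made for the illustration and not something the PR itself states.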


