You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "js8544 (via GitHub)" <gi...@apache.org> on 2023/03/23 09:48:51 UTC

[GitHub] [arrow] js8544 opened a new issue, #34702: [C++] BaseBinaryBuilder::Reserve doesn't reserve offsets_builder_

js8544 opened a new issue, #34702:
URL: https://github.com/apache/arrow/issues/34702

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   It seems to me that `BaseBinaryBuilder` doesn't override `Reserve`, so it's using `ArrayBuilder::Reserve`, which doesn't reserve space for `BaseBinaryBuilder::offsets_builder_`.
   
   However, some methods assume that `Reserve` does extend capacity for `offsets_builder_` such as 
   ```cpp
     Status AppendNulls(int64_t length) final {
       const int64_t num_bytes = value_data_builder_.length();
       ARROW_RETURN_NOT_OK(Reserve(length));
       for (int64_t i = 0; i < length; ++i) {
         offsets_builder_.UnsafeAppend(static_cast<offset_type>(num_bytes));
       }
       UnsafeAppendToBitmap(length, false);
       return Status::OK();
     }
   ```
   and
   ```cpp
     Status AppendEmptyValues(int64_t length) final {
       const int64_t num_bytes = value_data_builder_.length();
       ARROW_RETURN_NOT_OK(Reserve(length));
       for (int64_t i = 0; i < length; ++i) {
         offsets_builder_.UnsafeAppend(static_cast<offset_type>(num_bytes));
       }
       UnsafeAppendToBitmap(length, true);
       return Status::OK();
     }
   ```
   , while some methods don't: 
   ```cpp
     Status AppendValues(const std::vector<std::string>& values,
                         const uint8_t* valid_bytes = NULLPTR) {
       std::size_t total_length = std::accumulate(
           values.begin(), values.end(), 0ULL,
           [](uint64_t sum, const std::string& str) { return sum + str.size(); });
       ARROW_RETURN_NOT_OK(Reserve(values.size()));
       ARROW_RETURN_NOT_OK(value_data_builder_.Reserve(total_length));
       ARROW_RETURN_NOT_OK(offsets_builder_.Reserve(values.size())); // explicitly reserve offsets_builder_
       // ...
   ```
   and
   ```cpp
     Status Append(const uint8_t* value, offset_type length) {
       ARROW_RETURN_NOT_OK(Reserve(1));
       ARROW_RETURN_NOT_OK(AppendNextOffset()); // AppendNextOffset is used instead of UnsafeAppendNextOffset
       // Safety check for UBSAN.
       if (ARROW_PREDICT_TRUE(length > 0)) {
         ARROW_RETURN_NOT_OK(ValidateOverflow(length));
         ARROW_RETURN_NOT_OK(value_data_builder_.Append(value, length));
       }
   
       UnsafeAppendToBitmap(true);
       return Status::OK();
     }
   ```
   If I understand correctly, this means:
   1. Some of the methods like `AppendNulls` and `AppendEmptyValues` are unsafe.
   2. Users cannot safely use the `UnsafeAppend` methods because they can't reserve `offsets_builder_`.
   
   However, I am really skepetical of myself: there should have been many memory error reports if my understand is correct, but there aren't any. Am I missing something?
   
   
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] js8544 closed issue #34702: [C++] BaseBinaryBuilder::Reserve doesn't reserve offsets_builder_

Posted by "js8544 (via GitHub)" <gi...@apache.org>.
js8544 closed issue #34702: [C++] BaseBinaryBuilder::Reserve doesn't reserve offsets_builder_
URL: https://github.com/apache/arrow/issues/34702


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org