Posted to github@arrow.apache.org by "felipecrv (via GitHub)" <gi...@apache.org> on 2023/07/18 01:03:06 UTC

[GitHub] [arrow] felipecrv commented on issue #36708: `run_end_encode` segfaults on chunked arrays.

felipecrv commented on issue #36708:
URL: https://github.com/apache/arrow/issues/36708#issuecomment-1639118127

   Chunked arrays are handled by the kernel and are even covered by the unit tests, but the repro code above does indeed trigger the SIGSEGV.
   
   I traced the SIGSEGV to an expectation in the allocation code that does not hold.
   
   ```cpp
   Result<std::shared_ptr<ArrayData>> PreallocateValuesArray(
       const std::shared_ptr<DataType>& value_type, bool has_validity_buffer, int64_t length,
       int64_t null_count, MemoryPool* pool, int64_t data_buffer_size) {
     std::vector<std::shared_ptr<Buffer>> values_data_buffers;
     std::shared_ptr<Buffer> validity_buffer = NULLPTR;
     if (has_validity_buffer) {
       ARROW_ASSIGN_OR_RAISE(validity_buffer, AllocateEmptyBitmap(length, pool));
       DCHECK(validity_buffer);
     }
     ARROW_ASSIGN_OR_RAISE(auto values_buffer, AllocateValuesBuffer(length, *value_type,
                                                                    pool, data_buffer_size));
     if (is_base_binary_like(value_type->id())) {
       const int offset_byte_width = offset_bit_width(value_type->id()) / 8;
       ARROW_ASSIGN_OR_RAISE(auto offsets_buffer,
                             AllocateBuffer((length + 1) * offset_byte_width, pool));
       // Ensure the first offset is zero
       memset(offsets_buffer->mutable_data(), 0, offset_byte_width);
       offsets_buffer->ZeroPadding();
       values_data_buffers = {validity_buffer, std::move(offsets_buffer),
                              std::move(values_buffer)};
     } else {
       values_data_buffers = {validity_buffer, std::move(values_buffer)};
     }
     auto data = ArrayData::Make(value_type, length, values_data_buffers, null_count);
     DCHECK(!has_validity_buffer || validity_buffer.use_count() == 2);
     DCHECK(!has_validity_buffer || validity_buffer != NULLPTR);
     DCHECK(!has_validity_buffer || data->buffers[0] != NULLPTR);
     return data;
   }
   ```
   
   `DCHECK(!has_validity_buffer || data->buffers[0] != NULLPTR);` fails even though all the previous checks pass, and I still don't understand why.
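   
   One observation that might narrow this down: `values_data_buffers` is passed to `ArrayData::Make` as an lvalue, so the local vector keeps its own reference to the validity buffer. If `data` also held a reference, `use_count()` would be 3, not 2, so the `use_count() == 2` check passing is itself consistent with `data->buffers[0]` being null. The sketch below reproduces that counting with standard-library types only. `FakeBuffer`, `FakeArrayData`, `MakeData` and `UseCountAfterMake` are hypothetical stand-ins, not Arrow APIs, and whether `ArrayData::Make` really clears the validity slot (for example when `null_count` is 0) is an assumption that still needs to be checked against the Arrow source.
   
   ```cpp
   #include <memory>
   #include <utility>
   #include <vector>
   
   // Hypothetical stand-ins for illustration only; these are NOT Arrow types.
   struct FakeBuffer {};
   struct FakeArrayData {
     std::vector<std::shared_ptr<FakeBuffer>> buffers;
   };
   
   // Hypothetical factory: takes the buffer vector by value and stores it.
   // When drop_validity is true it clears slot 0 first (ASSUMED behavior,
   // used here only to show the effect on reference counts).
   std::shared_ptr<FakeArrayData> MakeData(
       std::vector<std::shared_ptr<FakeBuffer>> buffers, bool drop_validity) {
     if (drop_validity) buffers[0] = nullptr;
     auto data = std::make_shared<FakeArrayData>();
     data->buffers = std::move(buffers);
     return data;
   }
   
   // Returns validity.use_count() as seen right after the factory call, at the
   // same point where the DCHECKs sit in PreallocateValuesArray.
   long UseCountAfterMake(bool drop_validity) {
     auto validity = std::make_shared<FakeBuffer>();
     std::vector<std::shared_ptr<FakeBuffer>> local = {
         validity, std::make_shared<FakeBuffer>()};
     // `local` is an lvalue, so the vector is copied and `local` keeps its
     // reference to `validity`.
     auto data = MakeData(local, drop_validity);
     return validity.use_count();
   }
   ```
   
   In this sketch `UseCountAfterMake(false)` yields 3 (local variable, local vector, and `data`), while `UseCountAfterMake(true)` yields 2, which matches the combination of passing and failing checks described above.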
   
   
   

