Posted to jira@arrow.apache.org by "Rok Mihevc (Jira)" <ji...@apache.org> on 2022/04/21 17:07:00 UTC
[jira] [Assigned] (ARROW-16147) [C++] ParquetFileWriter doesn't call sink_.Close when using GcsRandomAccessFile
[ https://issues.apache.org/jira/browse/ARROW-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc reassigned ARROW-16147:
----------------------------------
Assignee: Micah Kornfield
> [C++] ParquetFileWriter doesn't call sink_.Close when using GcsRandomAccessFile
> -------------------------------------------------------------------------------
>
> Key: ARROW-16147
> URL: https://issues.apache.org/jira/browse/ARROW-16147
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Rok Mihevc
> Assignee: Micah Kornfield
> Priority: Major
> Labels: GCP
>
> parquet::arrow::FileWriter::Close does not close the underlying sink. The call ends up in FileSerializer::Close:
> {code:cpp}
>   void Close() override {
>     if (is_open_) {
>       // If any functions here raise an exception, we set is_open_ to be false
>       // so that this does not get called again (possibly causing segfault)
>       is_open_ = false;
>       if (row_group_writer_) {
>         num_rows_ += row_group_writer_->num_rows();
>         row_group_writer_->Close();
>       }
>       row_group_writer_.reset();
>       // Write magic bytes and metadata
>       auto file_encryption_properties = properties_->file_encryption_properties();
>       if (file_encryption_properties == nullptr) {  // Non encrypted file.
>         file_metadata_ = metadata_->Finish();
>         WriteFileMetaData(*file_metadata_, sink_.get());
>       } else {  // Encrypted file
>         CloseEncryptedFile(file_encryption_properties);
>       }
>     }
>   }
> {code}
> It never calls sink_->Close(), which leads to resource leaks and bugs.
> With local files this works fine, because they call their own close() in the destructor, but it does not work with fs::GcsRandomAccessFile. When I call parquet::arrow::FileWriter::Close, the data is not flushed to storage until the sink stream is closed manually (or the stack space changes).
> Is this intentional, or is it a bug?
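
The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the Arrow or Parquet APIs: BufferedSink, FooterWriter, and WriteAndClose are hypothetical stand-ins, and the assumption that the sink buffers all bytes until Close() is a simplification of what the report describes for the GCS stream. It shows only the pattern: the writer's Close() writes the footer but leaves the sink open, so the caller has to close the sink explicitly.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffering remote output stream (assumption):
// bytes reach "storage" only when Close() is called.
class BufferedSink {
 public:
  void Write(const std::string& bytes) { pending_ += bytes; }

  void Close() {
    if (closed_) return;  // closing twice is a no-op
    storage_ += pending_;
    pending_.clear();
    closed_ = true;
  }

  bool closed() const { return closed_; }
  const std::string& storage() const { return storage_; }

 private:
  std::string pending_;
  std::string storage_;
  bool closed_ = false;
};

// Hypothetical writer mirroring the reported behavior: Close() writes the
// footer into the sink but never calls sink_->Close().
class FooterWriter {
 public:
  explicit FooterWriter(std::shared_ptr<BufferedSink> sink)
      : sink_(std::move(sink)) {}

  void WriteRows(const std::string& rows) { sink_->Write(rows); }

  void Close() {
    if (!is_open_) return;
    is_open_ = false;
    sink_->Write("FOOTER");
    // Note: the sink is intentionally left open here.
  }

 private:
  std::shared_ptr<BufferedSink> sink_;
  bool is_open_ = true;
};

// Caller-side workaround: close the sink explicitly after closing the writer.
std::string WriteAndClose(const std::shared_ptr<BufferedSink>& sink) {
  FooterWriter writer(sink);
  writer.WriteRows("rows|");
  writer.Close();
  assert(sink->storage().empty());  // nothing has reached storage yet
  sink->Close();                    // the explicit close flushes the data
  return sink->storage();
}
```

In this sketch the data only becomes visible once the caller closes the sink. Whether the writer itself should own that step is the question the report raises.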