Posted to jira@arrow.apache.org by "Rok Mihevc (Jira)" <ji...@apache.org> on 2022/04/21 17:07:00 UTC

[jira] [Assigned] (ARROW-16147) [C++] ParquetFileWriter doesn't call sink_.Close when using GcsRandomAccessFile

     [ https://issues.apache.org/jira/browse/ARROW-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rok Mihevc reassigned ARROW-16147:
----------------------------------

    Assignee: Micah Kornfield

> [C++] ParquetFileWriter doesn't call sink_.Close when using GcsRandomAccessFile
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-16147
>                 URL: https://issues.apache.org/jira/browse/ARROW-16147
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Rok Mihevc
>            Assignee: Micah Kornfield
>            Priority: Major
>              Labels: GCP
>
> parquet::arrow::FileWriter::Close does not close the underlying sink. The call ends up in FileSerializer::Close:
> {code:cpp}
> void Close() override {
>   if (is_open_) {
>     // If any functions here raise an exception, we set is_open_ to be false
>     // so that this does not get called again (possibly causing segfault)
>     is_open_ = false;
>     if (row_group_writer_) {
>       num_rows_ += row_group_writer_->num_rows();
>       row_group_writer_->Close();
>     }
>     row_group_writer_.reset();
>     // Write magic bytes and metadata
>     auto file_encryption_properties = properties_->file_encryption_properties();
>     if (file_encryption_properties == nullptr) {  // Non encrypted file.
>       file_metadata_ = metadata_->Finish();
>       WriteFileMetaData(*file_metadata_, sink_.get());
>     } else {  // Encrypted file
>       CloseEncryptedFile(file_encryption_properties);
>     }
>   }
> }
> {code}
> It doesn't call sink_->Close(), which leads to resource leaks and bugs.
> With local files (which call their own close() in the destructor) this works fine, but it doesn't work with fs::GcsRandomAccessFile. When I call parquet::arrow::FileWriter::Close, the data is not flushed to storage until the sink stream is closed manually (or the stack space changes).
> Is this intentional, or is it a bug?
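
The reported behavior can be sketched in isolation. The snippet below is only an illustration under stated assumptions: BufferedRemoteSink and SketchFileWriter are hypothetical stand-ins, not Arrow or Parquet classes, and the sink is assumed to make its bytes visible only when Close() is called. The writer mirrors the quoted Close() in that it writes the footer but never closes the sink.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffered remote output stream (NOT an Arrow class).
// Assumption: written bytes become visible in `storage` only after Close().
class BufferedRemoteSink {
 public:
  explicit BufferedRemoteSink(std::string* storage) : storage_(storage) {}

  void Write(const std::string& data) {
    assert(!closed_ && "write after close");
    pending_ += data;
  }

  void Close() {
    if (closed_) return;
    closed_ = true;
    storage_->append(pending_);  // the "upload" is finalized only here
    pending_.clear();
  }

 private:
  std::string* storage_;
  std::string pending_;
  bool closed_ = false;
};

// Hypothetical stand-in for the file writer (NOT the Parquet implementation).
// Like the Close() quoted in the report, it writes the footer but does not
// close the sink it was given.
class SketchFileWriter {
 public:
  explicit SketchFileWriter(std::shared_ptr<BufferedRemoteSink> sink)
      : sink_(std::move(sink)) {}

  void WriteRowGroup(const std::string& bytes) { sink_->Write(bytes); }

  void Close() {
    if (!is_open_) return;
    is_open_ = false;
    sink_->Write("PAR1");  // footer magic; no sink_->Close() here
  }

 private:
  std::shared_ptr<BufferedRemoteSink> sink_;
  bool is_open_ = true;
};

// Returns what is visible in storage after the writer is closed, optionally
// followed by an explicit close of the sink (the caller-side workaround).
std::string VisibleAfterWriterClose(bool close_sink_explicitly) {
  std::string storage;
  auto sink = std::make_shared<BufferedRemoteSink>(&storage);
  SketchFileWriter writer(sink);
  writer.WriteRowGroup("data");
  writer.Close();
  if (close_sink_explicitly) sink->Close();
  return storage;
}
```

In this sketch, VisibleAfterWriterClose(false) returns an empty string because nothing has reached storage, while VisibleAfterWriterClose(true) returns the row group bytes followed by the footer. That corresponds to the manual close of the sink stream mentioned in the report.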



--
This message was sent by Atlassian Jira
(v8.20.7#820007)