Posted to github@arrow.apache.org by "conradludgate (via GitHub)" <gi...@apache.org> on 2023/12/29 10:40:51 UTC

[I] parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter [arrow-rs]

conradludgate opened a new issue, #5253:
URL: https://github.com/apache/arrow-rs/issues/5253

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   I want to access the `FileMetadata` from a closed parquet file so that I can add some logging, but I also need to access the inner writer for further processing.
   
   **Describe the solution you'd like**
   
   `SerializedFileWriter` offers:
   * `into_inner() -> Result<W>`
   * `close() -> Result<FileMetadata>`
   
   The bodies of both functions are almost identical. Perhaps `close` could return `Result<(FileMetadata, W)>` instead, or a new method could return both.
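
A minimal sketch of the shape being proposed, using only the standard library. Everything here is hypothetical and for illustration: `SketchWriter`, `FileSummary`, `write_row`, and the `END1` footer are stand-ins, not the parquet crate's types or API. The idea is that one `finish` method does the shared teardown and returns both values, and the two existing methods become thin wrappers over it.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for the file metadata; not parquet's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct FileSummary {
    pub num_rows: u64,
    pub bytes_written: u64,
}

/// Hypothetical writer wrapper; not the real `SerializedFileWriter`.
pub struct SketchWriter<W: Write> {
    inner: W,
    num_rows: u64,
    bytes_written: u64,
}

impl<W: Write> SketchWriter<W> {
    pub fn new(inner: W) -> Self {
        SketchWriter { inner, num_rows: 0, bytes_written: 0 }
    }

    /// Append one opaque row and update the running counters.
    pub fn write_row(&mut self, row: &[u8]) -> io::Result<()> {
        self.inner.write_all(row)?;
        self.num_rows += 1;
        self.bytes_written += row.len() as u64;
        Ok(())
    }

    /// Shared teardown: write a placeholder footer, flush so that I/O
    /// errors are surfaced, then hand back both the summary and the writer.
    pub fn finish(mut self) -> io::Result<(FileSummary, W)> {
        let footer = b"END1"; // placeholder footer, assumption for the sketch
        self.inner.write_all(footer)?;
        self.bytes_written += footer.len() as u64;
        self.inner.flush()?;
        let summary = FileSummary {
            num_rows: self.num_rows,
            bytes_written: self.bytes_written,
        };
        Ok((summary, self.inner))
    }

    /// Existing-style method, expressed in terms of `finish`.
    pub fn close(self) -> io::Result<FileSummary> {
        self.finish().map(|(summary, _)| summary)
    }

    /// Existing-style method, expressed in terms of `finish`.
    pub fn into_inner(self) -> io::Result<W> {
        self.finish().map(|(_, inner)| inner)
    }
}
```

With a `Vec<u8>` as the inner writer, a caller could log the summary and still keep the buffer for further processing, for example `let (summary, buf) = writer.finish()?;`.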
   
   **Describe alternatives you've considered**
   
   For now, I will use `into_inner()` and then open the file with `SerializedFileReader` to get the metadata.
   
   **Additional context**
   
   1. `close()` does not flush the underlying writer, so any error that a flush would have reported is silently ignored.
   2. I would like async support, but I don't want to go through arrow. For now I am writing to an in-memory buffer and then sending that buffer over the network after I close the file.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold closed issue #5253: parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter
URL: https://github.com/apache/arrow-rs/issues/5253



Re: [I] parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5253:
URL: https://github.com/apache/arrow-rs/issues/5253#issuecomment-1871972067

   Adding a finish method that returns both makes sense to me



Re: [I] parquet: add method to get both the inner writer and the file metadata when closing SerializedFileWriter [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5253:
URL: https://github.com/apache/arrow-rs/issues/5253#issuecomment-1998862618

   `label_issue.py` automatically added labels {'parquet'} from #5471
