You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/30 06:18:32 UTC

[GitHub] [arrow-rs] liyongjing opened a new issue, #1969: close function instead of mutable reference

liyongjing opened a new issue, #1969:
URL: https://github.com/apache/arrow-rs/issues/1969

   **Describe the bug**
   A clear and concise description of what the bug is.
   ```rust
   use super::builder;
   use arrow::record_batch::RecordBatch;
   use parquet::{
       arrow::ArrowWriter,
       basic::Compression,
       file::properties::{WriterProperties, WriterVersion},
   };
   use std::{fs::File, path::Path};
   
   pub struct Writer {
       writer: ArrowWriter<File>,
   }
   
   impl Writer {
       pub fn new<P: AsRef<Path>>(path: P) -> Self {
           let schema = builder::get_schema();
           let file = File::create(&path).unwrap();
           let props = WriterProperties::builder().build();
           let writer = ArrowWriter::try_new(file, schema, Some(props)).unwrap();
           Self { writer }
       }
   
       pub fn write(&mut self, batch: &RecordBatch) {
           self.writer.write(batch).expect("Writing batch");
       }
   
       pub fn close(&mut self) {
           // writer must be closed to write footer
           self.writer.close().unwrap();
       }
   }
   
   ```
   
   **To Reproduce**
   Steps to reproduce the behavior:
   ```
      |         self.writer.close().unwrap();
      |         ^^^^^^^^^^^ move occurs because `self.writer` has type `ArrowWriter<std::fs::File>`, which does not implement the `Copy` trait
   ```
   
   **Expected behavior**
   A clear and concise description of what you expected to happen.
   
   `write` and `close` use the same parameters
   
   
   parquet-17.0.0/src/arrow/arrow_writer/mod.rs 222 line
   ```rust
       /// Close and finalize the underlying Parquet writer
       pub fn close(&mut self) -> Result<parquet_format::FileMetaData> {
           self.flush()?;
           self.writer.close()
       }
   ```
   parquet-17.0.0/src/file/writer.rs 167 line
   ```rust
       pub fn close(&mut self) -> Result<parquet::FileMetaData> {
           self.assert_previous_writer_closed()?;
           let metadata = self.write_metadata()?;
           Ok(metadata)
       }
   ```
   
   **Additional context**
   Add any other context about the problem here.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] liyongjing commented on issue #1969: close function instead of mutable reference

Posted by GitBox <gi...@apache.org>.
liyongjing commented on issue #1969:
URL: https://github.com/apache/arrow-rs/issues/1969#issuecomment-1172503242

   thanks for you comment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1969: close function instead of mutable reference

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1969:
URL: https://github.com/apache/arrow-rs/issues/1969#issuecomment-1172023711

   How about?
   ```
   pub fn close(path: &str) -> Result<()> {
       let mut g = global_channel().lock().unwrap()
       if let Some(v) = g.remove(path) {
           v.close()?;
       }
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] liyongjing commented on issue #1969: close function instead of mutable reference

Posted by GitBox <gi...@apache.org>.
liyongjing commented on issue #1969:
URL: https://github.com/apache/arrow-rs/issues/1969#issuecomment-1171850627

   I call `write` multiple times, call `close` one times using static HashMap.
   ```rust
   fn global() -> &'static Mutex<HashMap<String, Writer>> {
       static INSTANCE: OnceCell<Mutex<HashMap<String, Writer>>> = OnceCell::new();
       INSTANCE.get_or_init(|| Mutex::new(HashMap::new()))
   }
   pub fn write(path: &str, val: Vec<u8>) {
       let mut g = global_channel().lock().unwrap();
       let w = g.entry(path.to_string()).or_insert(Writer::new(path));
       let batch =  ...;
       w.write(&batch);
   }
   pub fn close(path: &str) {
       let mut g = global_channel().lock().unwrap();
       if let Some(v) = g.get_mut(path) {
           // *v don't copy
           v.close();
       }
       g.remove(path);
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1969: close function instead of mutable reference

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1969:
URL: https://github.com/apache/arrow-rs/issues/1969#issuecomment-1171016207

   This was an intentional change, to truncate the lifetime of the writer on calling close. See this thread for more context https://github.com/apache/arrow-rs/pull/1719#discussion_r878742614
   
   The change is more obviously beneficial on the ColumnWriter, etc... but you can also see it on ArrowWriter. Consider
   
   ```
   let file = File::create(&path).unwrap();
   let writer = ArrowWriter::try_new(&mut file, ...);
   
   ...
   
   writer.close();
   
   file.rewind().unwrap();
   ```
   
   Without this change you need to manually drop the `writer` in order to reuse the file. We also benefit from the borrow checker making sure we can't call close multiple times.
   
   If you are integrating with some system that requires `&mut self` for some reason, Option::take should allow you to get the previous semantics.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] liyongjing closed issue #1969: close function instead of mutable reference

Posted by GitBox <gi...@apache.org>.
liyongjing closed issue #1969: close function instead of mutable reference
URL: https://github.com/apache/arrow-rs/issues/1969


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org