Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/11/20 19:53:26 UTC

[GitHub] [arrow] carols10cents opened a new pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

carols10cents opened a new pull request #8726:
URL: https://github.com/apache/arrow/pull/8726


   Similar to the [`SliceableCursor`](https://github.com/apache/arrow/blob/0e841aa666b637e24e2889acab7621aa01fb7bcf/rust/parquet/src/util/cursor.rs#L22-L27) type that provides a convenience for reading Parquet from memory, I would like to propose a type to make it convenient to write Parquet to memory.
   
   Clients can implement this themselves today, but the need seems common enough that it is worth providing for everyone to use.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] sunchao commented on a change in pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

Posted by GitBox <gi...@apache.org>.
sunchao commented on a change in pull request #8726:
URL: https://github.com/apache/arrow/pull/8726#discussion_r527982715



##########
File path: rust/parquet/src/util/cursor.rs
##########
@@ -129,6 +132,56 @@ impl Seek for SliceableCursor {
     }
 }
 
+/// Use this type to write Parquet to memory rather than a file.
+#[derive(Debug, Default, Clone)]
+pub struct WriteableCursor {

Review comment:
       nit: it's not easy to see that this writes to memory. Maybe we can make the name more explicit, like `InMemoryWriteableCursor` or similar?







[GitHub] [arrow] alamb closed pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #8726:
URL: https://github.com/apache/arrow/pull/8726


   





[GitHub] [arrow] carols10cents commented on pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

Posted by GitBox <gi...@apache.org>.
carols10cents commented on pull request #8726:
URL: https://github.com/apache/arrow/pull/8726#issuecomment-732260657


   @sunchao Thank you for the suggestion for renaming the type to `InMemoryWriteableCursor`! I do think that's clearer. I've renamed the type, rebased this branch, and resolved merge conflicts, so I think this is ready to go pending CI.





[GitHub] [arrow] github-actions[bot] commented on pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8726:
URL: https://github.com/apache/arrow/pull/8726#issuecomment-731385061


   https://issues.apache.org/jira/browse/ARROW-10667





[GitHub] [arrow] alamb commented on a change in pull request #8726: ARROW-10667: [Rust] [Parquet] Add a convenience type for writing Parquet to memory

Posted by GitBox <gi...@apache.org>.
alamb commented on a change in pull request #8726:
URL: https://github.com/apache/arrow/pull/8726#discussion_r527978914



##########
File path: rust/parquet/src/util/cursor.rs
##########
@@ -129,6 +132,56 @@ impl Seek for SliceableCursor {
     }
 }
 
+/// Use this type to write Parquet to memory rather than a file.
+#[derive(Debug, Default, Clone)]
+pub struct WriteableCursor {
+    buffer: Arc<Mutex<Cursor<Vec<u8>>>>,
+}
+
+impl WriteableCursor {
+    /// Consume this instance and return the underlying buffer as long as there are no other
+    /// references to this instance.
+    pub fn into_inner(self) -> Option<Vec<u8>> {
+        Arc::try_unwrap(self.buffer)
+            .ok()
+            .and_then(|mutex| mutex.into_inner().ok())
+            .map(|cursor| cursor.into_inner())
+    }
+
+    /// Returns a clone of the underlying buffer
+    pub fn data(&self) -> Vec<u8> {
+        let inner = self.buffer.lock().unwrap();
+        inner.get_ref().to_vec()
+    }
+}
+
+impl TryClone for WriteableCursor {
+    fn try_clone(&self) -> std::io::Result<Self> {
+        Ok(Self {
+            buffer: self.buffer.clone(),

Review comment:
       This is clever. 👍
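
For readers following the thread, here is a minimal, self-contained sketch of why cloning the `Arc` in `try_clone` works: every clone is a second handle to the same buffer, so bytes written through one handle are visible through the others. This is not the code from the PR. The hunk quoted above is truncated, so the `Write` and `Seek` impls, the locally declared `TryClone` trait (a hypothetical stand-in for the parquet crate's trait), and the `demo` function are all assumptions made for illustration. The type uses the `InMemoryWriteableCursor` name suggested in review.

```rust
use std::io::{Cursor, Result, Seek, SeekFrom, Write};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the parquet crate's `TryClone` trait, declared
/// locally so this sketch compiles without external crates.
trait TryClone: Sized {
    fn try_clone(&self) -> Result<Self>;
}

/// In-memory writer whose clones all share one underlying buffer.
#[derive(Debug, Default, Clone)]
struct InMemoryWriteableCursor {
    buffer: Arc<Mutex<Cursor<Vec<u8>>>>,
}

impl InMemoryWriteableCursor {
    /// Returns the buffer only if this is the last remaining handle.
    fn into_inner(self) -> Option<Vec<u8>> {
        Arc::try_unwrap(self.buffer)
            .ok()
            .and_then(|mutex| mutex.into_inner().ok())
            .map(|cursor| cursor.into_inner())
    }

    /// Returns a copy of the bytes written so far.
    fn data(&self) -> Vec<u8> {
        self.buffer.lock().unwrap().get_ref().to_vec()
    }
}

impl TryClone for InMemoryWriteableCursor {
    fn try_clone(&self) -> Result<Self> {
        // Cloning the Arc yields another handle to the same buffer.
        Ok(Self {
            buffer: self.buffer.clone(),
        })
    }
}

// Assumed impls: the quoted hunk ends before showing how writes are forwarded.
impl Write for InMemoryWriteableCursor {
    fn write(&mut self, buf: &[u8]) -> Result<usize> {
        self.buffer.lock().unwrap().write(buf)
    }

    fn flush(&mut self) -> Result<()> {
        self.buffer.lock().unwrap().flush()
    }
}

impl Seek for InMemoryWriteableCursor {
    fn seek(&mut self, pos: SeekFrom) -> Result<u64> {
        self.buffer.lock().unwrap().seek(pos)
    }
}

/// Writes through a cloned handle, then reads the bytes back via the original.
fn demo() -> Result<(Vec<u8>, bool, Option<Vec<u8>>)> {
    let cursor = InMemoryWriteableCursor::default();

    // A writer would own `handle`; the caller keeps `cursor`.
    let mut handle = cursor.try_clone()?;
    handle.write_all(b"PAR1")?;
    let seen = cursor.data();

    // While other handles are alive, `into_inner` cannot take the buffer.
    let blocked = cursor.try_clone()?.into_inner().is_none();

    // Once the writer's handle is dropped, the last handle can reclaim it.
    drop(handle);
    Ok((seen, blocked, cursor.into_inner()))
}
```

Calling `demo()` from `main` or from a test exercises the sketch. The `Option` returned by `into_inner` reflects the trade-off of this design: the buffer can be moved out without copying only after every other handle has been dropped, while `data` always works but copies the bytes.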



