Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/11 19:38:00 UTC

[GitHub] [arrow-rs] pacman82 opened a new issue, #1687: Support writing parquet to stdout

pacman82 opened a new issue, #1687:
URL: https://github.com/apache/arrow-rs/issues/1687

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   
   I would like to write parquet files to "true" streams, e.g. stdout. This is in the context of the downstream `odbc2parquet` tool, for which I would like to provide this option. It would allow my users to stream the parquet output directly into a key-value store or another sink using just pipes in their shell.
   
   **Describe the solution you'd like**
   I would like to see the `Seek` + `TryClone` requirement for initializing a `SerializedFileWriter` dropped. From what I've seen, at least the `Seek` requirement is used to determine the length of the metadata written into the stream, or to track the stream position in general. I feel `Seek` is too strong a requirement just to keep track of a position or the number of bytes written.
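
To illustrate the point about position tracking, here is a minimal sketch using only the standard library. `CountingWriter` and `demo` are hypothetical names made up for this example and are not part of the parquet crate. The wrapper forwards every write to the inner sink and counts the bytes accepted, which is enough to know where a block starts and how long it is without ever calling `seek`.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper (not part of the parquet crate): forwards writes to
/// the inner sink and counts how many bytes the sink accepted.
struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: usize,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes_written: 0 }
    }

    /// Current stream position, equal to the number of bytes written so far.
    fn bytes_written(&self) -> usize {
        self.bytes_written
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Only count what the inner sink actually accepted.
        let n = self.inner.write(buf)?;
        self.bytes_written += n;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Writes a magic marker, some placeholder data, and a placeholder metadata
/// block, then returns (metadata start offset, metadata length).
fn demo() -> io::Result<(usize, usize)> {
    let mut out = CountingWriter::new(Vec::new());
    out.write_all(b"PAR1")?; // leading magic bytes
    out.write_all(&[0u8; 10])?; // placeholder for row group data
    let metadata_start = out.bytes_written();
    out.write_all(b"metadata")?; // placeholder for the serialized metadata
    let metadata_len = out.bytes_written() - metadata_start;
    Ok((metadata_start, metadata_len))
}
```

Because the wrapper only requires `Write`, the same code would work with `std::io::stdout()` in place of the `Vec<u8>` used in `demo`.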
   
   **Describe alternatives you've considered**
   I have not considered any alternatives. Happy to hear about them, though.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-rs] tustvold commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1133632599

   I'm currently working on this as part of fixing #1717 




[GitHub] [arrow-rs] tustvold closed issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
tustvold closed issue #1687: Support writing parquet to stdout
URL: https://github.com/apache/arrow-rs/issues/1687




[GitHub] [arrow-rs] tustvold commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1124872693

   As described in https://github.com/apache/arrow-rs/issues/937, it should be relatively straightforward to drop the `Seek` requirement. A PR would be most welcome; otherwise I can try to take a stab at it when I have time.




[GitHub] [arrow-rs] pacman82 commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
pacman82 commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1124970780

   Great. Sorry for missing the existing issue. I looked, but not thoroughly enough, it seems. I also found that it would be rather straightforward to drop it. This might become my first contribution, if I am not kept busy with issues on the downstream artefacts. My intention with this issue was exactly to verify that such a PR would indeed be welcome.
   
   Thanks for the quick response!




[GitHub] [arrow-rs] alamb commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1124867487

   This may be related to some work @tustvold  has planned for the parquet reader such as https://github.com/apache/arrow-rs/issues/1605
   
   I haven't heard him discuss anything about the writer yet though




[GitHub] [arrow-rs] alamb commented on issue #1687: Support writing parquet to `stdout`

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1151134975

   planned for release in 16.0.0 (eta early next week)




[GitHub] [arrow-rs] pacman82 commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
pacman82 commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1133664190

   @tustvold Great! This will unblock new features in the downstream crate `odbc2parquet`. 🙇 




[GitHub] [arrow-rs] pacman82 commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
pacman82 commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1134680886

   This is great. I'm not sure what the etiquette here is. Am I supposed to close this issue, or do the maintainers do so? For me the change in signature is enough to verify that it solves my use-case.




[GitHub] [arrow-rs] alamb commented on issue #1687: Support writing parquet to stdout

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1687:
URL: https://github.com/apache/arrow-rs/issues/1687#issuecomment-1134535755

   Specifically, https://github.com/apache/arrow-rs/pull/1719 allows `SerializedFileWriter` to write to anything that implements `std::io::Write`, as is common in the Rust ecosystem 🎉 
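
As a rough illustration of what a bound on `std::io::Write` alone buys the caller, here is a standard-library-only sketch. `write_rows` is a hypothetical stand-in invented for this example; it is not the parquet crate's API and it writes plain text rather than Parquet. The point is only that one generic function can serve an in-memory buffer, a file, or stdout.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a writer API that only requires `Write`.
/// It writes one row per line and hands the sink back to the caller.
fn write_rows<W: Write>(mut sink: W, rows: &[i32]) -> io::Result<W> {
    for row in rows {
        writeln!(sink, "{}", row)?;
    }
    sink.flush()?;
    Ok(sink)
}
```

Calling `write_rows(std::io::stdout().lock(), &[1, 2, 3])` streams the rows to stdout, so the output can be piped in a shell, while `write_rows(Vec::new(), &[1, 2, 3])` collects the same bytes in memory.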

