Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/22 00:41:48 UTC

[GitHub] [beam] Shiv22Wabale opened a new issue, #21970: [Feature Request]: Python SDK | File IO | Dynamically update the parent directory.

Shiv22Wabale opened a new issue, #21970:
URL: https://github.com/apache/beam/issues/21970

   ### What would you like to happen?
   
   We want to write to separate directories from the Apache Beam Python SDK using `fileio.WriteToFiles`. However, it only offers a way to write to separate [files](https://github.com/apache/beam/blob/316c969b9c373177986ee0fbc794ed2887244328/sdks/python/apache_beam/io/fileio.py#L678-L679), not to separate parent directories.
   
   We are writing to GCS (Google Cloud Storage), which has no real directories or folders; it is a flat structure. Writing a new file with a different prefix therefore requires no folder creation.
   
   I ran a small test by changing
   ```
   final_full_path = filesystems.FileSystems.join(
       self.path.get(), final_file_name)
   ```
   to 
   ```
   final_full_path = final_file_name
   ```
   
   With the whole path provided through `file_naming=`, it successfully wrote to two separate paths:
   ```
   1. /bigstore/customer-A/dfasfasdfasdfas/...
   
   2. /bigstore/customer-B/jlkjjoijmnyiouuy/...
   ```
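
To make the difference concrete, here is a minimal sketch in plain Python that does not import Beam. The six-argument signature follows what `fileio.WriteToFiles` passes to a `file_naming` callable. Everything else is an assumption made for illustration: the function names `per_destination_naming` and `join_under_base`, the `part-...` file name pattern, and the `/bigstore/output` base path are hypothetical, and `join_under_base` is only a simplified stand-in for `filesystems.FileSystems.join`, not Beam's implementation.

```python
def per_destination_naming(window, pane, shard_index, total_shards,
                           compression, destination):
    # Hypothetical naming callable: `destination` is assumed to already be
    # the full parent path, for example "/bigstore/customer-A".
    return "{}/part-{:05d}-of-{:05d}.txt".format(
        destination.rstrip("/"), shard_index, total_shards)


def join_under_base(base, name):
    # Simplified stand-in for FileSystems.join (an assumption, not Beam's
    # code): the name is always nested under the single base path.
    return base.rstrip("/") + "/" + name.lstrip("/")


name_a = per_destination_naming(None, None, 0, 2, None, "/bigstore/customer-A")
name_b = per_destination_naming(None, None, 1, 2, None, "/bigstore/customer-B")

# Behaviour with the join: every file ends up under the one base path.
joined = join_under_base("/bigstore/output", name_a)

# Behaviour with the proposed change: the returned name is used as-is,
# so each destination can live under its own parent path.
direct = name_a
```

With the join in place, both customers' files are nested under the same base path; with the name used directly, each file goes to the parent path chosen by the naming callable.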
   
   We would like to have this added as a feature; it would be beneficial when using flat-structure file systems like GCS.
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: io-py-files


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] kennknowles commented on issue #21970: [Feature Request]: Python SDK | File IO | Dynamically update the parent directory.

Posted by GitBox <gi...@apache.org>.
kennknowles commented on issue #21970:
URL: https://github.com/apache/beam/issues/21970#issuecomment-1335862818

   @johnjcasey @tvalentyn 

