Posted to issues@beam.apache.org by "Mihir Borkar (Jira)" <ji...@apache.org> on 2020/05/22 20:28:00 UTC
[jira] [Created] (BEAM-10068) Modify behavior of Dynamic Destinations
Mihir Borkar created BEAM-10068:
-----------------------------------
Summary: Modify behavior of Dynamic Destinations
Key: BEAM-10068
URL: https://issues.apache.org/jira/browse/BEAM-10068
Project: Beam
Issue Type: Improvement
Components: sdk-java-core
Reporter: Mihir Borkar
The writeDynamic() method, which implements Dynamic Destinations, writes files per destination, per window, per pane.
As a result, the number of generated files grows with the number of destinations, windows, and panes.
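To make the growth concrete, here is a rough back-of-the-envelope sketch in plain Java. It is not Beam code: the class name, the method name, and the workload numbers are made up for illustration, and it assumes every destination receives data in every pane and that each pane is written with a fixed number of shards.

```java
// Illustrative estimate only; OutputFileEstimate is a hypothetical helper, not part of Beam.
final class OutputFileEstimate {

    /**
     * Estimates the file count when every (destination, window, pane)
     * combination receives data and writes shardsPerPane files.
     */
    static long estimateFiles(long destinations, long windows,
                              long panesPerWindow, long shardsPerPane) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(
                Math.multiplyExact(destinations, windows),
                Math.multiplyExact(panesPerWindow, shardsPerPane));
    }

    public static void main(String[] args) {
        // Assumed workload: 50 destinations, 24 hourly windows,
        // 3 panes per window, 1 shard per pane.
        System.out.println(estimateFiles(50, 24, 3, 1)); // prints 3600
    }
}
```

Even with a single shard per pane, the assumed workload above produces thousands of small files per day, which is the behavior this issue asks to make controllable.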
The request is as follows:
Give the user a way to modify the behavior of Dynamic Destinations in order to control the number of output files produced.
a.) Add user-configurable parameters, such as the number of writers per bundle, or increase the number of records processed per bundle
and/or
b.) Introduce a method that implements Dynamic Destinations but depends on the data passing through the pipeline rather than on windows/panes.
So instead of splitting every output file into roughly the number of destinations being written to, we let the user configure how output files should be divided across destinations.
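One possible reading of option b.) can be sketched in plain Java, outside of Beam. Everything here is hypothetical: the class DataDrivenSplitSketch, the plan method, and the maxRecordsPerFile setting are illustrative names, not existing or proposed Beam API. The sketch groups records by destination and starts a new file only when the user-configured record limit is reached, so the file count depends on the data rather than on windows or panes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; not Beam code and not a proposed API.
final class DataDrivenSplitSketch {

    /** Groups records by destination, then cuts each group into files of at most maxRecordsPerFile records. */
    static <T> Map<String, List<List<T>>> plan(
            List<T> records, Function<T, String> destinationFn, int maxRecordsPerFile) {
        if (maxRecordsPerFile < 1) {
            throw new IllegalArgumentException("maxRecordsPerFile must be >= 1");
        }
        // LinkedHashMap keeps destinations in first-seen order.
        Map<String, List<List<T>>> filesByDestination = new LinkedHashMap<>();
        for (T record : records) {
            List<List<T>> files = filesByDestination.computeIfAbsent(
                    destinationFn.apply(record), d -> new ArrayList<>());
            // Start a new file when the destination has none yet or the current one is full.
            if (files.isEmpty() || files.get(files.size() - 1).size() >= maxRecordsPerFile) {
                files.add(new ArrayList<>());
            }
            files.get(files.size() - 1).add(record);
        }
        return filesByDestination;
    }

    public static void main(String[] args) {
        List<String> records = List.of("a:1", "b:1", "a:2", "a:3", "b:2");
        // The destination is the text before the colon; at most two records per file.
        Map<String, List<List<String>>> result =
                plan(records, r -> r.substring(0, r.indexOf(':')), 2);
        System.out.println(result); // prints {a=[[a:1, a:2], [a:3]], b=[[b:1, b:2]]}
    }
}
```

In a real pipeline the records are distributed across workers and bundles, so an actual implementation would need to handle that; the sketch only shows the user-facing knob described in the request.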
Links:
[1] https://beam.apache.org/releases/javadoc/2.19.0/index.html?org/apache/beam/sdk/io/FileIO.html
[2] https://github.com/apache/beam/blob/da9e17288e8473925674a4691d9e86252e67d7d7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
--
This message was sent by Atlassian Jira
(v8.3.4#803005)