Posted to issues@beam.apache.org by "Mihir Borkar (Jira)" <ji...@apache.org> on 2020/05/22 20:29:00 UTC

[jira] [Updated] (BEAM-10068) Modify behavior of Dynamic Destinations

     [ https://issues.apache.org/jira/browse/BEAM-10068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mihir Borkar updated BEAM-10068:
--------------------------------
    Description: 
The writeDynamic() method, which implements Dynamic Destinations, writes files per destination per window per pane.

This leads to an increase in the number of files generated.
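
To make the growth concrete, here is a rough model in plain Java. It is not Beam code, and the class name, method name, and sample data are hypothetical. It only counts the distinct (destination, window, pane) groups in a batch of elements and multiplies by an assumed number of shards written per group.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not part of the Beam SDK: estimates how many files a
// "per destination per window per pane" write strategy would produce.
class DynamicDestinationFileCount {

    // Each element is {destination, window, pane}. shardsPerGroup is an
    // assumption standing in for however many files are written per group.
    static int countOutputFiles(String[][] elements, int shardsPerGroup) {
        Set<String> groups = new HashSet<>();
        for (String[] e : elements) {
            // One group per distinct destination/window/pane combination.
            groups.add(e[0] + "|" + e[1] + "|" + e[2]);
        }
        return groups.size() * shardsPerGroup;
    }

    public static void main(String[] args) {
        String[][] elements = {
            {"orders",  "10:00", "0"},
            {"orders",  "10:00", "0"},  // same group as the line above
            {"orders",  "10:00", "1"},  // a second pane for the same window
            {"refunds", "10:00", "0"},
            {"orders",  "10:05", "0"},
            {"refunds", "10:05", "0"},
        };
        System.out.println(countOutputFiles(elements, 1)); // prints 5
        System.out.println(countOutputFiles(elements, 3)); // prints 15
    }
}
```

Even with only two destinations, every extra window or pane adds another group, so the file count grows with the product of the three dimensions rather than with the number of destinations alone.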

The request is as follows:

Provide a way for the user to modify the behavior of Dynamic Destinations to control the number of output files being produced.

a.) We can consider adding user-configurable parameters, such as the number of writers per bundle, or increasing the number of records processed per bundle

and/or

b.) Introduce a method implementing Dynamic Destinations but more dependent on the data passing through the pipeline, instead of windows/panes.

So instead of splitting the output into roughly as many files as there are destinations being written to, we let the user configure how output files should be divided across destinations.

Links:

[1] [https://beam.apache.org/releases/javadoc/2.19.0/index.html?org/apache/beam/sdk/io/FileIO.html]

[2] [https://github.com/apache/beam/blob/da9e17288e8473925674a4691d9e86252e67d7d7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java]

 

 


> Modify behavior of Dynamic Destinations
> ---------------------------------------
>
>                 Key: BEAM-10068
>                 URL: https://issues.apache.org/jira/browse/BEAM-10068
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Mihir Borkar
>            Priority: P2



--
This message was sent by Atlassian Jira
(v8.3.4#803005)