Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/09/10 00:24:00 UTC

[jira] [Work logged] (BEAM-10703) Add support for auto-sharded GroupIntoBatches in Dataflow runner

     [ https://issues.apache.org/jira/browse/BEAM-10703?focusedWorklogId=481174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-481174 ]

ASF GitHub Bot logged work on BEAM-10703:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Sep/20 00:23
            Start Date: 10/Sep/20 00:23
    Worklog Time Spent: 10m 
      Work Description: nehsyc commented on a change in pull request #12678:
URL: https://github.com/apache/beam/pull/12678#discussion_r485991869



##########
File path: runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
##########
@@ -1264,6 +1268,10 @@ private static void translateFn(
     // in streaming but does not work in batch
     if (context.getPipelineOptions().isStreaming() && isStateful) {
       stepContext.addInput(PropertyNames.USES_KEYED_STATE, "true");

Review comment:
   Update on this:

   I added an experiment to gate auto-sharding, so this can be merged without waiting for the backend. It will also make testing easier.

   I also added a check for the "beam_fn_api" experiment. My intention was to disable the feature for the unified worker, but I guess this way auto-sharding is disabled for both the unified worker and the Java worker using the Fn API. I remember that we are not going to support the latter, so that seems fine to me. Let me know if my understanding is incorrect.
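
   For readers following along, the gating described in the comment could be sketched roughly as below. This is only an illustration, not the code in the pull request: the class name, the method name, and the experiment string "example_auto_sharding_experiment" are hypothetical placeholders (the comment does not name the gating experiment), and the experiments are modelled as a plain list of strings rather than Beam pipeline options. Only "beam_fn_api" comes from the comment itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AutoShardingGate {
  // Hypothetical placeholder: the review comment does not name the gating experiment.
  static final String AUTO_SHARDING_EXPERIMENT = "example_auto_sharding_experiment";

  // Experiment named in the review comment; when present, auto-sharding is turned off.
  static final String FN_API_EXPERIMENT = "beam_fn_api";

  /**
   * Returns true only when the gating experiment is present and "beam_fn_api" is absent.
   * A null list is treated as "no experiments set".
   */
  static boolean autoShardingEnabled(List<String> experiments) {
    if (experiments == null) {
      return false;
    }
    return experiments.contains(AUTO_SHARDING_EXPERIMENT)
        && !experiments.contains(FN_API_EXPERIMENT);
  }

  public static void main(String[] args) {
    // Gating experiment set, Fn API not in use.
    System.out.println(autoShardingEnabled(Arrays.asList(AUTO_SHARDING_EXPERIMENT)));
    // Gating experiment set, but "beam_fn_api" also set.
    System.out.println(
        autoShardingEnabled(Arrays.asList(AUTO_SHARDING_EXPERIMENT, FN_API_EXPERIMENT)));
    // No experiments at all.
    System.out.println(autoShardingEnabled(Collections.emptyList()));
  }
}
```

   With a check of this shape, the feature stays off by default and is also off for any pipeline that sets "beam_fn_api", which matches the behaviour described above for both the unified worker and the Java worker using the Fn API.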




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 481174)
    Time Spent: 6h  (was: 5h 50m)

> Add support for auto-sharded GroupIntoBatches in Dataflow runner
> ----------------------------------------------------------------
>
>                 Key: BEAM-10703
>                 URL: https://issues.apache.org/jira/browse/BEAM-10703
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-dataflow
>            Reporter: Siyuan Chen
>            Assignee: Siyuan Chen
>            Priority: P2
>          Time Spent: 6h
>  Remaining Estimate: 0h
>
> The proposal for improving the GroupIntoBatches transform is in BEAM-10475.
> This issue tracks support for it in the Cloud Dataflow runner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)