Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/04 13:44:37 UTC

[GitHub] [beam] damccorm opened a new issue, #19842: Verify that SDF-related functionality is compatible with multiworker environment for FnApiRunner

damccorm opened a new issue, #19842:
URL: https://github.com/apache/beam/issues/19842

    
   
   The FnApiRunner currently avoids splitting deferred inputs for multiple workers: [https://github.com/apache/beam/blob/release-2.16.0/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L771-L793](https://github.com/apache/beam/blob/release-2.16.0/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L771-L793)
   
    
   
   These issues are surfacing as I convert the FnApiRunner to work based on ready elements instead of executing stage by stage: [https://github.com/apache/beam/pull/10067](https://github.com/apache/beam/pull/10067)
   
   We should verify that the work items coming back from parallel SDFs are being merged properly.
   
   The symptom I'm seeing is duplicated element processing in SDF tests:
   ```
   
   TEST: apache_beam.runners.portability.fn_api_runner_test.FnApiRunnerSplitTestWithMultiWorkers.test_split_crazy_sdf

   EXPECTED:
   [
    (5, 0), (5, 1), (5, 2), (5, 3), (5, 4),
    (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8),
    (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7),
    (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7),
    (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8)] ==

   ACTUAL OUTPUT:
   [
    (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7),
    (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8),
    (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7),
    (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 7), (9, 8), (9, 5), (9, 6),
    (5, 0), (5, 1), (5, 2), (5, 3), (5, 4),
    (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7),
    (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8)]
   
   ```
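
To make the duplication easier to see, here is a minimal sketch that compares the two lists above as multisets. It is not part of the Beam test suite. The `expand` helper and the variable names are hypothetical, and the lists are rebuilt from the group sizes in the output above rather than captured from a real run. Element order within a group is ignored, since only the counts matter here.

```python
from collections import Counter


def expand(group_sizes):
    """Hypothetical helper: for each n, emit (n, 0) ... (n, n - 1)."""
    return [(n, i) for n in group_sizes for i in range(n)]


# Group sizes transcribed from the EXPECTED and ACTUAL OUTPUT lists above.
expected = expand([5, 9, 8, 8, 9])
actual = expand([8, 9, 8, 9, 5, 8, 9])

# Counter subtraction keeps only positive counts, so each result holds
# the elements that one side has more of than the other.
extra = Counter(actual) - Counter(expected)
missing = Counter(expected) - Counter(actual)

print("expected elements:", len(expected))
print("actual elements:  ", len(actual))
print("extra in actual:  ", sorted(extra.items()))
print("missing in actual:", sorted(missing.items()))
```

With the lists as transcribed, nothing is missing; the actual output contains one additional full copy of the `8` group and one additional full copy of the `9` group, while the `5` group appears the expected number of times.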
   
   Some comments from Robert:
   
    
   ```
   
   Or the _add_residuals_and_channel_splits_to_deferred_inputs method. Looks like it has side effects?

   Hmm.... looks like this code was added for the multi-worker case. (And the comments and the TODO are unrelated.)

   I think this is in reference to the fact that https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L1826 does not do the right thing yet, but I wonder how that's OK the first time around.... there might be a bug lurking here.
   
   ```
   
   cc: [~boyuanz], [~robertwb]
   
    
   
   Imported from Jira [BEAM-8833](https://issues.apache.org/jira/browse/BEAM-8833). Original Jira may contain additional context.
   Reported by: pabloem.

