Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/07/31 05:15:37 UTC

[GitHub] [beam] reuvenlax opened a new pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

reuvenlax opened a new pull request #15256:
URL: https://github.com/apache/beam/pull/15256


   Reverts apache/beam#15178


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn closed pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
tvalentyn closed pull request #15256:
URL: https://github.com/apache/beam/pull/15256


   





[GitHub] [beam] tvalentyn commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-894043622


   closing in favor of https://github.com/apache/beam/pull/15287





[GitHub] [beam] nehsyc commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
nehsyc commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-891195968


   If I understand correctly, the flakiness was caused by the word count pipeline running in streaming mode on a bounded source. Note that `WriteFiles` implements runner-determined sharding differently for bounded and unbounded sources. So the override was only used for bounded sources in streaming, and without the override the word count pipeline unexpectedly picked up the batch implementation.
   
   Perhaps a better fix would be to modify the override to also check the boundedness of the input, so that unbounded data can still adopt the proper runner-determined sharding implementation for streaming.
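
The suggestion above could be sketched roughly as follows. This is only an illustrative, language-neutral decision rule written in Python, not code from the Beam Dataflow runner: `IsBounded` and `should_apply_override` are hypothetical names, and the sketch assumes the override should apply only to bounded input in a streaming pipeline that leaves the shard count to the runner.

```python
from enum import Enum


class IsBounded(Enum):
    """Hypothetical stand-in for the boundedness of a pipeline input."""
    BOUNDED = "bounded"
    UNBOUNDED = "unbounded"


def should_apply_override(streaming: bool,
                          runner_determined_sharding: bool,
                          boundedness: IsBounded) -> bool:
    """Decide whether the (hypothetical) streaming WriteFiles override applies."""
    # The override is only relevant for streaming pipelines that leave
    # the shard count to the runner.
    if not (streaming and runner_determined_sharding):
        return False
    # Assumption: bounded input keeps the override; unbounded input skips it
    # and uses the streaming runner-determined sharding implementation.
    return boundedness is IsBounded.BOUNDED
```

Under these assumptions, a streaming word count that reads a bounded source would keep the override, while a pipeline with an unbounded source would not.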





[GitHub] [beam] tvalentyn commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-893695593


   retest this please





[GitHub] [beam] reuvenlax commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
reuvenlax commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-891198489


   I'm not convinced that the flakiness was caused by this PR, as it happens
   even when this PR is rolled back.
   
   I'm still confused about how WordCount ever succeeded after this PR was
   submitted: it writes to files, does not specify numShards, and runs on
   the Windmill appliance.
   
   On Mon, Aug 2, 2021 at 10:18 AM Siyuan Chen ***@***.***>
   wrote:
   
   > If I understand it correctly the flakiness was caused by running a
   > streaming pipeline on a bounded source in the word count pipeline. Note
   > that the implementation for runner determined sharding in the WriteFiles
   > is different for bounded and unbounded source. So the override was only
   > used by bounded source in streaming and without the override the word count
   > pipeline unexpectedly picked up the implementation for batch.
   >
   > Perhaps a better fix might be to modify the override to also check the
   > bounded-ness of the input so unbounded data can still adopt proper runner
   > determined sharding implementation for streaming.
   





[GitHub] [beam] reuvenlax commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
reuvenlax commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-890296194


   Run Java_Examples_Dataflow PreCommit





[GitHub] [beam] kileys commented on pull request #15256: Revert "[BEAM-11934] Remove Dataflow override of streaming WriteFiles with runner determined sharding"

Posted by GitBox <gi...@apache.org>.
kileys commented on pull request #15256:
URL: https://github.com/apache/beam/pull/15256#issuecomment-893762193


   There's an ongoing failure for org.apache.beam.examples.WordCountIT.testE2EWordCount unrelated to this change.
   
   https://issues.apache.org/jira/browse/BEAM-12699

