You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "damccorm (via GitHub)" <gi...@apache.org> on 2023/01/20 20:02:06 UTC
[GitHub] [beam] damccorm opened a new pull request, #25106: Move CombineValues override to non portable overrides
damccorm opened a new pull request, #25106:
URL: https://github.com/apache/beam/pull/25106
Right now, running the following pipeline on Dataflow:
```
import logging
import apache_beam as beam
from apache_beam.options.pipeline_options import DebugOptions
def merge(vals):
out = ""
for v in vals:
out += v
return out
def run():
pipeline_options = DebugOptions()
pipeline_options.dataflow_job_file = "path/to/dump/proto"
with beam.Pipeline(options=pipeline_options) as p:
(p
| beam.Create([("key1", "foo"), ("key2", "bar"), ("key1", "baz")])
| beam.GroupByKey()
| beam.CombineValues(merge)
)
if __name__ == '__main__':
logging.getLogger().setLevel(logging.INFO)
run()
```
succeeds when `use_runner_v2` is not included as an experiment, but fails when it is. This is because the translation being done generates an invalid graph for the `use_runner_v2` case: a portable pipeline shouldn't be using the overridden combine here.
This fixes the problem by moving the override to only occur when `use_runner_v2` is not included
------------------------
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
- [ ] Update `CHANGES.md` with noteworthy changes.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/get-started-contributing/#make-the-reviewers-job-easier).
To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
GitHub Actions Tests Status (on master branch)
------------------------------------------------------------------------------------------------
[![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
[![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
[![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
[![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] AnandInguva commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "AnandInguva (via GitHub)" <gi...@apache.org>.
AnandInguva commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400640898
stop reviewer notifications
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400530273
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400567471
Run Python 3.8 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400585112
Run Python 3.8 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400585287
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400519065
Run Python 3.8 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400530083
Run Python 3.8 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400658184
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400657789
Run Python 3.8 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] lukecwik commented on a diff in pull request #25106: Move CombineValues override to non portable overrides
Posted by "lukecwik (via GitHub)" <gi...@apache.org>.
lukecwik commented on code in PR #25106:
URL: https://github.com/apache/beam/pull/25106#discussion_r1084470132
##########
sdks/python/apache_beam/transforms/combiners_test.py:
##########
@@ -873,6 +873,24 @@ def test_with_input_types_decorator_violation(self):
_ = pc | beam.CombineGlobally(self.fn)
+@pytest.mark.it_validatesrunner
+class CombineValuesTest(unittest.TestCase):
+ def test_combinevalues(self):
Review Comment:
nit:
```suggestion
def test_gbk_immediately_followed_by_combine(self):
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400787011
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400786799
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400520266
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1399150490
R: @lukecwik
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1398892074
Run Python 3.9 PostCommit
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm merged pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm merged PR #25106:
URL: https://github.com/apache/beam/pull/25106
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400530465
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400657994
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] lukecwik commented on a diff in pull request #25106: Move CombineValues override to non portable overrides
Posted by "lukecwik (via GitHub)" <gi...@apache.org>.
lukecwik commented on code in PR #25106:
URL: https://github.com/apache/beam/pull/25106#discussion_r1084315923
##########
sdks/python/apache_beam/examples/snippets/transforms/aggregation/combinevalues_it_test.py:
##########
@@ -0,0 +1,39 @@
+import logging
+import pytest
+import unittest
+
+from apache_beam.testing.test_pipeline import TestPipeline
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
+
+from . import combinevalues
+
+class CombineValuesIT(unittest.TestCase):
Review Comment:
I would suggest making this a validates runner test like:
https://github.com/apache/beam/blob/897fffdeda59cc2f121dd502e569770a897fc4ac/sdks/python/apache_beam/transforms/ptransform_test.py#L702
in ptransform_test.py
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400519867
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400585467
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400567657
Run Python Dataflow V2 ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [beam] damccorm commented on pull request #25106: Move CombineValues override to non portable overrides
Posted by "damccorm (via GitHub)" <gi...@apache.org>.
damccorm commented on PR #25106:
URL: https://github.com/apache/beam/pull/25106#issuecomment-1400567826
Run Python Dataflow ValidatesRunner
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@beam.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org