You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/10/19 22:25:06 UTC

[GitHub] [beam] tszerszen opened a new pull request #13142: #5939 deduplicate constants

tszerszen opened a new pull request #13142:
URL: https://github.com/apache/beam/pull/13142


   This should solve [this Jira issue](https://issues.apache.org/jira/browse/BEAM-5939).
   
   apache_beam/runners/dataflow/internal/names.py
   apache_beam/runners/portability/stager.py
   
   had same constants defined in both files, this pull request should resolve it.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-717849131


   @tvalentyn could you please take a look now?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kamilwu merged pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
kamilwu merged pull request #13142:
URL: https://github.com/apache/beam/pull/13142


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-716408129


   @tvalentyn @aaltay could you please take a look?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn commented on a change in pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on a change in pull request #13142:
URL: https://github.com/apache/beam/pull/13142#discussion_r513802306



##########
File path: sdks/python/apache_beam/runners/dataflow/internal/names.py
##########
@@ -27,7 +27,13 @@
 # Standard file names used for staging files.
 from builtins import object
 
-DATAFLOW_SDK_TARBALL_FILE = 'dataflow_python_sdk.tar'
+# pylint: disable=unused-import
+from apache_beam.runners.internal.names import DATAFLOW_SDK_TARBALL_FILE
+from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE

Review comment:
       I think we could avoid importing all of these except for PICKLED_MAIN_SESSION_FILE, and update all references in Beam codeabase to use apache_beam.runners.internal.names instead, similar what we do in apiclient.py.
   
   ```
   # pylint: disable=unused-import
   # Used by Dataflow legacy worker. 
   from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE
   ```

##########
File path: sdks/python/apache_beam/runners/internal/names.py
##########
@@ -20,8 +20,11 @@
 # All constants are for internal use only; no backwards-compatibility
 # guarantees.
 
+DATAFLOW_SDK_TARBALL_FILE = 'dataflow_python_sdk.tar'

Review comment:
       Let's rename this to:
    
   `STAGED_SDK_SOURCES_FILENAME = 'dataflow_python_sdk.tar'  # Current value is hardcoded in Dataflow internal infrastructure; please don't change without a review from Dataflow maintainers.`
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-716887038


   Thanks for your help, @tszerszen .
   
   1. DATAFLOW_SDK_TARBALL_FILE != WORKFLOW_TARBALL_FILE.
   DATAFLOW_SDK_TARBALL_FILE is the name used to stage sources of apache beam sdk to staging location.
   WORKFLOW_TARBALL_FILE is a tarball file representing sources of the pipeline in case the pipeline is wrapped into a python package.
   
   2. It would be easier to review if you separated yapf changes in a separate commit. Are changes in groupby_test.py relevant?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-718962218


   Run Python 3.7 PostCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn commented on a change in pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on a change in pull request #13142:
URL: https://github.com/apache/beam/pull/13142#discussion_r514505595



##########
File path: sdks/python/apache_beam/runners/dataflow/internal/names.py
##########
@@ -27,7 +27,13 @@
 # Standard file names used for staging files.
 from builtins import object
 
-DATAFLOW_SDK_TARBALL_FILE = 'dataflow_python_sdk.tar'
+# pylint: disable=unused-import
+from apache_beam.runners.internal.names import DATAFLOW_SDK_TARBALL_FILE
+from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE

Review comment:
       Please add the  comment `# Referenced by Dataflow legacy worker. ` before `from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE`
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on a change in pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13142:
URL: https://github.com/apache/beam/pull/13142#discussion_r514181412



##########
File path: sdks/python/apache_beam/runners/dataflow/internal/names.py
##########
@@ -27,7 +27,13 @@
 # Standard file names used for staging files.
 from builtins import object
 
-DATAFLOW_SDK_TARBALL_FILE = 'dataflow_python_sdk.tar'
+# pylint: disable=unused-import
+from apache_beam.runners.internal.names import DATAFLOW_SDK_TARBALL_FILE
+from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE

Review comment:
       @tvalentyn  Thank you for your help. Is it ok now? 

##########
File path: sdks/python/apache_beam/runners/dataflow/internal/names.py
##########
@@ -27,7 +27,13 @@
 # Standard file names used for staging files.
 from builtins import object
 
-DATAFLOW_SDK_TARBALL_FILE = 'dataflow_python_sdk.tar'
+# pylint: disable=unused-import
+from apache_beam.runners.internal.names import DATAFLOW_SDK_TARBALL_FILE
+from apache_beam.runners.internal.names import PICKLED_MAIN_SESSION_FILE

Review comment:
       @tvalentyn  Thank you for your help. Are recent changes ok?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kkucharc commented on pull request #13142: #5939 deduplicate constants

Posted by GitBox <gi...@apache.org>.
kkucharc commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-713397389


   Run Python_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kamilwu commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
kamilwu commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-720378701






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-720341070


   Run Python_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-719436063


   Run Python 3.7 PostCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn edited a comment on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn edited a comment on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-718966035






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kkucharc commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
kkucharc commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-716380560


   Run Python PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen edited a comment on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen edited a comment on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-717849131


   @tvalentyn could you please take a look after recent changes?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tvalentyn commented on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tvalentyn commented on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-718966035


   LGTM if tests pass. PTAL at docs failure and rerun if this is a flake. @kamilwu can help merge.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen edited a comment on pull request #13142: [BEAM-5939] - Deduplicate constants

Posted by GitBox <gi...@apache.org>.
tszerszen edited a comment on pull request #13142:
URL: https://github.com/apache/beam/pull/13142#issuecomment-717849131


   @tvalentyn thank you for your review, could you please take a look after recent changes?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org