You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/04/01 23:50:28 UTC

[GitHub] [beam] TheNeuralBit opened a new pull request #11286: [BEAM-4374] Short IDs for the Python SDK

TheNeuralBit opened a new pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286
 
 
   
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] TheNeuralBit commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#discussion_r402466621
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -76,6 +79,50 @@
 DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S = 60
 
 
+class ShortIdCache(object):
+  """ Cache for MonitoringInfo "short ids"
+  """
+  def __init__(self):
+    self._lock = threading.Lock()
+    self._lastShortId = 0
+    self._infoKeyToShortId = {}  # type: Dict[FrozenSet, str]
+    self._shortIdToInfo = {}  # type: Dict[str, metrics_pb2.MonitoringInfo]
+
+  def getShortId(self, monitoring_info):
+    """ Returns the assigned shortId for a given MonitoringInfo, assigns one if
+    not assigned already.
+    """
+    # type: (metrics_pb2.MonitoringInfo) -> str
+    with self._lock:
+      key = monitoring_infos.to_key(monitoring_info)
 
 Review comment:
   This function exists already: https://github.com/apache/beam/blob/master/sdks/python/apache_beam/metrics/monitoring_infos.py#L329-L338

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#discussion_r402515324
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -76,6 +79,50 @@
 DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S = 60
 
 
+class ShortIdCache(object):
+  """ Cache for MonitoringInfo "short ids"
+  """
+  def __init__(self):
+    self._lock = threading.Lock()
+    self._lastShortId = 0
+    self._infoKeyToShortId = {}  # type: Dict[FrozenSet, str]
+    self._shortIdToInfo = {}  # type: Dict[str, metrics_pb2.MonitoringInfo]
+
+  def getShortId(self, monitoring_info):
+    """ Returns the assigned shortId for a given MonitoringInfo, assigns one if
+    not assigned already.
+    """
+    # type: (metrics_pb2.MonitoringInfo) -> str
+    with self._lock:
+      key = monitoring_infos.to_key(monitoring_info)
 
 Review comment:
   yup

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on issue #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
lukecwik commented on issue #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#issuecomment-608082411
 
 
   Linter complains:
   ```
   10:03:16 ************* Module apache_beam.runners.portability.fn_api_runner.worker_handlers
   10:03:16 apache_beam/runners/portability/fn_api_runner/worker_handlers.py:62:0: E0001: Cannot import 'apache_beam.runners.worker.sdk_worker' due to syntax error 'misplaced type annotation (<unknown>, line 95)' (syntax-error)
   10:03:16 ************* Module apache_beam.runners.worker.sdk_worker
   10:03:16 apache_beam/runners/worker/sdk_worker.py:95:47: E0001: misplaced type annotation (<unknown>, line 95) (syntax-error)
   10:03:16 ************* Module apache_beam.runners.worker.sdk_worker_main
   10:03:16 apache_beam/runners/worker/sdk_worker_main.py:43:0: E0001: Cannot import 'apache_beam.runners.worker.sdk_worker' due to syntax error 'misplaced type annotation (<unknown>, line 95)' (syntax-error)
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#discussion_r402462053
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -76,6 +79,50 @@
 DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S = 60
 
 
+class ShortIdCache(object):
+  """ Cache for MonitoringInfo "short ids"
+  """
+  def __init__(self):
+    self._lock = threading.Lock()
+    self._lastShortId = 0
+    self._infoKeyToShortId = {}  # type: Dict[FrozenSet, str]
+    self._shortIdToInfo = {}  # type: Dict[str, metrics_pb2.MonitoringInfo]
+
+  def getShortId(self, monitoring_info):
+    """ Returns the assigned shortId for a given MonitoringInfo, assigns one if
+    not assigned already.
+    """
+    # type: (metrics_pb2.MonitoringInfo) -> str
+    with self._lock:
+      key = monitoring_infos.to_key(monitoring_info)
 
 Review comment:
   I think you missed adding your monitoring_infos changes.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#discussion_r402463213
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -76,6 +79,50 @@
 DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S = 60
 
 
+class ShortIdCache(object):
+  """ Cache for MonitoringInfo "short ids"
+  """
+  def __init__(self):
+    self._lock = threading.Lock()
+    self._lastShortId = 0
+    self._infoKeyToShortId = {}  # type: Dict[FrozenSet, str]
+    self._shortIdToInfo = {}  # type: Dict[str, metrics_pb2.MonitoringInfo]
+
+  def getShortId(self, monitoring_info):
+    """ Returns the assigned shortId for a given MonitoringInfo, assigns one if
+    not assigned already.
+    """
+    # type: (metrics_pb2.MonitoringInfo) -> str
+    with self._lock:
+      key = monitoring_infos.to_key(monitoring_info)
 
 Review comment:
   ```suggestion
       key = monitoring_infos.to_key(monitoring_info)
       with self._lock:
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] TheNeuralBit commented on issue #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on issue #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#issuecomment-607544923
 
 
   R: @lukecwik does this look correct?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
lukecwik commented on a change in pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286#discussion_r402515324
 
 

 ##########
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##########
 @@ -76,6 +79,50 @@
 DEFAULT_BUNDLE_PROCESSOR_CACHE_SHUTDOWN_THRESHOLD_S = 60
 
 
+class ShortIdCache(object):
+  """ Cache for MonitoringInfo "short ids"
+  """
+  def __init__(self):
+    self._lock = threading.Lock()
+    self._lastShortId = 0
+    self._infoKeyToShortId = {}  # type: Dict[FrozenSet, str]
+    self._shortIdToInfo = {}  # type: Dict[str, metrics_pb2.MonitoringInfo]
+
+  def getShortId(self, monitoring_info):
+    """ Returns the assigned shortId for a given MonitoringInfo, assigns one if
+    not assigned already.
+    """
+    # type: (metrics_pb2.MonitoringInfo) -> str
+    with self._lock:
+      key = monitoring_infos.to_key(monitoring_info)
 
 Review comment:
   yup and I was editing that code a week ago

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] TheNeuralBit merged pull request #11286: [BEAM-4374] Short IDs for the Python SDK

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged pull request #11286: [BEAM-4374] Short IDs for the Python SDK
URL: https://github.com/apache/beam/pull/11286
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services