Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/10/13 00:52:47 UTC

[GitHub] [beam] davidyan74 commented on a change in pull request #13080: [BEAM-11056] Fix warning message and rename old APIs

davidyan74 commented on a change in pull request #13080:
URL: https://github.com/apache/beam/pull/13080#discussion_r503607479



##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -19,21 +19,21 @@
 
 For internal use only; no backwards-compatibility guarantees.
 
-A background caching job is a job that captures events for all capturable
+A background caching job is a job that records events for all recordable

Review comment:
       "Background caching job" -> "Background source recording job". Please check all occurrences.

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -56,7 +56,7 @@ class BackgroundCachingJob(object):
   """A simple abstraction that controls necessary components of a timed and
   space limited background caching job.
 
-  A background caching job successfully completes source data capture in 2
+  A background caching job successfully completes source data record in 2

Review comment:
       recording

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -165,9 +165,9 @@ def is_background_caching_job_needed(user_pipeline):
   # If this is True, we can invalidate a previous done/running job if there is
   # one.
   cache_changed = is_source_to_cache_changed(user_pipeline)
-  # When capture replay is disabled, cache is always needed for capturable
+  # When record replay is disabled, cache is always needed for recordable

Review comment:
       recording

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -165,9 +165,9 @@ def is_background_caching_job_needed(user_pipeline):
   # If this is True, we can invalidate a previous done/running job if there is
   # one.
   cache_changed = is_source_to_cache_changed(user_pipeline)
-  # When capture replay is disabled, cache is always needed for capturable
+  # When record replay is disabled, cache is always needed for recordable
   # sources (if any).
-  if need_cache and not ie.current_env().options.enable_capture_replay:
+  if need_cache and not ie.current_env().options.enable_record_replay:

Review comment:
       enable_recording_replay

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -19,21 +19,21 @@
 
 For internal use only; no backwards-compatibility guarantees.
 
-A background caching job is a job that captures events for all capturable
+A background caching job is a job that records events for all recordable
 sources of a given pipeline. With Interactive Beam, one such job is started when
 a pipeline run happens (which produces a main job in contrast to the background
 caching job) and meets the following conditions:
 
-  #. The pipeline contains capturable sources, configured through
-     interactive_beam.options.capturable_sources.
+  #. The pipeline contains recordable sources, configured through
+     interactive_beam.options.recordable_sources.
   #. No such background job is running.
   #. No such background job has completed successfully and the cached events are
-     still valid (invalidated when capturable sources change in the pipeline).
+     still valid (invalidated when recordable sources change in the pipeline).
 
 Once started, the background caching job runs asynchronously until it hits some
-capture limit configured in interactive_beam.options. Meanwhile, the main job
+record limit configured in interactive_beam.options. Meanwhile, the main job

Review comment:
       recording

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -301,13 +301,13 @@ def sizeof_fmt(num, suffix='B'):
             'In order to have a deterministic replay, a segment of data will '
             'be recorded from all sources for %s seconds or until a total of '
             '%s have been written to disk.',
-            options.capture_duration.total_seconds(),
-            sizeof_fmt(options.capture_size_limit))
+            options.record_duration.total_seconds(),

Review comment:
       recording_duration
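For context on the hunk above: `sizeof_fmt` formats a byte count for the "%s have been written to disk" placeholder in the log message. A minimal sketch of such a human-readable formatter follows — the classic pattern, not necessarily the exact body of Beam's helper:

```python
def sizeof_fmt(num, suffix='B'):
    """Format a byte count as a human-readable string, e.g. 1536 -> '1.5KiB'."""
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return '%3.1f%s%s' % (num, unit, suffix)
        num /= 1024.0
    return '%.1f%s%s' % (num, 'Yi', suffix)
```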

##########
File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
##########
@@ -52,98 +53,116 @@
 class Options(interactive_options.InteractiveOptions):
   """Options that guide how Interactive Beam works."""
   @property
-  def enable_capture_replay(self):
-    """Whether replayable source data capture should be replayed for multiple
-    PCollection evaluations and pipeline runs as long as the data captured is
+  def enable_record_replay(self):

Review comment:
       enable_recording_replay. Basically, if "capture" is used as a noun, change it to "recording" instead of "record", since "record" might have a notion of an individual record. Please check all occurrences.
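Applying the reviewer's rule ("capture" as a noun becomes "recording", not "record") across the properties discussed in this review would give names like the following. This is a sketch of the suggested naming only; defaults are placeholders and the merged `Options` API may differ:

```python
class Options:
    """Sketch of the renamed Interactive Beam options per this review."""
    def __init__(self):
        self._enable_recording_replay = True  # was: enable_capture_replay
        self._recording_duration = 60         # was: capture_duration (seconds; placeholder)
        self._recording_size_limit = 1 << 30  # was: capture_size_limit (bytes; placeholder)

    @property
    def enable_recording_replay(self):
        # "recording" used as a noun, per the review comment.
        return self._enable_recording_replay

    @property
    def recording_duration(self):
        return self._recording_duration

    @property
    def recording_size_limit(self):
        return self._recording_size_limit
```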

##########
File path: sdks/python/apache_beam/runners/interactive/background_caching_job.py
##########
@@ -301,13 +301,13 @@ def sizeof_fmt(num, suffix='B'):
             'In order to have a deterministic replay, a segment of data will '
             'be recorded from all sources for %s seconds or until a total of '
             '%s have been written to disk.',
-            options.capture_duration.total_seconds(),
-            sizeof_fmt(options.capture_size_limit))
+            options.record_duration.total_seconds(),
+            sizeof_fmt(options.record_size_limit))

Review comment:
       recording_size_limit




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org