Posted to commits@beam.apache.org by GitBox <gi...@apache.org> on 2020/03/03 00:05:05 UTC

[GitHub] [beam] KevinGG opened a new pull request #11020: [BEAM-7926] Update Data Visualization

KevinGG opened a new pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020
 
 
   1. Added `include_window_info` and `visualize_data` as `**kwargs` passed
      into `show`.
   2. Updated the JavaScript to make the data visualization smooth and
      resilient to DOM changes. The datatable is now loaded dynamically
      without flickering or changing the user's page/search state; the
      scripts also work when the browser is refreshed.
   3. Resolved the jQuery+DataTables loading issue by forcing chained
      loading. Any customized JavaScript relying on jQuery should only use
      `window.jquery341`. Always check for `window.jquery341` first; if it
      is not available, run the script in the last onload of the jQuery
      loading chain.
   4. All HTML imports are chained at onload of webcomponents (if HTML import
      is not supported) or plainly imported (if HTML import is supported) in
      a single place in document.head. This makes HTML import resilient to
      DOM changes caused by normal notebook usage.
   5. Updated some logging statements.
   6. Added a `show_graph` API to render the DAG of a pipeline. `pipeline.run`
      no longer renders the DAG.
   
   Change-Id: Id2ca548860fb2d30e1557a35e7b14d2e61b5f1a4
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [beam] KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-594117405
 
 
   As discussed offline, switching @pabloem to
   R: @aaltay 
   Thanks!


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387259752
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed with access to a jQuery that has the datatable
+# plugin configured, which makes any `customized_script` resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
+          var jqueryScript = document.createElement('script');
+          jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
+          jqueryScript.type = 'text/javascript';
+          jqueryScript.onload = function() {{
+            var datatableScript = document.createElement('script');
+            datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';
+            datatableScript.type = 'text/javascript';
+            datatableScript.onload = function() {{
+              window.jquery341 = jQuery.noConflict(true);
+              window.jquery341(document).ready(function($){{
+                {customized_script}
+              }});
+            }}
+            document.head.appendChild(datatableScript);
+          }};
+          document.head.appendChild(jqueryScript);
+        }} else {{
+          window.jquery341(document).ready(function($){{
+            {customized_script}
+          }});
+        }}"""
+
+_HTML_IMPORT_TEMPLATE = """
 
 Review comment:
   What does this do?


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595934264
 
 
   retest this please


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387857258
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   I think the logging is only useful for debugging, because end users of notebooks wouldn't open the browser's developer tools.
   I'll remove the logging but keep the catch clause.


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387334402
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -211,20 +280,27 @@ def show(*pcolls):
         watched_pcollections.add(val)
   for pcoll in pcolls:
     if pcoll not in watched_pcollections:
-      watch({re.sub(r'[\[\]\(\)]', '_', str(pcoll)): pcoll})
+      watch({'anonymous_pcollection_{}'.format(id(pcoll)): pcoll})
 
+  import warnings
 
 Review comment:
   Done.


[GitHub] [beam] aaltay merged pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay merged pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020
 
 
   


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387422917
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed with access to a jQuery that has the datatable
+# plugin configured, which makes any `customized_script` resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
+          var jqueryScript = document.createElement('script');
+          jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
+          jqueryScript.type = 'text/javascript';
+          jqueryScript.onload = function() {{
+            var datatableScript = document.createElement('script');
+            datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';
+            datatableScript.type = 'text/javascript';
+            datatableScript.onload = function() {{
+              window.jquery341 = jQuery.noConflict(true);
+              window.jquery341(document).ready(function($){{
+                {customized_script}
+              }});
+            }}
+            document.head.appendChild(datatableScript);
+          }};
+          document.head.appendChild(jqueryScript);
+        }} else {{
+          window.jquery341(document).ready(function($){{
+            {customized_script}
+          }});
+        }}"""
+
+_HTML_IMPORT_TEMPLATE = """
 
 Review comment:
   Added comments.
   
   https://developer.mozilla.org/en-US/docs/Web/Web_Components/HTML_Imports explains it.
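
For the jQuery template quoted in the hunk above, the doubled braces are `str.format` escapes: `{{` and `}}` come out as literal JavaScript braces, and only `{customized_script}` is substituted. A minimal sketch, assuming a shortened, hypothetical template (`_SHORT_TEMPLATE`) and a made-up element id (`#example`), neither of which is in the PR:

```python
# Hypothetical, shortened stand-in for the template in the diff. The chained
# loading of jQuery and DataTables is elided on purpose.
_SHORT_TEMPLATE = """
if (typeof window.jquery341 == 'undefined') {{
  // ... chained loading of jQuery and DataTables elided ...
}} else {{
  window.jquery341(document).ready(function($){{
    {customized_script}
  }});
}}"""


def render(customized_script):
    """Substitutes the script; `{{` and `}}` become single JS braces."""
    return _SHORT_TEMPLATE.format(customized_script=customized_script)


# '#example' is an illustrative element id, not one used by the PR.
rendered = render("$('#example').DataTable();")
print(rendered)
```

Because the substituted script runs inside `function($){ ... }`, it can refer to jQuery as `$` without touching any other jQuery instance loaded on the page.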
   
   
   


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595374391
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596696586
 
 
   Run Python PreCommit


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387357575
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
+              }}"""
+_OVERVIEW_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-overview id="{display_id}"></facets-overview>
             <script>
               document.querySelector("#{display_id}").protoInput = "{protostr}";
 
 Review comment:
   Maybe I do not understand: is the script in the script tag executed on the server side?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387330714
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/options/interactive_options.py
 ##########
 @@ -24,6 +24,8 @@
 
 from __future__ import absolute_import
 
+from dateutil import tz
 
 Review comment:
   The user can use both. Internally, we use `dateutil.tz` to get the local timezone info.
   Externally, the user can use `pytz.timezone` or `dateutil.tz.gettz`, because `to_tz` just needs to be an instance of a `datetime.tzinfo` subclass.
   
   Added comments to the exposed API.
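
A minimal sketch of that contract, using only the standard library. `format_event_time` is a hypothetical helper, not the function in this PR, and the fixed-offset zone stands in for whatever `pytz.timezone(...)` or `dateutil.tz.gettz(...)` would return; the only assumption is that `to_tz` is a `datetime.tzinfo` instance:

```python
import datetime


def format_event_time(event_time_us, to_tz, fmt='%Y-%m-%d %H:%M:%S%z'):
    """Formats a timestamp in microseconds since the epoch in `to_tz`."""
    if not isinstance(to_tz, datetime.tzinfo):
        raise TypeError('to_tz must be a datetime.tzinfo instance')
    # Build an aware UTC datetime first, then convert to the target zone.
    utc_time = datetime.datetime.fromtimestamp(
        event_time_us / 1000000, tz=datetime.timezone.utc)
    return utc_time.astimezone(to_tz).strftime(fmt)


# A fixed UTC+8 offset used as a stand-in time zone for illustration.
utc_plus_8 = datetime.timezone(datetime.timedelta(hours=8))
formatted = format_event_time(0, utc_plus_8)
print(formatted)  # 1970-01-01 08:00:00+0800
```

Passing a zone name as a plain string would raise `TypeError` here, which is the practical consequence of requiring a `tzinfo` instance.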


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595933910
 
 
   retest this please


[GitHub] [beam] KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-593701514
 
 
   R: @rohdesamuel 
   R: @pabloem 
   
   PTAL, thanks!


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373616
 
 
   retest this please


[GitHub] [beam] aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595374391
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595429224
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-594915596
 
 
   retest this please


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387258903
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -211,20 +280,27 @@ def show(*pcolls):
         watched_pcollections.add(val)
   for pcoll in pcolls:
     if pcoll not in watched_pcollections:
-      watch({re.sub(r'[\[\]\(\)]', '_', str(pcoll)): pcoll})
+      watch({'anonymous_pcollection_{}'.format(id(pcoll)): pcoll})
 
+  import warnings
+  warnings.filterwarnings('ignore', category=DeprecationWarning)
 
 Review comment:
   This will filter out all deprecation warnings, which is not a good outcome if we would like users to see them.
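
One way to narrow the suppression, sketched under the assumption that only a specific call needs to be quieted: `warnings.catch_warnings()` restores the previous filters on exit, so the `ignore` filter does not leak into the rest of the user's session. `noisy_helper` and `call_quietly` are hypothetical names, not part of this PR:

```python
import warnings


def noisy_helper():
    """Hypothetical function that emits a DeprecationWarning."""
    warnings.warn('this API is deprecated', DeprecationWarning)
    return 42


def call_quietly(fn):
    """Calls `fn` with DeprecationWarning ignored only for this call."""
    with warnings.catch_warnings():
        warnings.simplefilter('ignore', category=DeprecationWarning)
        return fn()


# Record warnings to show that the filter is scoped to call_quietly().
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    quiet_result = call_quietly(noisy_helper)
    quiet_count = len(caught)
    loud_result = noisy_helper()
    loud_count = len(caught)
```

A module-level `warnings.filterwarnings('ignore', category=DeprecationWarning)`, by contrast, stays in effect for everything that runs afterwards in the same process.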


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387256604
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -291,3 +402,81 @@ def _to_dataframe(self):
 
   def _is_one_dimension_type(self, val):
     return type(val) in _one_dimension_types
+
+
+def format_window_info_in_dataframe(data):
+  if 'event_time' in data.columns:
+    data['event_time'] = data['event_time'].apply(event_time_formatter)
+  if 'windows' in data.columns:
+    data['windows'] = data['windows'].apply(windows_formatter)
+  if 'pane_info' in data.columns:
+    data['pane_info'] = data['pane_info'].apply(pane_info_formatter)
+
+
+def event_time_formatter(event_time_us):
+  options = ie.current_env().options
+  to_tz = options.display_timezone
+  try:
+    return (
+        datetime.datetime.utcfromtimestamp(event_time_us / 1000000).replace(
+            tzinfo=tz.tzutc()).astimezone(to_tz).strftime(
+                options.display_timestamp_format))
+  except ValueError:
+    if event_time_us < 0:
+      return 'Min Timestamp'
+    return 'Max Timestamp'
+
+
+def windows_formatter(windows):
+  result = []
+  for w in windows:
+    if isinstance(w, GlobalWindow):
+      result.append(str(w))
+    elif isinstance(w, IntervalWindow):
+      # First get the duration in terms of hours, minutes, seconds, and
+      # micros.
+      duration = w.end.micros - w.start.micros
+      duration_secs = duration // 1000000
+      hours, remainder = divmod(duration_secs, 3600)
+      minutes, seconds = divmod(remainder, 60)
+      micros = (duration - duration_secs * 1000000) % 1000000
+
+      # Construct the duration string. Try and write the string in such a
 
 Review comment:
   Is it possible to use strftime?
   
   https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes
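
For context, `strftime` formats points in time (`date`, `time`, `datetime`); `datetime.timedelta` has no `strftime` method, so a window duration still needs manual formatting. A minimal sketch using `divmod`, where the function name and the output layout are illustrative assumptions rather than the PR's code:

```python
def format_duration(duration_micros):
  # Split a non-negative duration in microseconds into h/m/s/us parts.
  total_secs, micros = divmod(duration_micros, 1000000)
  hours, remainder = divmod(total_secs, 3600)
  minutes, seconds = divmod(remainder, 60)
  # Emit only the non-zero components, largest unit first.
  parts = [
      '{}{}'.format(value, unit)
      for value, unit in ((hours, 'h'), (minutes, 'm'), (seconds, 's'),
                          (micros, 'us')) if value
  ]
  return ' '.join(parts) or '0s'
```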


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373700
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595500738
 
 
   Run PythonLint PreCommit


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595376165
 
 
   retest this please


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387281998
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -238,31 +322,57 @@ def _display_dive(self, data, update=None):
       display(HTML(html))
 
   def _display_overview(self, data, update=None):
+    if (not data.empty and self._include_window_info and
+        all(column in data.columns
+            for column in ('event_time', 'windows', 'pane_info'))):
+      data = data.drop(['event_time', 'windows', 'pane_info'], axis=1)
+
     gfsg = GenericFeatureStatisticsGenerator()
     proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': data}])
     protostr = base64.b64encode(proto.SerializeToString()).decode('utf-8')
     if update:
       script = _OVERVIEW_SCRIPT_TEMPLATE.format(
-          display_id=update, protostr=protostr)
+          display_id=update._overview_display_id, protostr=protostr)
       display_javascript(Javascript(script))
     else:
       html = _OVERVIEW_HTML_TEMPLATE.format(
           display_id=self._overview_display_id, protostr=protostr)
       display(HTML(html))
 
   def _display_dataframe(self, data, update=None):
-    if update:
-      table_id = 'table_{}'.format(update)
-      html = _DATAFRAME_PAGINATION_TEMPLATE.format(
-          dataframe_html=data.to_html(notebook=True, table_id=table_id),
-          table_id=table_id)
-      update_display(HTML(html), display_id=update)
+    table_id = 'table_{}'.format(
+        update._df_display_id if update else self._df_display_id)
+    columns = [{
+        'title': ''
+    }] + [{
+        'title': str(column)
+    } for column in data.columns]
+    format_window_info_in_dataframe(data)
+    rows = data.applymap(lambda x: str(x)).to_dict('split')['data']
 
 Review comment:
   First, we get all the stringified `data` from the `split` [orient](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_dict.html) of `dataframe.to_dict`.
   `rows` is now a `list` of `row`s of values.
   Each `row` looks like `[column_1_val, column_2_val, ...]`.
   
   Then we add a datatable column index to the values in each `row`.
   The index starts from 1 because we are also going to add a column `0` later, so we have `{k+1: v}`.
   Each `row` now becomes `{1: column_1_val, 2: column_2_val, ...}`.
   
   Then we add column `0` of the datatable (`row[0] = k`), whose values are the integer index of each row (this will be the default ordering column, just as in the original dataframe).
   Each `row` now becomes `{1: column_1_val, 2: column_2_val, ..., 0: int_index_in_dataframe}`.
   
   Finally, the list of these `row`s is supplied as a string to the JavaScript that loads the data into the table.
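
The steps above can be sketched as follows. The sketch assumes `rows` is already the list of lists taken from the `split` orient, so it needs no pandas; the function name is hypothetical:

```python
import json


def to_datatable_rows(rows):
  indexed = []
  for row_idx, row in enumerate(rows):
    # Column values are keyed from 1 because key 0 is reserved for the index.
    record = {col_idx + 1: val for col_idx, val in enumerate(row)}
    # Column 0 carries the integer position of the row in the dataframe.
    record[0] = row_idx
    indexed.append(record)
  return indexed


rows = [['a', '1'], ['b', '2']]
indexed_rows = to_datatable_rows(rows)
# Serialized form that would be embedded in the JavaScript.
data_json = json.dumps(indexed_rows)
```

Note that `json.dumps` turns the integer keys into strings, which is what a JavaScript object literal expects anyway.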
   
   
   


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387252940
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   Could we wait for document onLoad?
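
For reference, deferring the assignment until the page has loaded could look like the template below. This is only a sketch: the template name is hypothetical, and whether a `load` listener covers elements that a notebook renders after the page has loaded is not established in this thread. As in the PR's templates, literal JavaScript braces are doubled so that `str.format` leaves them alone.

```python
# Hypothetical template; not part of the PR.
_DEFERRED_OVERVIEW_SCRIPT_TEMPLATE = """
  (function() {{
    function update() {{
      var element = document.querySelector("#{display_id}");
      if (element) {{
        element.protoInput = "{protostr}";
      }}
    }}
    if (document.readyState === "complete") {{
      update();
    }} else {{
      window.addEventListener("load", update);
    }}
  }})();"""

script = _DEFERRED_OVERVIEW_SCRIPT_TEMPLATE.format(
    display_id='facets_overview_demo', protostr='AAAA')
```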


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-594845063
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-594988636
 
 
   retest this please


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387359585
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -238,31 +322,57 @@ def _display_dive(self, data, update=None):
       display(HTML(html))
 
   def _display_overview(self, data, update=None):
+    if (not data.empty and self._include_window_info and
+        all(column in data.columns
+            for column in ('event_time', 'windows', 'pane_info'))):
+      data = data.drop(['event_time', 'windows', 'pane_info'], axis=1)
+
     gfsg = GenericFeatureStatisticsGenerator()
     proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': data}])
     protostr = base64.b64encode(proto.SerializeToString()).decode('utf-8')
     if update:
       script = _OVERVIEW_SCRIPT_TEMPLATE.format(
-          display_id=update, protostr=protostr)
+          display_id=update._overview_display_id, protostr=protostr)
       display_javascript(Javascript(script))
     else:
       html = _OVERVIEW_HTML_TEMPLATE.format(
           display_id=self._overview_display_id, protostr=protostr)
       display(HTML(html))
 
   def _display_dataframe(self, data, update=None):
-    if update:
-      table_id = 'table_{}'.format(update)
-      html = _DATAFRAME_PAGINATION_TEMPLATE.format(
-          dataframe_html=data.to_html(notebook=True, table_id=table_id),
-          table_id=table_id)
-      update_display(HTML(html), display_id=update)
+    table_id = 'table_{}'.format(
+        update._df_display_id if update else self._df_display_id)
+    columns = [{
+        'title': ''
+    }] + [{
+        'title': str(column)
+    } for column in data.columns]
+    format_window_info_in_dataframe(data)
+    rows = data.applymap(lambda x: str(x)).to_dict('split')['data']
 
 Review comment:
   Could you explain that in a comment there?


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387257783
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -86,10 +85,57 @@ def capture_duration(self, value):
       # The next PCollection evaluation will capture fresh data from sources,
       # and the data captured will be replayed until another eviction.
     """
+    assert value.total_seconds() > 0, 'Duration must be a positive value.'
     self.capture_control._capture_duration = value
 
   # TODO(BEAM-8335): add capture_size options when they are supported.
 
+  @property
+  def display_timestamp_format(self):
+    """The format in which timestamps are displayed.
+
+    Default is '%Y-%m-%d %H:%M:%S.%f%z', e.g. 2020-02-01 15:05:06.000015-08:00.
 
 Review comment:
   Where is the default defined?
   
   If it is defined somewhere else, let's not repeat it in comments; otherwise both places will need to be kept in sync going forward.
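
One way to avoid the duplication is to keep the default in a single module-level constant and have the docstring point at it. A sketch with hypothetical names (`_DEFAULT_TIMESTAMP_FORMAT`, `DisplayOptions`), not the PR's actual class:

```python
# Hypothetical single source of truth for the default format.
_DEFAULT_TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S.%f%z'


class DisplayOptions(object):
  def __init__(self):
    self._display_timestamp_format = _DEFAULT_TIMESTAMP_FORMAT

  @property
  def display_timestamp_format(self):
    """The format in which timestamps are displayed.

    Defaults to _DEFAULT_TIMESTAMP_FORMAT.
    """
    return self._display_timestamp_format

  @display_timestamp_format.setter
  def display_timestamp_format(self, value):
    self._display_timestamp_format = value
```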


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387259303
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
 
 Review comment:
   What is `jquery341`?


[GitHub] [beam] pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596749018
 
 
   Run Python PreCommit


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387359374
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   1. If this try fails, what happens?
   2. Is the console.log statement valuable in this case?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387420719
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
 
 Review comment:
   I've changed it to `interactive_beam_jquery`.


[GitHub] [beam] aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373700
 
 
   retest this please


[GitHub] [beam] aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373616
 
 
   retest this please


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387276995
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -215,20 +287,32 @@ def display_facets(self, updating_pv=None):
     # Ensures that dive, overview and table render the same data because the
     # materialized PCollection data might being updated continuously.
     data = self._to_dataframe()
+    # String-ify the dictionaries for display because elements of type dict
+    # cannot be ordered.
+    data = data.applymap(lambda x: str(x) if isinstance(x, dict) else x)
     if updating_pv:
-      self._display_dive(data, updating_pv._dive_display_id)
-      self._display_overview(data, updating_pv._overview_display_id)
-      self._display_dataframe(data, updating_pv._df_display_id)
+      # Only updates when data is not empty. Otherwise, consider it a bad
+      # iteration and noop since there is nothing to be updated.
+      if data.empty:
+        _LOGGER.debug('Skip a visualization update due to empty data.')
+      else:
+        self._display_dataframe(data.copy(deep=True), updating_pv)
+        if self._display_facets:
+          self._display_dive(data.copy(deep=True), updating_pv)
 
 Review comment:
   Because we make different changes (such as formatting and dropping some columns) to the dataframe before displaying it in these 3 widgets.
   For example, window info needs to be formatted for facets-dive and the datatable, while it gets dropped in facets-overview.
   
   If they shared the same instance, the 3 widgets would alter the same dataframe object in arbitrary order and produce arbitrarily mixed output or run into all kinds of mapping errors.
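
The effect can be illustrated without pandas. In this sketch a list of dicts stands in for the dataframe and the two functions stand in for widget-specific preparation; all names are hypothetical:

```python
import copy


def prepare_for_table(records):
  # Formats window info in place, as a table widget might.
  for record in records:
    record['windows'] = 'formatted:{}'.format(record['windows'])
  return records


def prepare_for_overview(records):
  # Drops window info in place, as an overview widget might.
  for record in records:
    record.pop('windows', None)
  return records


data = [{'value': 1, 'windows': 'w0'}]

# Each consumer receives its own deep copy, so neither sees the other's edits
# and the original data stays untouched.
table_data = prepare_for_table(copy.deepcopy(data))
overview_data = prepare_for_overview(copy.deepcopy(data))
```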
   


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387252844
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
+              }}"""
+_OVERVIEW_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-overview id="{display_id}"></facets-overview>
             <script>
               document.querySelector("#{display_id}").protoInput = "{protostr}";
 
 Review comment:
   Why does this one not have the catch statement?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387269492
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -181,6 +247,12 @@ def __init__(self, pcoll):
     self._overview_display_id = 'facets_overview_{}_{}'.format(
         self._cache_key, id(self))
     self._df_display_id = 'df_{}_{}'.format(self._cache_key, id(self))
+    # Whether the visualization should include window info.
 
 Review comment:
   Got it, removing these comments.


[GitHub] [beam] KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595926189
 
 
   Resolved merge conflict and force pushed.


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387258536
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -211,20 +280,27 @@ def show(*pcolls):
         watched_pcollections.add(val)
   for pcoll in pcolls:
     if pcoll not in watched_pcollections:
-      watch({re.sub(r'[\[\]\(\)]', '_', str(pcoll)): pcoll})
+      watch({'anonymous_pcollection_{}'.format(id(pcoll)): pcoll})
 
+  import warnings
 
 Review comment:
   Should this import go at the top of the file?


[GitHub] [beam] aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay removed a comment on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373909
 
 
   retest this please


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387260380
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/options/interactive_options.py
 ##########
 @@ -24,6 +24,8 @@
 
 from __future__ import absolute_import
 
+from dateutil import tz
 
 Review comment:
   Are we using timezone from pytz or tz from dateutil?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387415941
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   If this fails, it means the initially displayed widgets have been cleared from the DOM or the initial display hasn't completed yet (possibly because of a race condition). A no-op should be the right way to handle it, because it means either the user has cleared the output or the script has no target to execute on.
   
   The error is supposed to be logged in the console. However, if not caught, it also gets displayed in notebook output areas. By doing this, we kept the log and also avoid the output area pollution.


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387342551
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -211,20 +280,27 @@ def show(*pcolls):
         watched_pcollections.add(val)
   for pcoll in pcolls:
     if pcoll not in watched_pcollections:
-      watch({re.sub(r'[\[\]\(\)]', '_', str(pcoll)): pcoll})
+      watch({'anonymous_pcollection_{}'.format(id(pcoll)): pcoll})
 
+  import warnings
+  warnings.filterwarnings('ignore', category=DeprecationWarning)
 
 Review comment:
   Changed the filter to match a specific message. It now takes effect only when `is_in_ipython` is true, the first time the user invokes `show`.
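
   A minimal sketch of that kind of narrow, install-once filter, using only the standard `warnings` module. The message pattern and the `install_filter_once` helper are hypothetical placeholders, since the actual message text matched in the PR is not shown in this thread.

```python
import warnings

# Hypothetical message pattern; the real text matched in the PR is not shown
# in this thread.
_IGNORED_MESSAGE = r'.*example deprecated call.*'
_filter_installed = False


def install_filter_once(in_ipython):
  """Installs a narrow DeprecationWarning filter on the first eligible call.

  Returns True only when the filter was installed by this call.
  """
  global _filter_installed
  if _filter_installed or not in_ipython:
    return False
  # Matching on `message` keeps unrelated DeprecationWarnings visible.
  warnings.filterwarnings(
      'ignore', message=_IGNORED_MESSAGE, category=DeprecationWarning)
  _filter_installed = True
  return True
```

   Compared with a blanket `category=DeprecationWarning` filter, this leaves every other deprecation warning visible to the notebook user.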


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387268984
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
+              }}"""
+_OVERVIEW_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-overview id="{display_id}"></facets-overview>
             <script>
               document.querySelector("#{display_id}").protoInput = "{protostr}";
 
 Review comment:
   This is kernel-side (server-side) rendering: the HTML is not in the DOM yet, so the element is guaranteed to exist by the time the HTML gets rendered in the browser.


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387828797
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   > The error is supposed to be logged in the console. However, if not caught, it also gets displayed in notebook output areas. By doing this, we kept the log and also avoid the output area pollution.
   This makes sense. Do you even need it in the console log? After catching it, we could choose not to log it. Is it useful to log?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387419572
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -238,31 +322,57 @@ def _display_dive(self, data, update=None):
       display(HTML(html))
 
   def _display_overview(self, data, update=None):
+    if (not data.empty and self._include_window_info and
+        all(column in data.columns
+            for column in ('event_time', 'windows', 'pane_info'))):
+      data = data.drop(['event_time', 'windows', 'pane_info'], axis=1)
+
     gfsg = GenericFeatureStatisticsGenerator()
     proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': data}])
     protostr = base64.b64encode(proto.SerializeToString()).decode('utf-8')
     if update:
       script = _OVERVIEW_SCRIPT_TEMPLATE.format(
-          display_id=update, protostr=protostr)
+          display_id=update._overview_display_id, protostr=protostr)
       display_javascript(Javascript(script))
     else:
       html = _OVERVIEW_HTML_TEMPLATE.format(
           display_id=self._overview_display_id, protostr=protostr)
       display(HTML(html))
 
   def _display_dataframe(self, data, update=None):
-    if update:
-      table_id = 'table_{}'.format(update)
-      html = _DATAFRAME_PAGINATION_TEMPLATE.format(
-          dataframe_html=data.to_html(notebook=True, table_id=table_id),
-          table_id=table_id)
-      update_display(HTML(html), display_id=update)
+    table_id = 'table_{}'.format(
+        update._df_display_id if update else self._df_display_id)
+    columns = [{
+        'title': ''
+    }] + [{
+        'title': str(column)
+    } for column in data.columns]
+    format_window_info_in_dataframe(data)
+    rows = data.applymap(lambda x: str(x)).to_dict('split')['data']
 
 Review comment:
   Added the comments.


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387360704
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
+          var jqueryScript = document.createElement('script');
+          jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
+          jqueryScript.type = 'text/javascript';
+          jqueryScript.onload = function() {{
+            var datatableScript = document.createElement('script');
+            datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';
+            datatableScript.type = 'text/javascript';
+            datatableScript.onload = function() {{
+              window.jquery341 = jQuery.noConflict(true);
+              window.jquery341(document).ready(function($){{
+                {customized_script}
+              }});
+            }}
+            document.head.appendChild(datatableScript);
+          }};
+          document.head.appendChild(jqueryScript);
+        }} else {{
+          window.jquery341(document).ready(function($){{
+            {customized_script}
+          }});
+        }}"""
+
+_HTML_IMPORT_TEMPLATE = """
 
 Review comment:
   Could you add comments related to this?
   
   Why is it no longer supported by Chrome?


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595377096
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595939037
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595373909
 
 
   retest this please


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387360492
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
 
 Review comment:
   `jquery341` is probably not a good name. What would happen if we upgrade to a different jQuery version? Maybe it would be better to call it `jquery_singleton` or something like that?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387285131
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -86,10 +85,57 @@ def capture_duration(self, value):
       # The next PCollection evaluation will capture fresh data from sources,
       # and the data captured will be replayed until another eviction.
     """
+    assert value.total_seconds() > 0, 'Duration must be a positive value.'
     self.capture_control._capture_duration = value
 
   # TODO(BEAM-8335): add capture_size options when they are supported.
 
+  @property
+  def display_timestamp_format(self):
+    """The format in which timestamps are displayed.
+
+    Default is '%Y-%m-%d %H:%M:%S.%f%z', e.g. 2020-02-01 15:05:06.000015-08:00.
 
 Review comment:
   Docstrings in this module are for notebook users, so keeping them here lets `Shift+Tab` in notebooks bring up the docstring pop-up. They function as an in-notebook user guide.
   
   The default is defined in the `interactive_options` module where we hide the implementation details that are not exposed APIs.
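
   For reference, a small standard-library sketch of how the default format string quoted in the docstring renders a timestamp. The `format_event_time` helper and the fixed `-08:00` offset are illustrative assumptions; the PR itself uses `dateutil.tz` and the options object.

```python
import datetime

# Default format quoted in the docstring above.
DEFAULT_FORMAT = '%Y-%m-%d %H:%M:%S.%f%z'

_EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def format_event_time(event_time_us, to_tz, fmt=DEFAULT_FORMAT):
  """Formats microseconds since the epoch as a string in timezone `to_tz`."""
  # Integer timedelta arithmetic avoids float rounding of the microseconds.
  utc_time = _EPOCH + datetime.timedelta(microseconds=event_time_us)
  return utc_time.astimezone(to_tz).strftime(fmt)


# Fixed UTC-8 offset, used here only for illustration.
pacific_standard = datetime.timezone(datetime.timedelta(hours=-8))
print(format_event_time(1580598306000015, pacific_standard))
```

   Note that Python's `%z` renders the UTC offset without a colon, so this prints `2020-02-01 15:05:06.000015-0800`.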


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387359733
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -291,3 +402,81 @@ def _to_dataframe(self):
 
   def _is_one_dimension_type(self, val):
     return type(val) in _one_dimension_types
+
+
+def format_window_info_in_dataframe(data):
+  if 'event_time' in data.columns:
+    data['event_time'] = data['event_time'].apply(event_time_formatter)
+  if 'windows' in data.columns:
+    data['windows'] = data['windows'].apply(windows_formatter)
+  if 'pane_info' in data.columns:
+    data['pane_info'] = data['pane_info'].apply(pane_info_formatter)
+
+
+def event_time_formatter(event_time_us):
+  options = ie.current_env().options
+  to_tz = options.display_timezone
+  try:
+    return (
+        datetime.datetime.utcfromtimestamp(event_time_us / 1000000).replace(
+            tzinfo=tz.tzutc()).astimezone(to_tz).strftime(
+                options.display_timestamp_format))
+  except ValueError:
+    if event_time_us < 0:
+      return 'Min Timestamp'
+    return 'Max Timestamp'
+
+
+def windows_formatter(windows):
+  result = []
+  for w in windows:
+    if isinstance(w, GlobalWindow):
+      result.append(str(w))
+    elif isinstance(w, IntervalWindow):
+      # First get the duration in terms of hours, minutes, seconds, and
+      # micros.
+      duration = w.end.micros - w.start.micros
+      duration_secs = duration // 1000000
+      hours, remainder = divmod(duration_secs, 3600)
+      minutes, seconds = divmod(remainder, 60)
+      micros = (duration - duration_secs * 1000000) % 1000000
+
+      # Construct the duration string. Try and write the string in such a
 
 Review comment:
   Is there any other standard function that will do this?
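
   One standard-library candidate, offered as a sketch rather than as what the PR adopted: `datetime.timedelta` already normalizes a microsecond span and has a built-in string form. The `format_duration` helper is hypothetical; its inputs stand in for `w.start.micros` and `w.end.micros`.

```python
import datetime


def format_duration(start_micros, end_micros):
  """Renders the span between two microsecond timestamps via timedelta."""
  # timedelta normalizes the value into days, seconds, and microseconds, and
  # str() renders it as [D day(s), ]H:MM:SS[.ffffff].
  return str(datetime.timedelta(microseconds=end_micros - start_micros))


print(format_duration(0, 3661000005))
```

   The trade-off is that the output layout is fixed by `timedelta`, so a custom format would still need hand-written code like the one in the diff.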


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595377006
 
 
   retest this please


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387831529
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
+          var jqueryScript = document.createElement('script');
+          jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
+          jqueryScript.type = 'text/javascript';
+          jqueryScript.onload = function() {{
+            var datatableScript = document.createElement('script');
+            datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';
+            datatableScript.type = 'text/javascript';
+            datatableScript.onload = function() {{
+              window.jquery341 = jQuery.noConflict(true);
+              window.jquery341(document).ready(function($){{
+                {customized_script}
+              }});
+            }}
+            document.head.appendChild(datatableScript);
+          }};
+          document.head.appendChild(jqueryScript);
+        }} else {{
+          window.jquery341(document).ready(function($){{
+            {customized_script}
+          }});
+        }}"""
+
+_HTML_IMPORT_TEMPLATE = """
 
 Review comment:
   Strange. So we rely on a polyfill. Seems fine for now.


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387414898
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
+              }}"""
+_OVERVIEW_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-overview id="{display_id}"></facets-overview>
             <script>
               document.querySelector("#{display_id}").protoInput = "{protostr}";
 
 Review comment:
   Sorry, let me put it in an example.
   
   The HTML+JS template is formatted into an HTML object and displayed in the frontend. Inside this little piece of HTML, `document.querySelector("#{display_id}")` will always return the queried element because the element is created within the HTML itself.
   
   Then this little piece of HTML is embedded in the notebook's DOM. It resides in the output area of a cell in the notebook.
   
   If the user "clears all outputs" from the notebook, the HTML is deleted from the DOM.
   
   Now, `document.querySelector("#{display_id}")` will return `null`, so the property assignment throws.
   The user will see a `Javascript error` populated in the output area that was just cleared.
   The output keeps growing at the `show` interval (1 second) until the visualization is done.
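
   The update path described above can be sketched on the Python side as follows. The template mirrors the shape of `_OVERVIEW_SCRIPT_TEMPLATE` from the diff; `build_update_script` and the sample display id and payload are hypothetical names used only for illustration.

```python
# Mirrors the shape of _OVERVIEW_SCRIPT_TEMPLATE in the diff. Doubled braces
# survive str.format as literal JavaScript braces.
SCRIPT_TEMPLATE = """
try {{
  document.querySelector("#{display_id}").protoInput = "{protostr}";
}} catch (e) {{
  console.log("#{display_id} is not rendered yet.");
}}"""


def build_update_script(display_id, protostr):
  """Returns the JavaScript sent to the frontend for one update."""
  return SCRIPT_TEMPLATE.format(display_id=display_id, protostr=protostr)


# Hypothetical display id and payload.
script = build_update_script('facets_overview_demo', 'AAAA')
print(script)
```

   Because the assignment sits inside `try`, a cleared output area results in a single console message per update instead of an error rendered in the notebook.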


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387254669
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -215,20 +287,32 @@ def display_facets(self, updating_pv=None):
     # Ensures that dive, overview and table render the same data because the
     # materialized PCollection data might being updated continuously.
     data = self._to_dataframe()
+    # String-ify the dictionaries for display because elements of type dict
+    # cannot be ordered.
+    data = data.applymap(lambda x: str(x) if isinstance(x, dict) else x)
     if updating_pv:
-      self._display_dive(data, updating_pv._dive_display_id)
-      self._display_overview(data, updating_pv._overview_display_id)
-      self._display_dataframe(data, updating_pv._df_display_id)
+      # Only updates when data is not empty. Otherwise, consider it a bad
+      # iteration and noop since there is nothing to be updated.
+      if data.empty:
+        _LOGGER.debug('Skip a visualization update due to empty data.')
+      else:
+        self._display_dataframe(data.copy(deep=True), updating_pv)
+        if self._display_facets:
+          self._display_dive(data.copy(deep=True), updating_pv)
 
 Review comment:
   Why are we copying data?


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387253892
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -181,6 +247,12 @@ def __init__(self, pcoll):
     self._overview_display_id = 'facets_overview_{}_{}'.format(
         self._cache_key, id(self))
     self._df_display_id = 'df_{}_{}'.format(self._cache_key, id(self))
+    # Whether the visualization should include window info.
 
 Review comment:
   You can probably drop these comments. They are pretty obvious from the variable names and their use up to this point.


[GitHub] [beam] pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596672978
 
 
   Run Python PreCommit


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595568670
 
 
   Run PythonLint PreCommit


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387292177
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
 
 Review comment:
   It's an arbitrary name we give to the jQuery v3.4.1 we imported. It acts like a namespace. Note that the magic happens in `window.jquery341 = jQuery.noConflict(true);`.
   The problem here is that:
   1. A frontend can connect to the kernel at any time: code executed by the kernel in the past has no effect on new frontends.
   2. Multiple frontends can connect to the same kernel: each frontend has its own state (browser: HTML and JS), so the rendered HTML+JS cannot assume the existence of any global variable, function definition or library.
   
   This ensures that no matter how many copies of jQuery get imported at any time, the interactive notebook always checks for and uses the single jQuery configured by the interactive modules with the Datatable plugin initialized.
   The `function($)` signature ensures that any customized script executed uses `$` as the singleton instance `window.jquery341`, so code that reads `$` as jQuery always works.
   The advantages of this isolation are:
   1. The JS imported by the interactive modules into any frontend does not alter its existing state. Everything in the notebook still works as before, no matter what libraries and global vars have been used.
   2. HTML with JS rendered by the interactive modules has deterministic behavior because it always uses the same libraries.
   3. Whether or when a frontend is connected to the kernel no longer matters. The visualization HTML contains everything it needs to set up and/or execute scripts.
   4. Arbitrary DOM changes no longer matter. Even if the user messes up the notebook's HTML, the data visualization broadcast from kernels will still be rendered correctly.
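
   As a rough illustration of the wrapping described above, here is a minimal Python sketch. `JQUERY_TEMPLATE_SKETCH`, `wrap_with_jquery` and the `#my_div` selector are made-up names for this example, and the Datatable loading step of the real template is deliberately left out, so treat it as a sketch under those assumptions rather than the code in this PR.

```python
# Hypothetical, trimmed-down stand-in for _JQUERY_WITH_DATATABLE_TEMPLATE.
# Literal JavaScript braces are doubled so that str.format leaves them alone.
# Assumption: the Datatable plugin loading step is omitted for brevity.
JQUERY_TEMPLATE_SKETCH = """
if (typeof window.jquery341 == 'undefined') {{
  var jqueryScript = document.createElement('script');
  jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
  jqueryScript.onload = function() {{
    window.jquery341 = jQuery.noConflict(true);
    window.jquery341(document).ready(function($){{
      {customized_script}
    }});
  }};
  document.head.appendChild(jqueryScript);
}} else {{
  window.jquery341(document).ready(function($){{
    {customized_script}
  }});
}}"""


def wrap_with_jquery(customized_script):
  """Returns JS that runs customized_script with `$` bound to jquery341."""
  return JQUERY_TEMPLATE_SKETCH.format(customized_script=customized_script)


# The selector '#my_div' is made up for illustration.
script = wrap_with_jquery("$('#my_div').text('ready');")
```

   The customized script appears in both branches, so it runs whether or not this frontend has already loaded `window.jquery341`.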


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596030323
 
 
   retest this please


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387315589
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_environment.py
 ##########
 @@ -43,6 +43,55 @@
 
 _LOGGER = logging.getLogger(__name__)
 
+# By `format(customized_script=xxx)`, the given `customized_script` is
+# guaranteed to be executed within access to a jquery with datatable plugin
+# configured which is useful so that any `customized_script` is resilient to
+# browser refresh. Inside `customized_script`, use `$` as jQuery.
+_JQUERY_WITH_DATATABLE_TEMPLATE = """
+        if (typeof window.jquery341 == 'undefined') {{
+          var jqueryScript = document.createElement('script');
+          jqueryScript.src = 'https://code.jquery.com/jquery-3.4.1.slim.min.js';
+          jqueryScript.type = 'text/javascript';
+          jqueryScript.onload = function() {{
+            var datatableScript = document.createElement('script');
+            datatableScript.src = 'https://cdn.datatables.net/1.10.20/js/jquery.dataTables.min.js';
+            datatableScript.type = 'text/javascript';
+            datatableScript.onload = function() {{
+              window.jquery341 = jQuery.noConflict(true);
+              window.jquery341(document).ready(function($){{
+                {customized_script}
+              }});
+            }}
+            document.head.appendChild(datatableScript);
+          }};
+          document.head.appendChild(jqueryScript);
+        }} else {{
+          window.jquery341(document).ready(function($){{
+            {customized_script}
+          }});
+        }}"""
+
+_HTML_IMPORT_TEMPLATE = """
 
 Review comment:
   This uses a feature called `HTML import`, where static HTML is imported and embedded into the current HTML.
   Here the HTML we want is facets-jupyter.html.
   
   Chrome no longer supports this feature, so it requires the webcomponents JS lib.
   Similar to the jQuery template, we check whether `HTML import` is supported by the browser. If so, we import the HTMLs directly; otherwise we set up webcomponents and chain the `HTML import` to the end of its `onload`.
   Note that we import HTMLs in the head for several reasons:
   1. In a notebook, the DOM changes all the time. Keeping imported HTMLs in the head makes sure all dependency HTMLs are available at all times.
   2. `HTML import` only happens once per page load. There is no way to recover an imported HTML once it is deleted from the DOM, short of refreshing the page.
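
   The check-then-chain flow described above could look roughly like the Python sketch below. It is not the `_HTML_IMPORT_TEMPLATE` from this PR: `HTML_IMPORT_SKETCH` and `html_import_script` are hypothetical names, the feature test `'import' in document.createElement('link')` is one common way to detect native support, and the sketch assumes the polyfill picks up `<link rel="import">` elements appended after it loads.

```python
import json

# Hypothetical sketch of an HTML import template. JavaScript braces are doubled
# for str.format. Assumption: links appended to document.head after the
# polyfill has loaded are processed by the polyfill.
HTML_IMPORT_SKETCH = """
var importHtml = function() {{
  {hrefs}.forEach(function(href) {{
    var link = document.createElement('link');
    link.rel = 'import';
    link.href = href;
    document.head.appendChild(link);
  }});
}};
if ('import' in document.createElement('link')) {{
  importHtml();
}} else {{
  var polyfill = document.createElement('script');
  polyfill.src = '{polyfill_src}';
  polyfill.onload = importHtml;
  document.head.appendChild(polyfill);
}}"""

WEBCOMPONENTS_SRC = (
    'https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/'
    'webcomponents-lite.js')
FACETS_HTML = (
    'https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/'
    'facets-dist/facets-jupyter.html')


def html_import_script(hrefs):
  """Returns JS that imports each href into document.head."""
  return HTML_IMPORT_SKETCH.format(
      hrefs=json.dumps(list(hrefs)), polyfill_src=WEBCOMPONENTS_SRC)


script = html_import_script([FACETS_HTML])
```

   This sketch does not guard against importing the same HTML twice; a real template would likely need such a check.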


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387267410
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   Facets widgets don't depend on the jQuery we set up.
   The JS also doesn't depend on webcomponents (that is only for the HTML import).
   So there is no need to wait for the `onload` of anything.
   The DOM changes when the user deletes the output area containing the widgets being updated, and JS exceptions could then be thrown.
   The try/catch keeps `display_javascript` from polluting the output area of notebooks in this scenario.


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595932958
 
 
   retest this please


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-595938904
 
 
   retest this please


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387320751
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -291,3 +402,81 @@ def _to_dataframe(self):
 
   def _is_one_dimension_type(self, val):
     return type(val) in _one_dimension_types
+
+
+def format_window_info_in_dataframe(data):
+  if 'event_time' in data.columns:
+    data['event_time'] = data['event_time'].apply(event_time_formatter)
+  if 'windows' in data.columns:
+    data['windows'] = data['windows'].apply(windows_formatter)
+  if 'pane_info' in data.columns:
+    data['pane_info'] = data['pane_info'].apply(pane_info_formatter)
+
+
+def event_time_formatter(event_time_us):
+  options = ie.current_env().options
+  to_tz = options.display_timezone
+  try:
+    return (
+        datetime.datetime.utcfromtimestamp(event_time_us / 1000000).replace(
+            tzinfo=tz.tzutc()).astimezone(to_tz).strftime(
+                options.display_timestamp_format))
+  except ValueError:
+    if event_time_us < 0:
+      return 'Min Timestamp'
+    return 'Max Timestamp'
+
+
+def windows_formatter(windows):
+  result = []
+  for w in windows:
+    if isinstance(w, GlobalWindow):
+      result.append(str(w))
+    elif isinstance(w, IntervalWindow):
+      # First get the duration in terms of hours, minutes, seconds, and
+      # micros.
+      duration = w.end.micros - w.start.micros
+      duration_secs = duration // 1000000
+      hours, remainder = divmod(duration_secs, 3600)
+      minutes, seconds = divmod(remainder, 60)
+      micros = (duration - duration_secs * 1000000) % 1000000
+
+      # Construct the duration string. Try and write the string in such a
 
 Review comment:
   This formats a duration, potentially with microsecond precision, rather than a `datetime`. It's more like pretty-printing a `timedelta`, so the `strftime` function is not applicable.
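
   To make the difference concrete, here is a small standalone sketch of pretty-printing a duration given in microseconds with `divmod`. The function name `format_duration_micros` and the output format (`1h 1m 1.000015s`) are assumptions for illustration and may not match what `windows_formatter` produces.

```python
def format_duration_micros(duration_us):
  """Pretty-prints a non-negative duration given in microseconds."""
  total_secs, micros = divmod(duration_us, 1000000)
  hours, remainder = divmod(total_secs, 3600)
  minutes, seconds = divmod(remainder, 60)

  parts = []
  if hours:
    parts.append('{}h'.format(hours))
  if minutes:
    parts.append('{}m'.format(minutes))
  if micros:
    # Keep all six fractional digits so microsecond precision is visible.
    parts.append('{}.{:06d}s'.format(seconds, micros))
  elif seconds:
    parts.append('{}s'.format(seconds))
  return ' '.join(parts) if parts else '0s'
```

   Unlike `strftime`, this never interprets the value as a point in time, so time zones and calendar fields do not come into play.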


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387255717
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -238,31 +322,57 @@ def _display_dive(self, data, update=None):
       display(HTML(html))
 
   def _display_overview(self, data, update=None):
+    if (not data.empty and self._include_window_info and
+        all(column in data.columns
+            for column in ('event_time', 'windows', 'pane_info'))):
+      data = data.drop(['event_time', 'windows', 'pane_info'], axis=1)
+
     gfsg = GenericFeatureStatisticsGenerator()
     proto = gfsg.ProtoFromDataFrames([{'name': 'data', 'table': data}])
     protostr = base64.b64encode(proto.SerializeToString()).decode('utf-8')
     if update:
       script = _OVERVIEW_SCRIPT_TEMPLATE.format(
-          display_id=update, protostr=protostr)
+          display_id=update._overview_display_id, protostr=protostr)
       display_javascript(Javascript(script))
     else:
       html = _OVERVIEW_HTML_TEMPLATE.format(
           display_id=self._overview_display_id, protostr=protostr)
       display(HTML(html))
 
   def _display_dataframe(self, data, update=None):
-    if update:
-      table_id = 'table_{}'.format(update)
-      html = _DATAFRAME_PAGINATION_TEMPLATE.format(
-          dataframe_html=data.to_html(notebook=True, table_id=table_id),
-          table_id=table_id)
-      update_display(HTML(html), display_id=update)
+    table_id = 'table_{}'.format(
+        update._df_display_id if update else self._df_display_id)
+    columns = [{
+        'title': ''
+    }] + [{
+        'title': str(column)
+    } for column in data.columns]
+    format_window_info_in_dataframe(data)
+    rows = data.applymap(lambda x: str(x)).to_dict('split')['data']
 
 Review comment:
   What is happening here and in the next few lines?


[GitHub] [beam] aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387359940
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/interactive_beam.py
 ##########
 @@ -86,10 +85,57 @@ def capture_duration(self, value):
       # The next PCollection evaluation will capture fresh data from sources,
       # and the data captured will be replayed until another eviction.
     """
+    assert value.total_seconds() > 0, 'Duration must be a positive value.'
     self.capture_control._capture_duration = value
 
   # TODO(BEAM-8335): add capture_size options when they are supported.
 
+  @property
+  def display_timestamp_format(self):
+    """The format in which timestamps are displayed.
+
+    Default is '%Y-%m-%d %H:%M:%S.%f%z', e.g. 2020-02-01 15:05:06.000015-08:00.
 
 Review comment:
   How do we plan to keep the defaults in sync between here and interactive_options?


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387267410
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   Facets widgets don't depend on the jQuery we set up.
   The JS also doesn't depend on webcomponents (that is only for the HTML import).
   So there is no need to wait for the `onload` of anything.
   This guards against the DOM change that occurs when the user deletes the output area containing the widgets being updated, in which case JS exceptions could be thrown.
   `display_javascript` would pollute the output area in this scenario.


[GitHub] [beam] pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
pabloem commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596672765
 
 
   Run Python PreCommit


[GitHub] [beam] KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
KevinGG commented on a change in pull request #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#discussion_r387415941
 
 

 ##########
 File path: sdks/python/apache_beam/runners/interactive/display/pcoll_visualization.py
 ##########
 @@ -52,42 +56,89 @@
 except ImportError:
   _pcoll_visualization_ready = False
 
+_LOGGER = logging.getLogger(__name__)
+
 # 1-d types that need additional normalization to be compatible with DataFrame.
 _one_dimension_types = (int, float, str, bool, list, tuple)
 
+_CSS = """
+            <style>
+            .p-Widget.jp-OutputPrompt.jp-OutputArea-prompt:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            .p-Widget.jp-RenderedJavaScript.jp-mod-trusted.jp-OutputArea-output:empty {{
+              padding: 0;
+              border: 0;
+            }}
+            </style>"""
 _DIVE_SCRIPT_TEMPLATE = """
-            document.querySelector("#{display_id}").data = {jsonstr};"""
-_DIVE_HTML_TEMPLATE = """
+            try {{
+              document.querySelector("#{display_id}").data = {jsonstr};
+            }} catch (e) {{
+              console.log("#{display_id} is not rendered yet.");
+            }}"""
+_DIVE_HTML_TEMPLATE = _CSS + """
             <script src="https://cdnjs.cloudflare.com/ajax/libs/webcomponentsjs/1.3.3/webcomponents-lite.js"></script>
             <link rel="import" href="https://raw.githubusercontent.com/PAIR-code/facets/1.0.0/facets-dist/facets-jupyter.html">
             <facets-dive sprite-image-width="{sprite_size}" sprite-image-height="{sprite_size}" id="{display_id}" height="600"></facets-dive>
             <script>
               document.querySelector("#{display_id}").data = {jsonstr};
             </script>"""
 _OVERVIEW_SCRIPT_TEMPLATE = """
-              document.querySelector("#{display_id}").protoInput = "{protostr}";
-              """
-_OVERVIEW_HTML_TEMPLATE = """
+              try {{
+                document.querySelector("#{display_id}").protoInput = "{protostr}";
+              }} catch (e) {{
+                console.log("#{display_id} is not rendered yet.");
 
 Review comment:
   If this fails, it means the initially displayed widgets have been cleared from the DOM or the initial display hasn't completed yet (perhaps because of a race condition). A no-op is the right way to handle it.
   
   The error is supposed to be logged in the console. However, if it is not caught, it also gets displayed in notebook output areas. By catching it, we keep the log and also avoid polluting the output area.
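
   For reference, the guarded update can be exercised outside a notebook with a sketch like the one below. `GUARDED_UPDATE_SKETCH`, `guarded_update_script` and the id `facets_dive_demo` are hypothetical names; the JS body mirrors the try/catch in the diff above, and the resulting string is what would be handed to `display_javascript`.

```python
import json

# Mirrors the try/catch shape of _DIVE_SCRIPT_TEMPLATE in the diff above.
# JavaScript braces are doubled so that str.format leaves them alone.
GUARDED_UPDATE_SKETCH = """
try {{
  document.querySelector("#{display_id}").data = {jsonstr};
}} catch (e) {{
  console.log("#{display_id} is not rendered yet.");
}}"""


def guarded_update_script(display_id, records):
  """Returns JS that updates a widget and logs instead of raising."""
  return GUARDED_UPDATE_SKETCH.format(
      display_id=display_id, jsonstr=json.dumps(records))


# 'facets_dive_demo' is a made-up display id for illustration.
script = guarded_update_script('facets_dive_demo', [{'a': 1}])
```

   If the element is gone, `querySelector` returns `null`, the assignment throws, and the `catch` branch logs to the console instead of surfacing an error in the cell output.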


[GitHub] [beam] aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization

Posted by GitBox <gi...@apache.org>.
aaltay commented on issue #11020: [BEAM-7926] Update Data Visualization
URL: https://github.com/apache/beam/pull/11020#issuecomment-596631448
 
 
   There are lint errors. You can run the linter locally to check for these errors. (See https://cwiki.apache.org/confluence/display/BEAM/Python+Tips for notes on how to do that.)
