You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/02/11 22:25:41 UTC

[GitHub] [incubator-superset] john-bodley opened a new pull request #9122: [fix] Fix table viz column order

john-bodley opened a new pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122
 
 
   ### CATEGORY
   
   Choose one
   
   - [x] Bug Fix
   - [ ] Enhancement (new features, refinement)
   - [ ] Refactor
   - [ ] Add tests
   - [ ] Build / Development Environment
   - [ ] Documentation
   
   ### SUMMARY
   
   This PR fixes a couple of issues related to the table visualization type. Originally the intent was to ensure that the table column orders adhered to the UI. The PR somewhat morphed to contain:
   
   1. Updated the validity matrix to ensure that defining percent metrics is not viable when `NOT GROUPED BY` columns are defined. 
   2. Replaced outdated Python filter/maps with list comprehensions and Pandas column operators.
   3. Ensure that the resulting data frame is ordered according to group-by or non-group-by column, metrics, and percent metrics.
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   <!--- Skip this if not applicable -->
   
   ### TEST PLAN
   
   CI and manual testing.
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Changes UI
   - [ ] Requires DB Migration.
   - [ ] Confirm DB Migration upgrade and downgrade tested.
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   
   ### REVIEWERS
   
   to: @michellethomas @mistercrunch @villebro 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122#discussion_r377991255
 
 

 ##########
 File path: tests/druid_tests.py
 ##########
 @@ -90,12 +90,12 @@ def __reduce__(self):
     {
         "version": "v1",
         "timestamp": "2012-01-01T00:00:00.000Z",
-        "event": {"dim1": "Canada", "dim2": "boy", "metric1": 12345678},
+        "event": {"dim1": "Canada", "dim2": "boy", "count": 12345678},
 
 Review comment:
   This didn't represent the correct result set when using the `count` metric. Previously this wasn't a problem because the Table viz would throw out columns not in the data frame and the relevant tests don't check the associated metric column.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley merged pull request #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
john-bodley merged pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] codecov-io commented on issue #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122#issuecomment-584966276
 
 
   # [Codecov](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=h1) Report
   > Merging [#9122](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-superset/commit/291306392443a5a0d0e2ee0cc4a95d37c56d4589?src=pr&el=desc) will **decrease** coverage by `<.01%`.
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-superset/pull/9122/graphs/tree.svg?width=650&token=KsB0fHcx6l&height=150&src=pr)](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##           master   #9122      +/-   ##
   =========================================
   - Coverage    59.1%   59.1%   -0.01%     
   =========================================
     Files         372     372              
     Lines       11920   11922       +2     
     Branches     2917    2919       +2     
   =========================================
   + Hits         7045    7046       +1     
   - Misses       4693    4694       +1     
     Partials      182     182
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [superset-frontend/src/chart/chartAction.js](https://codecov.io/gh/apache/incubator-superset/pull/9122/diff?src=pr&el=tree#diff-c3VwZXJzZXQtZnJvbnRlbmQvc3JjL2NoYXJ0L2NoYXJ0QWN0aW9uLmpz) | `43.33% <0%> (+0.09%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=footer). Last update [2913063...a05bafd](https://codecov.io/gh/apache/incubator-superset/pull/9122?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122#discussion_r378482103
 
 

 ##########
 File path: superset/viz.py
 ##########
 @@ -554,46 +555,50 @@ def query_obj(self):
 
         # Add all percent metrics that are not already in the list
         if "percent_metrics" in fd:
-            d["metrics"] = d["metrics"] + list(
-                filter(lambda m: m not in d["metrics"], fd["percent_metrics"] or [])
+            d["metrics"].extend(
+                m for m in fd["percent_metrics"] or [] if m not in d["metrics"]
 
 Review comment:
   @mistercrunch [`list.extend(iter)`](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) takes any iterable. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] mistercrunch commented on a change in pull request #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
mistercrunch commented on a change in pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122#discussion_r378391583
 
 

 ##########
 File path: superset/viz.py
 ##########
 @@ -554,46 +555,50 @@ def query_obj(self):
 
         # Add all percent metrics that are not already in the list
         if "percent_metrics" in fd:
-            d["metrics"] = d["metrics"] + list(
-                filter(lambda m: m not in d["metrics"], fd["percent_metrics"] or [])
+            d["metrics"].extend(
+                m for m in fd["percent_metrics"] or [] if m not in d["metrics"]
 
 Review comment:
   no action required: I'm surprised to not see brackets around the comprehension, guessing it just returns a generator at that point and `.extend` is ok with that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order

Posted by GitBox <gi...@apache.org>.
john-bodley commented on a change in pull request #9122: [fix] Fix table viz column order
URL: https://github.com/apache/incubator-superset/pull/9122#discussion_r377938118
 
 

 ##########
 File path: superset/viz.py
 ##########
 @@ -554,46 +555,50 @@ def query_obj(self):
 
         # Add all percent metrics that are not already in the list
         if "percent_metrics" in fd:
-            d["metrics"] = d["metrics"] + list(
-                filter(lambda m: m not in d["metrics"], fd["percent_metrics"] or [])
+            d["metrics"].extend(
+                m for m in fd["percent_metrics"] or [] if m not in d["metrics"]
             )
 
         d["is_timeseries"] = self.should_be_timeseries()
         return d
 
     def get_data(self, df: pd.DataFrame) -> VizData:
-        fd = self.form_data
-        if not self.should_be_timeseries() and df is not None and DTTM_ALIAS in df:
+        """
+        Transform the query result to the table representation.
+
+        :param df: The interim dataframe
+        :returns: The table visualization data
+
+        The interim dataframe comprises of the group-by and non-group-by columns and
+        the union of the metrics representing the non-percent and percent metrics. Note
+        the percent metrics have yet to be transformed.
+        """
+
+        if not self.should_be_timeseries() and DTTM_ALIAS in df:
             del df[DTTM_ALIAS]
 
-        # Sum up and compute percentages for all percent metrics
-        percent_metrics = fd.get("percent_metrics") or []
-        percent_metrics = [utils.get_metric_name(m) for m in percent_metrics]
+        # Transform the data frame to adhere to the UI ordering of the columns and
+        # metrics whilst simultaneously computing the percentages (via normalization)
+        # for the percent metrics.
+        non_percent_metric_columns = (
+            self.form_data.get("all_columns") or self.form_data.get("groupby") or []
+        ) + utils.get_metric_names(self.form_data.get("metrics") or [])
 
-        if len(percent_metrics):
-            percent_metrics = list(filter(lambda m: m in df, percent_metrics))
-            metric_sums = {
-                m: reduce(lambda a, b: a + b, df[m]) for m in percent_metrics
-            }
-            metric_percents = {
-                m: list(
-                    map(
-                        lambda a: None if metric_sums[m] == 0 else a / metric_sums[m],
-                        df[m],
-                    )
-                )
-                for m in percent_metrics
-            }
-            for m in percent_metrics:
-                m_name = "%" + m
-                df[m_name] = pd.Series(metric_percents[m], name=m_name)
-            # Remove metrics that are not in the main metrics list
-            metrics = fd.get("metrics") or []
-            metrics = [utils.get_metric_name(m) for m in metrics]
-            for m in filter(
-                lambda m: m not in metrics and m in df.columns, percent_metrics
-            ):
-                del df[m]
+        percent_metric_columns = utils.get_metric_names(
+            self.form_data.get("percent_metrics") or []
+        )
+
+        df = pd.concat(
 
 Review comment:
   This logic has been simplified to leverage Pandas `div` and `sum` column operators. The subset of the data frame excluding the percent metrics is concatenated with the computation associated with the percentage metrics.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org