Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/04/26 04:25:25 UTC

[GitHub] [superset] zhaoyongjie opened a new pull request, #19842: fix: count(distinct column_name) in metrics

zhaoyongjie opened a new pull request, #19842:
URL: https://github.com/apache/superset/pull/19842

   ### SUMMARY
   The aggregate function **count_distinct** isn't ANSI SQL.
   Currently, when a user creates a `simple metric`, the AdhocMetric control generates a `count_distinct(column)` label (verbose name). If the user switches to the SQL tab, this label is applied as the SQL expression. Since it isn't ANSI SQL, using this snippet directly causes an error in most databases.
   
   This PR transforms the _count distinct metric_ from `count_distinct(column)` to `count(distinct column)`, but does not change the original metric label (verbose name).
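
As a rough illustration of the transformation described above, here is a minimal standalone sketch. The function name `toSql`, the value of `AGGREGATES.COUNT_DISTINCT`, and the fall-through return are assumptions made for the example (the diff in this thread only shows the constant's name and ends before the final return); the regex check and the `slice(1, -1)` call mirror the lines in the diff.

```javascript
// Assumed value; the thread only shows the name AGGREGATES.COUNT_DISTINCT.
const AGGREGATES = { COUNT_DISTINCT: 'COUNT_DISTINCT' };

// `column` arrives already wrapped in parentheses, e.g. "(user_id)".
function toSql(aggregate, column, transformCountDistinct) {
  if (
    transformCountDistinct &&
    aggregate === AGGREGATES.COUNT_DISTINCT &&
    /^\(.*\)$/.test(column)
  ) {
    // Strip the outer parentheses and rebuild the ANSI-style form.
    return `COUNT(DISTINCT ${column.slice(1, -1)})`;
  }
  // Assumed fall-through: aggregate name followed by the wrapped column.
  return `${aggregate}${column}`;
}

const label = toSql('COUNT_DISTINCT', '(user_id)', false);
const sql = toSql('COUNT_DISTINCT', '(user_id)', true);
console.log(label); // COUNT_DISTINCT(user_id)
console.log(sql); // COUNT(DISTINCT user_id)
```

With the flag off, the label keeps its `COUNT_DISTINCT(...)` form; only the SQL-facing call opts in to the rewrite.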
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   <!--- Skip this if not applicable -->
   
   ### TESTING INSTRUCTIONS
   <!--- Required! What steps can be taken to manually verify the changes? -->
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on a diff in pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
villebro commented on code in PR #19842:
URL: https://github.com/apache/superset/pull/19842#discussion_r858325003


##########
superset-frontend/src/explore/components/controls/MetricControl/AdhocMetric.js:
##########
@@ -86,20 +89,30 @@ export default class AdhocMetric {
   }
 
   getDefaultLabel() {
-    const label = this.translateToSql(true);
+    const label = this.translateToSql({ useVerboseName: true });
     return label.length < 43 ? label : `${label.substring(0, 40)}...`;
   }
 
-  translateToSql(useVerboseName = false) {
+  translateToSql(
+    params = { useVerboseName: false, transformCountDistinct: false },
+  ) {
     if (this.expressionType === EXPRESSION_TYPES.SIMPLE) {
       const aggregate = this.aggregate || '';
       // eslint-disable-next-line camelcase
       const column =
-        useVerboseName && this.column?.verbose_name
+        params.useVerboseName && this.column?.verbose_name
           ? `(${this.column.verbose_name})`
           : this.column?.column_name
           ? `(${this.column.column_name})`
           : '';
+      // transform from `count_distinct(column)` to `count(distinct column)`
+      if (
+        params.transformCountDistinct &&
+        aggregate === AGGREGATES.COUNT_DISTINCT &&
+        /^\(.*\)$/.test(column)
+      ) {
+        return `COUNT(DISTINCT ${column.slice(1, -1)})`;
+      }

Review Comment:
   I had to reread these lines a few times to understand what was going on (mostly there from before this PR so not your fault!). IMO it would be more readable if we could first do something like
   ```js
   const column =
     params.useVerboseName && this.column?.verbose_name
       ? this.column.verbose_name
       : this.column?.column_name
       ? this.column.column_name
       : '';
   ```
   and then something like
   ```js
   if (params.transformCountDistinct && aggregate === AGGREGATES.COUNT_DISTINCT) {
     return `COUNT(DISTINCT ${column})`;
   }
   return `${aggregate}(${column})`;
   ```
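
For readers who want to try the suggested shape outside the class, the following is a self-contained sketch of it as a free function. The name `translateSimpleMetric`, the plain `metric` object, and the constant's value are hypothetical; the destructured per-key defaults and the early return for a missing column are choices made for this example rather than something proposed in the review.

```javascript
// Assumed value; only the name AGGREGATES.COUNT_DISTINCT appears in the thread.
const AGGREGATES = { COUNT_DISTINCT: 'COUNT_DISTINCT' };

// Resolve the bare column name first, then add parentheses in one place per branch.
function translateSimpleMetric(
  metric,
  { useVerboseName = false, transformCountDistinct = false } = {},
) {
  const aggregate = metric.aggregate || '';
  const column =
    useVerboseName && metric.column?.verbose_name
      ? metric.column.verbose_name
      : metric.column?.column_name || '';
  // Keep the pre-refactor output for a metric without a column: no parentheses.
  if (!column) return aggregate;
  if (transformCountDistinct && aggregate === AGGREGATES.COUNT_DISTINCT) {
    return `COUNT(DISTINCT ${column})`;
  }
  return `${aggregate}(${column})`;
}

const metric = {
  aggregate: 'COUNT_DISTINCT',
  column: { column_name: 'user_id', verbose_name: 'User ID' },
};
const verboseLabel = translateSimpleMetric(metric, { useVerboseName: true });
const ansiSql = translateSimpleMetric(metric, { transformCountDistinct: true });
console.log(verboseLabel); // COUNT_DISTINCT(User ID)
console.log(ansiSql); // COUNT(DISTINCT user_id)
```

Because the column is no longer stored with its parentheses, the regex test and `slice(1, -1)` from the diff are not needed in this variant.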



[GitHub] [superset] sadpandajoe commented on pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
sadpandajoe commented on PR #19842:
URL: https://github.com/apache/superset/pull/19842#issuecomment-1115140389

   🏷️ preset:2022.17


[GitHub] [superset] villebro commented on a diff in pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
villebro commented on code in PR #19842:
URL: https://github.com/apache/superset/pull/19842#discussion_r858394703


##########
superset-frontend/src/explore/components/controls/MetricControl/AdhocMetric.js:
##########
@@ -86,20 +89,30 @@ export default class AdhocMetric {
   }
 
   getDefaultLabel() {
-    const label = this.translateToSql(true);
+    const label = this.translateToSql({ useVerboseName: true });
     return label.length < 43 ? label : `${label.substring(0, 40)}...`;
   }
 
-  translateToSql(useVerboseName = false) {
+  translateToSql(
+    params = { useVerboseName: false, transformCountDistinct: false },
+  ) {
     if (this.expressionType === EXPRESSION_TYPES.SIMPLE) {
       const aggregate = this.aggregate || '';
       // eslint-disable-next-line camelcase
       const column =
-        useVerboseName && this.column?.verbose_name
+        params.useVerboseName && this.column?.verbose_name
           ? `(${this.column.verbose_name})`
           : this.column?.column_name
           ? `(${this.column.column_name})`
           : '';
+      // transform from `count_distinct(column)` to `count(distinct column)`
+      if (
+        params.transformCountDistinct &&
+        aggregate === AGGREGATES.COUNT_DISTINCT &&
+        /^\(.*\)$/.test(column)
+      ) {
+        return `COUNT(DISTINCT ${column.slice(1, -1)})`;
+      }

Review Comment:
   Agreed, if you think this is dangerous to refactor let's do it in a separate PR 👍



[GitHub] [superset] zhaoyongjie commented on pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
zhaoyongjie commented on PR #19842:
URL: https://github.com/apache/superset/pull/19842#issuecomment-1109414800

   > Lgtm! 1 suggestion - should we rename `COUNT_DISTINCT` to `COUNT DISTINCT` on our list of aggregates in Simple tab?
   
   Thanks @kgabryje! `count_distinct` is used in a lot of places, and `COUNT_DISTINCT` also affects the label (the verbose name in the query), so I didn't modify it.


[GitHub] [superset] zhaoyongjie commented on a diff in pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
zhaoyongjie commented on code in PR #19842:
URL: https://github.com/apache/superset/pull/19842#discussion_r858332944


##########
superset-frontend/src/explore/components/controls/MetricControl/AdhocMetric.js:
##########
@@ -86,20 +89,30 @@ export default class AdhocMetric {
   }
 
   getDefaultLabel() {
-    const label = this.translateToSql(true);
+    const label = this.translateToSql({ useVerboseName: true });
     return label.length < 43 ? label : `${label.substring(0, 40)}...`;
   }
 
-  translateToSql(useVerboseName = false) {
+  translateToSql(
+    params = { useVerboseName: false, transformCountDistinct: false },
+  ) {
     if (this.expressionType === EXPRESSION_TYPES.SIMPLE) {
       const aggregate = this.aggregate || '';
       // eslint-disable-next-line camelcase
       const column =
-        useVerboseName && this.column?.verbose_name
+        params.useVerboseName && this.column?.verbose_name
           ? `(${this.column.verbose_name})`
           : this.column?.column_name
           ? `(${this.column.column_name})`
           : '';
+      // transform from `count_distinct(column)` to `count(distinct column)`
+      if (
+        params.transformCountDistinct &&
+        aggregate === AGGREGATES.COUNT_DISTINCT &&
+        /^\(.*\)$/.test(column)
+      ) {
+        return `COUNT(DISTINCT ${column.slice(1, -1)})`;
+      }

Review Comment:
   The previous logic may not be covered by unit tests (UTs), so I deliberately left these lines untouched. I think it would be better to first add some UTs for this part of the code and then refactor it all together.
   
   Of course, if this needs to be done in this PR, I'm all for it. What do you think about this?
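
One way to pin down existing behaviour before a refactor is a small table of characterization cases. The sketch below does this for the truncation rule visible in `getDefaultLabel` in the diff. The helper names `truncateLabel` and `expectEqual` are hypothetical; `expectEqual` is only a stand-in so the snippet runs without a test framework, and a real test would presumably use the project's own test runner.

```javascript
// Stand-in mirroring the truncation rule shown in getDefaultLabel in the diff.
function truncateLabel(label) {
  return label.length < 43 ? label : `${label.substring(0, 40)}...`;
}

// Minimal assertion helper so the sketch has no dependencies.
function expectEqual(actual, expected, name) {
  if (actual !== expected) {
    throw new Error(`${name}: expected "${expected}", got "${actual}"`);
  }
}

// Each case records current behaviour: [description, input, expected output].
const cases = [
  ['short label is unchanged', 'SUM(price)', 'SUM(price)'],
  ['42 characters are kept as-is', 'a'.repeat(42), 'a'.repeat(42)],
  ['43 characters are cut to 40 plus ellipsis', 'a'.repeat(43), `${'a'.repeat(40)}...`],
];

for (const [name, input, expected] of cases) {
  expectEqual(truncateLabel(input), expected, name);
}
console.log(`${cases.length} characterization cases passed`);
```

Cases like these document the boundary at 43 characters, so a later refactor that changes it would fail visibly.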



[GitHub] [superset] zhaoyongjie merged pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
zhaoyongjie merged PR #19842:
URL: https://github.com/apache/superset/pull/19842


[GitHub] [superset] codecov[bot] commented on pull request #19842: fix: count(distinct column_name) in metrics

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #19842:
URL: https://github.com/apache/superset/pull/19842#issuecomment-1109350660

   # [Codecov](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#19842](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (d6369fc) into [master](https://codecov.io/gh/apache/superset/commit/523bd8b79cfd48d1cb3a94f89c8095976844ce59?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (523bd8b) will **increase** coverage by `0.00%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@           Coverage Diff           @@
   ##           master   #19842   +/-   ##
   =======================================
     Coverage   66.55%   66.55%           
   =======================================
     Files        1692     1692           
     Lines       64802    64804    +2     
     Branches     6657     6657           
   =======================================
   + Hits        43129    43131    +2     
     Misses      19973    19973           
     Partials     1700     1700           
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | javascript | `51.25% <100.00%> (+<0.01%)` | :arrow_up: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...e/components/controls/MetricControl/AdhocMetric.js](https://codecov.io/gh/apache/superset/pull/19842/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvc3JjL2V4cGxvcmUvY29tcG9uZW50cy9jb250cm9scy9NZXRyaWNDb250cm9sL0FkaG9jTWV0cmljLmpz) | `95.65% <100.00%> (+0.19%)` | :arrow_up: |
   | [...ols/MetricControl/AdhocMetricEditPopover/index.jsx](https://codecov.io/gh/apache/superset/pull/19842/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3VwZXJzZXQtZnJvbnRlbmQvc3JjL2V4cGxvcmUvY29tcG9uZW50cy9jb250cm9scy9NZXRyaWNDb250cm9sL0FkaG9jTWV0cmljRWRpdFBvcG92ZXIvaW5kZXguanN4) | `76.28% <100.00%> (ø)` | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [523bd8b...d6369fc](https://codecov.io/gh/apache/superset/pull/19842?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   

