Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/12/10 20:07:43 UTC

[GitHub] [incubator-superset] li-ana opened a new issue #12005: Boxplot - wrong logic

li-ana opened a new issue #12005:
URL: https://github.com/apache/incubator-superset/issues/12005


   The box plot assumes that there is only one observation per timestamp. In my case this is not true, so the box plot aggregates all observations for each timestamp using the selected function, which skews the results. It needs to look at the raw data instead of the aggregate.
   
   Query produced by the Superset box plot:
   ![image](https://user-images.githubusercontent.com/20177485/101823239-79851f80-3af8-11eb-917b-79121d8004ee.png)
   Results of this query:
   ![image](https://user-images.githubusercontent.com/20177485/101823445-c2d56f00-3af8-11eb-8a27-4dafc6a98459.png)
   
   Box plot details:
   ![image](https://user-images.githubusercontent.com/20177485/101823581-fca67580-3af8-11eb-90ef-8444e4dad961.png)
   
   What it should be:
   ![image](https://user-images.githubusercontent.com/20177485/101823662-1e076180-3af9-11eb-9883-5e0eb787252d.png)
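To make the skew described above concrete, here is a minimal sketch with made-up data (it is not Superset's code, and the timestamps, values, and the `summary` helper are all hypothetical). It compares the minimum, median, and maximum of the raw observations with the same statistics computed after averaging per timestamp, which is roughly what happens when an aggregate function is applied before the box plot is drawn:

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical observations: (timestamp, value). Several rows share a timestamp.
observations = [
    ("2020-12-01", 1.0),
    ("2020-12-01", 9.0),
    ("2020-12-01", 20.0),
    ("2020-12-02", 2.0),
    ("2020-12-02", 4.0),
    ("2020-12-03", 3.0),
]


def summary(values):
    """Return (min, median, max), three of the statistics a box plot draws."""
    return (min(values), median(values), max(values))


# Statistics over the raw rows, which is what the issue asks for.
raw_values = [value for _, value in observations]
raw_summary = summary(raw_values)

# Statistics after aggregating with AVG per timestamp (assumed aggregate).
by_timestamp = defaultdict(list)
for timestamp, value in observations:
    by_timestamp[timestamp].append(value)
aggregated_values = [mean(values) for values in by_timestamp.values()]
aggregated_summary = summary(aggregated_values)

print("raw:       ", raw_summary)
print("aggregated:", aggregated_summary)
```

With this data the aggregated series has only three points, and its range is narrower than that of the raw rows, so a box drawn from it would understate the spread.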
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] villebro commented on issue #12005: Boxplot - wrong logic

Posted by GitBox <gi...@apache.org>.
villebro commented on issue #12005:
URL: https://github.com/apache/incubator-superset/issues/12005#issuecomment-743047029


   While we don't yet support calculating the boxplot from the full raw data, we recently migrated Boxplot to ECharts and added some features in this PR: #11199. In the example below I've created a Boxplot where the categories are continents and the distribution is calculated across countries (I'm using the average of total population, as the dataset contains data for multiple years):
   
   ![image](https://user-images.githubusercontent.com/33317356/101879303-cc62e380-3b99-11eb-9793-a82eb52575e8.png)
   
   If you have a row id, you can use that as the "Distribute Across" parameter. The plan is to add support for using the raw row data, but I probably won't have time to work on it any time soon.
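A rough sketch of why the row id workaround can help, assuming the chart groups by the "Distribute Across" column and applies the aggregate within each group (the `rows` data and the `aggregate` helper are hypothetical and do not reflect Superset's internals). When every row has a unique id, each group contains exactly one row, so the aggregate returns the raw value unchanged:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows with a unique row_id and a shared timestamp column.
rows = [
    {"row_id": 1, "ts": "2020-12-01", "value": 1.0},
    {"row_id": 2, "ts": "2020-12-01", "value": 9.0},
    {"row_id": 3, "ts": "2020-12-01", "value": 20.0},
    {"row_id": 4, "ts": "2020-12-02", "value": 2.0},
    {"row_id": 5, "ts": "2020-12-02", "value": 4.0},
    {"row_id": 6, "ts": "2020-12-03", "value": 3.0},
]


def aggregate(rows, key):
    """Group rows by `key`, average each group, and return the sorted results."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["value"])
    return sorted(mean(values) for values in groups.values())


raw = sorted(row["value"] for row in rows)
by_timestamp = aggregate(rows, "ts")   # collapses rows that share a timestamp
by_row_id = aggregate(rows, "row_id")  # one row per group, values unchanged

print("grouped by ts:    ", by_timestamp)
print("grouped by row_id:", by_row_id)
```

Grouping by `row_id` reproduces the raw distribution, whereas grouping by `ts` reduces six observations to three averages.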




[GitHub] [incubator-superset] villebro edited a comment on issue #12005: Boxplot - wrong logic

Posted by GitBox <gi...@apache.org>.
villebro edited a comment on issue #12005:
URL: https://github.com/apache/incubator-superset/issues/12005#issuecomment-743047029


   While we don't yet support calculating the boxplot from the full raw data, we recently migrated Boxplot to ECharts and added some features in this PR: #11199. In the example below I've created a Boxplot where the categories are continents and the distribution is calculated across countries (I'm using the average of total population, as the dataset contains data for multiple years):
   
   ![image](https://user-images.githubusercontent.com/33317356/101879303-cc62e380-3b99-11eb-9793-a82eb52575e8.png)
   
   If you have a row id, you can use that as the "Distribute Across" parameter. The plan is to add support for using the raw row data, but I probably won't have time to work on it any time soon.

