Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/03/13 23:21:24 UTC

[GitHub] [incubator-superset] john-bodley edited a comment on issue #4905: [missing values] Reverting replacing missing values with zeros

john-bodley edited a comment on issue #4905: [missing values] Reverting replacing missing values with zeros
URL: https://github.com/apache/incubator-superset/pull/4905#issuecomment-472598963
 
 
   Here is an update using a toy time-series data set which contains missing values and has the following specific features:
   
   1. There are missing values at the _beginning_ of the time-series.
   2. There are missing values at the _end_ of the time-series.
   3. There are missing values resulting in a single _siloed_ data point as well as _adjacent_ data points.
   
   For context the data set is defined as:
   
   ![sql-lab-preview](https://user-images.githubusercontent.com/4567245/54312685-c3904880-4594-11e9-884a-3618038ec9ee.png)
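
   For readers without access to the screenshot, a series with the same three features can be sketched in pandas. The values and dates below are hypothetical and are not the ones in the image; `fillna(0)` stands in for the current zero-filling behaviour, which is an assumption about how the existing code fills the gaps.

```python
import numpy as np
import pandas as pd

# Hypothetical values; the real data set is the one shown in the screenshot.
ts = pd.Series(
    [np.nan, 1.0, 2.0, np.nan, 3.0, np.nan, 4.0, 5.0, np.nan],
    index=pd.date_range("2019-01-01", periods=9, freq="D"),
)

# Zero-filling hides the gaps: every missing day becomes a real-looking 0.
zero_filled = ts.fillna(0)

# A "siloed" point is a value whose neighbours on both sides are missing.
siloed = ts[ts.notna() & ts.shift(1).isna() & ts.shift(-1).isna()]

print(zero_filled)
print(siloed)
```

   Leaving the `NaN` values in place is what lets a chart draw gaps instead of dips to zero.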
   
   Below are examples of the current and proposed charts. Additionally, I tested other chart types which could potentially have been impacted by this change.
   
   **Line Chart (current)**
   
   ![line-chart-current](https://user-images.githubusercontent.com/4567245/54312343-f2f28580-4593-11e9-96be-5699875fa048.png)
   
   **Line Chart (proposed)**
   
   ![line-chart-proposed](https://user-images.githubusercontent.com/4567245/54312344-f2f28580-4593-11e9-92cf-6eeeda46837f.png)
   
   Note that there is an issue with the NVD3 line chart where the siloed data point renders only when one hovers over it. This issue will be fixed with the `data-ui` toolkit.
   
   **Pie Chart (current)**
   
   ![pie-chart-current](https://user-images.githubusercontent.com/4567245/54312324-e706c380-4593-11e9-87b8-014517db40f4.png)
   
   **Pie Chart (proposed)**
   
   ![pie-chart-proposed](https://user-images.githubusercontent.com/4567245/54312325-e706c380-4593-11e9-8b12-1d9ec8ecd158.png)
   
   Note there is no change here, though the code has been changed to ensure that the NULL values are not dropped in the pivot-table. Internally there was some debate whether this was the right approach, but I sense i) it ensures consistency with other charts, and ii) it accurately represents the underlying result set, i.e., it informs the user that a dimension exists even though its value is either zero or undefined.
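
   As a rough illustration of what "not dropped" means, pandas' `pivot_table` has a `dropna` flag that controls whether all-`NaN` entries survive. Whether this PR relies on that flag is an assumption; the frame, the `dim`/`metric` column names, and the `sum_or_null` helper are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical result set: dimension "c" exists but has only NULL metrics.
df = pd.DataFrame(
    {
        "dim": ["a", "a", "b", "b", "c", "c"],
        "metric": [1.0, 2.0, np.nan, 3.0, np.nan, np.nan],
    }
)


def sum_or_null(values: pd.Series) -> float:
    """Sum that yields NaN when every input is NaN."""
    return values.sum(min_count=1)


# Default dropna=True: the all-NaN row for "c" is removed from the result.
dropped = df.pivot_table(index="dim", values="metric", aggfunc=sum_or_null)

# dropna=False: "c" is kept, so the user can see that the dimension exists.
kept = df.pivot_table(
    index="dim", values="metric", aggfunc=sum_or_null, dropna=False
)

print(dropped)
print(kept)
```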
   
   **Bar Chart (current)**
   
   ![bar-chart-current](https://user-images.githubusercontent.com/4567245/54313274-31893f80-4596-11e9-8ecc-f1b074e526be.png)
   
   **Bar Chart (proposed)**
   
   ![bar-chart-proposed](https://user-images.githubusercontent.com/4567245/54313275-3221d600-4596-11e9-9854-ffbb39b10b00.png)
   
   Note this remains unchanged like the Pie Chart and ensures consistency in terms of defining either zero or undefined values.
   
   **Pivot Table (current)**
   
   ![pivot-table-current](https://user-images.githubusercontent.com/4567245/54312326-e706c380-4593-11e9-9b22-6f6c362fa575.png)
   
   **Pivot Table (proposed)**
   
   ![pivot-table-proposed](https://user-images.githubusercontent.com/4567245/54312380-11588100-4594-11e9-839a-dd2d9bf55c22.png)
   
   Note there seems to be an issue with any value other than an empty string being rendered as `NaN`. The current code uses an empty string as the `na_rep` for [`pandas.DataFrame.to_html`](https://pandas.pydata.org/pandas-docs/version/0.23.3/generated/pandas.DataFrame.to_html.html) which is inconsistent with the Table visualization type. Additionally I explicitly made Pandas behave like SQL where the summation of only NULLs returns NULL.
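
   The SQL-like summation can be sketched with the `min_count` argument of pandas' `sum`. This is a minimal illustration with a hypothetical frame, not necessarily the exact change in this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: group "b" contains only NULL (NaN) values.
df = pd.DataFrame(
    {"group": ["a", "a", "b", "b"], "value": [1.0, np.nan, np.nan, np.nan]}
)

# Default pandas behaviour: NaN is skipped, so an all-NaN group sums to 0.0.
default_sum = df.groupby("group")["value"].sum()

# min_count=1 requires at least one non-NaN value; otherwise the result is
# NaN, which mirrors SQL where SUM over only NULLs returns NULL.
sql_like_sum = df.groupby("group")["value"].sum(min_count=1)

print(default_sum)
print(sql_like_sum)
```

   Groups with at least one real value are unaffected; only the all-NULL group changes from `0.0` to `NaN`.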
   
   to: @kristw @michellethomas @mistercrunch @williaster 
   
