Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2022/07/29 17:20:09 UTC

[GitHub] [superset] mbcsa opened a new issue, #20919: Export to CSV with cached data is converted to Panda Dataframe without column formats

mbcsa opened a new issue, #20919:
URL: https://github.com/apache/superset/issues/20919

   When cached data is exported to CSV, it is converted to a pandas DataFrame without column type information.
   This breaks the CSV_EXPORT option 'decimal' (decimal separator).
   
   The first time the data is exported, everything works and the decimal separator is respected.
   
   But when the query is executed again within the cache timeout, the data is read from "results_backend" and a DataFrame is created on the fly, so the DataFrame has no column type specifications.
   
   This worked until commit e1fd90697c1ed4f72e7982629779783ad9736a47 was merged:
   https://github.com/apache/superset/pull/20760
   
   In file superset/core.py, the line dtype=object sets the "object" type for all columns of the DataFrame.
   
   ```
   def csv(...):
       ...
       df = pd.DataFrame(
           data=obj["data"],
           dtype=object,
           columns=[c["name"] for c in obj["columns"]],
       )
       ...
   ```
   With the line "dtype=object," removed, the CSV export works correctly:
   ```
   def csv(...):
       ...
       df = pd.DataFrame(
           data=obj["data"],
           columns=[c["name"] for c in obj["columns"]],
       )
       ...
   ```
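
The effect can be reproduced outside Superset. The following is a minimal sketch, not Superset's code: the sample rows, the column names, and the `export_csv` helper are made up for illustration, and the `sep` / `decimal` values mirror the CSV_EXPORT settings used in this report. It builds the same data with and without `dtype=object` and writes both with `DataFrame.to_csv`.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a cached query result.
rows = [(1, 1.5), (2, 2.25)]
columns = ["id", "price"]


def export_csv(df: pd.DataFrame) -> str:
    """Write a DataFrame to CSV text using ';' as field separator and ',' as decimal mark."""
    buf = io.StringIO()
    df.to_csv(buf, index=False, sep=";", decimal=",")
    return buf.getvalue()


# Same construction as in the snippet above: every column is forced to "object".
df_object = pd.DataFrame(data=rows, dtype=object, columns=columns)

# Without dtype=object, pandas infers int64 / float64 column types.
df_inferred = pd.DataFrame(data=rows, columns=columns)

csv_object = export_csv(df_object)
csv_inferred = export_csv(df_inferred)

print(df_object.dtypes)
print(csv_object)
print(df_inferred.dtypes)
print(csv_inferred)
```

The `decimal` argument of `to_csv` is applied to float-typed columns only, so the object-typed frame keeps "." in its output while the inferred frame uses ",".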
   
   #### How to reproduce the bug
   
   1. In superset_config.py configure CSV_EXPORT options:
   ```
   CSV_EXPORT = {
       'encoding': 'utf-8',
       'sep': ';',
       'decimal': ',',
   }
   ```
   2. Go to /superset/sqllab/
   3. Create a NEW SQL query that has a Decimal / Float / Real column.
   For example:
   ![image](https://user-images.githubusercontent.com/92950610/181808295-b7d3b433-857e-4307-ab58-70b87d1099be.png)
   4. Export the results to CSV
   5. Note that the exported CSV file is correctly formed with the configured decimal separator ","
   6. Run the SAME SQL again by pressing the Run button
   7. Export the results to CSV AGAIN
   8. Note that the exported CSV file uses a point "." as the decimal separator instead of ",".
   
   ### Expected results
   
   Export to CSV should use the configured decimal separator whether or not the data comes from the cache.
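
One way to get this behavior while keeping `dtype=object` at construction time might be to re-infer column types just before writing. This is only a sketch under assumptions: the `cached` dictionary is a made-up stand-in for the payload read from "results_backend", and calling `infer_objects()` is an illustrative workaround, not the change made in Superset.

```python
import io

import pandas as pd

# Hypothetical stand-in for the cached payload (same "data" / "columns" shape as above).
cached = {
    "data": [(1, 1.5), (2, 2.25)],
    "columns": [{"name": "id"}, {"name": "price"}],
}

# Build the frame the way the snippet in this report does.
df = pd.DataFrame(
    data=cached["data"],
    dtype=object,
    columns=[c["name"] for c in cached["columns"]],
)

# Ask pandas to infer better dtypes for the object columns (float64 for "price").
df_typed = df.infer_objects()

buf = io.StringIO()
df_typed.to_csv(buf, index=False, sep=";", decimal=",")
csv_text = buf.getvalue()
print(csv_text)
```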
   
   ### Actual results
   
   OK - Export to CSV uses the configured decimal separator when the data comes directly from the DB, without caching.
   FAIL - Export to CSV does NOT use the configured decimal separator when the data comes from the "results_backend" CACHE.
   
   #### Screenshots
   
   Pandas Dataframe when cached
   
   ![image](https://user-images.githubusercontent.com/92950610/181809289-c527f913-0758-4aaf-b25a-306c58460fd9.png)
   
   Pandas Dataframe when NOT cached
   
   ![image](https://user-images.githubusercontent.com/92950610/181809345-4bf545a9-fb64-41d0-9f2b-f7c1ed2d0a59.png)
   
   
   ### Environment
   - browser type and version: Chrome / Brave / Firefox
   - superset version: Docker `Superset 0.0.0dev`
   - docker build rusackas Thu Jul 28 17:36:00 UTC 2022
   - any feature flags active:
       "ALERT_REPORTS": True,
       "ENABLE_TEMPLATE_PROCESSING": True
   
   ### Checklist
   
   Make sure to follow these steps before submitting your issue - thank you!
   
   - [x] I have checked the superset logs for python stacktraces and included it here as text if there are any.
   - [x] I have reproduced the issue with at least the latest released version of superset.
   - [x] I have checked the issue tracker for the same issue and I haven't found one similar.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


Re: [I] Export to CSV with cached data is converted to Panda Dataframe without column formats [superset]

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas closed issue #20919: Export to CSV with cached data is converted to Panda Dataframe without column formats
URL: https://github.com/apache/superset/issues/20919




Re: [I] Export to CSV with cached data is converted to Panda Dataframe without column formats [superset]

Posted by "rusackas (via GitHub)" <gi...@apache.org>.
rusackas commented on issue #20919:
URL: https://github.com/apache/superset/issues/20919#issuecomment-1936428670

   I'm guessing the linked PR should have closed this. If this needs to be reopened, say the word!

