Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/21 17:39:07 UTC

[GitHub] [airflow] harryplumer opened a new issue, #26567: Changes to SqlToS3Operator Breaking CSV formats

harryplumer opened a new issue, #26567:
URL: https://github.com/apache/airflow/issues/26567

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   `apache-airflow-providers-amazon==5.1.0`
   
   ### Apache Airflow version
   
   2.3.4
   
   ### Operating System
   
   Linux
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   After https://github.com/apache/airflow/pull/25083 was merged, null string values started appearing as `"None"` in the exported file when CSV is the output format. This causes unintended behavior in most use cases that read the CSV, including loading it into databases.
   
   Some databases, such as Snowflake, support options like `NULL_IF` on import. However, there are cases where the literal string "None" is a legitimate field value, and at that point there is no way to distinguish it from a null.
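
The ambiguity can be illustrated with a minimal standard-library sketch. This is not the operator's code: it only assumes that null values get converted to strings before the CSV is written, and the `to_csv` helper is a hypothetical name used for illustration.

```python
import csv
import io


def to_csv(rows):
    """Hypothetical helper: serialize rows to CSV text with the stdlib writer."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


rows = [("1", "alice"), ("2", None)]

# Nulls left as None: the csv module writes them as empty fields.
before = to_csv(rows)

# Every value stringified first (assumed to mirror the reported behavior):
# None becomes the four-character string "None".
after = to_csv([tuple(str(v) for v in row) for row in rows])

# A row that legitimately contains the text "None" now serializes
# identically to a row that contained a null.
literal_none = to_csv([("2", "None")])
stringified_null = to_csv([tuple(str(v) for v in ("2", None))])
```

Here `before` ends with an empty field for the null, `after` ends with `None`, and `literal_none == stringified_null`, which is why an import-side option such as `NULL_IF` cannot recover the original nulls.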
   
   Before:
   
   ![Screen Shot 2022-09-21 at 11 36 00 AM](https://user-images.githubusercontent.com/30101670/191572950-f2abed8b-55bf-43f8-b166-acf81cb52f06.png)
   
   After:
   
   ![Screen Shot 2022-09-21 at 11 35 52 AM](https://user-images.githubusercontent.com/30101670/191572967-bc61f563-b92b-4678-b22e-befa5511cca8.png)
   
   
   ### What you think should happen instead
   
   The strings should be empty, as they were previously. I understand why the recent PR made this change for parquet, and I propose adding a condition to line 138 of the `sql_to_s3.py` file so that it applies only when the chosen output format is parquet.
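
A rough sketch of the proposed guard, under stated assumptions: `normalize_values` and its `file_format` argument are hypothetical names for illustration and do not reproduce the actual contents of `sql_to_s3.py`. The point is only that the string conversion would be skipped unless the output format is parquet.

```python
def normalize_values(rows, file_format):
    """Hypothetical helper: stringify values only for parquet output.

    For any other format (including CSV) the rows pass through unchanged,
    so nulls stay as None and are written as empty fields.
    """
    if file_format.lower() != "parquet":
        return [tuple(row) for row in rows]
    return [tuple(str(v) for v in row) for row in rows]


csv_rows = normalize_values([("1", None)], "csv")
parquet_rows = normalize_values([("1", None)], "parquet")
```

With this guard, `csv_rows` keeps the null as `None`, while `parquet_rows` still receives the string conversion that the earlier PR introduced.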
   
   ### How to reproduce
   
   Run the `SqlToS3Operator` with the default output format of `CSV` on any query that selects a nullable string column. Look at the resulting CSV in S3.
   
   ### Anything else
   
   This happens every time we select a nullable column with the `SqlToS3Operator`.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] josh-fell commented on issue #26567: Changes to SqlToS3Operator Breaking CSV formats

Posted by GitBox <gi...@apache.org>.
josh-fell commented on issue #26567:
URL: https://github.com/apache/airflow/issues/26567#issuecomment-1254046894

   Feel free to open a PR!




[GitHub] [airflow] potiuk closed issue #26567: Changes to SqlToS3Operator Breaking CSV formats

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #26567: Changes to SqlToS3Operator Breaking CSV formats
URL: https://github.com/apache/airflow/issues/26567

