Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/17 13:42:56 UTC

[GitHub] [airflow] potiuk commented on pull request #12505: Fix S3ToSnowflakeOperator to support uploading all files in the specified stage

potiuk commented on pull request #12505:
URL: https://github.com/apache/airflow/pull/12505#issuecomment-761814656


   @dstandish I think for now we can merge it as it is. We are going to release the snowflake provider soon, so I think having just s3 for now is ok. We can always extract the common "Copy" code later, but I also have a feeling that S3ToSnowflake/GCSToSnowflake etc. should remain separate operators (at the very least, the init parameters will be different for each of them). We have a generic "transfer" operator in the works to handle "any-to-any" transfers, and there is already a design doc in place for it:
   
   https://lists.apache.org/thread.html/rc888a329f1c49622c0123c2ddbcfcc107eead020b774f8a8fab6d7f1%40%3Cdev.airflow.apache.org%3E  
   
   This should allow implementing any "storage" and "sql" transfer by just providing the right hook, and it is the "ultimate" generalisation. But even that generalisation assumes that we will continue to have specialized "XtoY" operators in many cases, because those specialized solutions have specific requirements or capabilities that allow for optimisations (for example parallel transfer of many files) or custom behaviors (for example generating a BigQuery schema for ToGCS transfers).
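   
   To illustrate the "just provide the right hook" idea, here is a rough sketch. It is not the design from the doc linked above and it does not use any Airflow API: `SourceHook`, `DestinationHook`, `InMemoryStore` and `generic_transfer` are all hypothetical names made up for this example.

```python
from typing import Dict, Iterable, List, Optional, Protocol


class SourceHook(Protocol):
    """Hypothetical interface: anything that can list and read objects."""

    def list_keys(self, prefix: str) -> Iterable[str]: ...

    def read(self, key: str) -> bytes: ...


class DestinationHook(Protocol):
    """Hypothetical interface: anything that can store an object."""

    def write(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Toy stand-in for a real hook; it satisfies both interfaces above."""

    def __init__(self, objects: Optional[Dict[str, bytes]] = None) -> None:
        self.objects: Dict[str, bytes] = dict(objects or {})

    def list_keys(self, prefix: str) -> List[str]:
        return sorted(k for k in self.objects if k.startswith(prefix))

    def read(self, key: str) -> bytes:
        return self.objects[key]

    def write(self, key: str, data: bytes) -> None:
        self.objects[key] = data


def generic_transfer(
    source: SourceHook, destination: DestinationHook, prefix: str = ""
) -> List[str]:
    """Copy every object under `prefix`, one at a time, and return the keys copied."""
    copied = []
    for key in source.list_keys(prefix):
        destination.write(key, source.read(key))
        copied.append(key)
    return copied


src = InMemoryStore({"stage/a.csv": b"1", "stage/b.csv": b"2", "other/c.csv": b"3"})
dst = InMemoryStore()
copied_keys = generic_transfer(src, dst, prefix="stage/")
```

   The loop copies objects one by one through the generic interface. A specialized "XtoY" operator can skip that and use a capability of the target system directly, which is where the optimisations mentioned above would come from.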
   
   I think it's best to add the s3 -> snowflake fix and think about generalizations independently.
   
   @kurtqq @sekikn -> can you please rebase this one onto the latest master? Snowflake finally released a version of the python connector that does not break other providers, and on Friday I merged the change that incorporates it and fixes some other dependencies, so it's worth testing whether it works. I plan to release a set of providers soon and I would love this one to be merged!
   

