Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/01 18:11:18 UTC

[GitHub] [airflow] potiuk commented on pull request #12685: Production images on CI are now built from packages

potiuk commented on pull request #12685:
URL: https://github.com/apache/airflow/pull/12685#issuecomment-736726882


   @ashb -> I think you wanted this to happen, and it's high time the prod image for tests (next step: the Dockerhub one) was built from packages.
   
   I reviewed the list of extras installed by default and added a few, especially those that do not require any special dependencies (like http/ftp). They worked previously, but they would not work after the change to packages (only selected provider packages get installed!).
   
   This is the list I came up with after the review. I based it on how "niche" particular providers are (but this is purely my perception, so I might be wrong).
   
   Included: "async,amazon,celery,cncf.kubernetes,docker,dask,elasticsearch,ftp,grpc,hashicorp,http,google,microsoft.azure,mysql,postgres,redis,sendgrid,sftp,slack,ssh,statsd,virtualenv"
   
   The recommendation is that people build their own images, choose the extras they want, and add the extra packages they need (and our image fully supports customization). But there will be a group of people who rely on those images for their own production usage, so I think it's the right time to standardize it. We certainly do not want to install all providers, but I do not think we have any clear guidelines on what should be in the reference image, so I have to resort to the wisdom of the crowd. Once we agree on a proposal here, I will send it to the devlist for a vote.
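
For anyone choosing their own extras, here is a minimal Python sketch of turning a list of extras into a pip requirement string. The helper name `airflow_requirement` and its `add`/`remove` parameters are hypothetical and exist only for illustration; the extras string is the "Included" list from this comment, and the only assumption about tooling is the standard `package[extra1,extra2]` requirement syntax.

```python
# Extras listed as "Included" in this comment.
INCLUDED_EXTRAS = (
    "async,amazon,celery,cncf.kubernetes,docker,dask,elasticsearch,ftp,grpc,"
    "hashicorp,http,google,microsoft.azure,mysql,postgres,redis,sendgrid,"
    "sftp,slack,ssh,statsd,virtualenv"
)


def airflow_requirement(extras, add=(), remove=()):
    """Hypothetical helper: build an 'apache-airflow[...]' requirement string.

    extras: comma-separated extras to start from
    add:    extras to include on top of the starting list
    remove: extras to drop from the starting list
    """
    removed = set(remove)
    # Split the comma-separated string, ignoring empty entries.
    selected = [e.strip() for e in extras.split(",") if e.strip()]
    selected = [e for e in selected if e not in removed]
    # Append additions without introducing duplicates.
    for extra in add:
        if extra not in selected:
            selected.append(extra)
    # Sort so the result does not depend on input order.
    return "apache-airflow[" + ",".join(sorted(selected)) + "]"


# Example: start from the proposed list, add one extra and drop another.
print(airflow_requirement(INCLUDED_EXTRAS, add=["snowflake"], remove=["dask"]))
```

The resulting string can be passed to `pip install` (quoted, so the shell does not interpret the brackets).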
   
   It may be clearer to list those that are left out:
   Excluded:
   * all apache providers: cassandra, druid, hdfs, hive, kylin, livy, pig, pinot, spark, sqoop
   * cloudant
   * databricks
   * datadog
   * dingding
   * discord
   * exasol
   * facebook
   * imap
   * jdbc
   * jenkins
   * jira
   * microsoft.mssql
   * microsoft.winrm
   * mongo
   * odbc
   * openfaas
   * oracle
   * pagerduty
   * papermill
   * plexus
   * presto
   * qubole
   * salesforce
   * samba
   * segment
   * singularity
   * snowflake
   * sqlite
   * vertica
   * yandex
   * zendesk
   
   @ashb, @kaxil, @turbaszek, @mik-laj, @XD-DENG, @feluelle, @eladkal, @ryw, @vikramkoka, @KevinYang21 (also others) -> can you please take a look and comment on whether you think this list is "good" for our reference image? Any proposals to move providers between "excluded" and "included"?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org