Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/23 07:10:18 UTC

[GitHub] [airflow] wanderijames edited a comment on pull request #17178: Amazon EMR on Amazon EKS

wanderijames edited a comment on pull request #17178:
URL: https://github.com/apache/airflow/pull/17178#issuecomment-885423290


   > We currently have one operator that allows us to run Spark jobs on Kubernetes: [SparkKubernetesOperator](https://github.com/apache/airflow/blob/d72b363929c86eb03fc9583002459bd10bc7eaeb/airflow/providers/cncf/kubernetes/operators/spark_kubernetes.py#L24). It works with both EKS and GCP as well as any other Kubernetes platform. Why would anyone use this operator instead of the generic operator for Kubernetes?
   
   Hey @mik-laj, I am aware of the Apache Spark and Livy operators, as well as the EMR operator. However, EMR on EKS works differently: EMR launches a virtual cluster in your EKS cluster. The pods it launches (Spark master and executors) are ephemeral and exist only once a job run has been started. For more information, see https://aws.amazon.com/emr/features/eks/
   
   In addition, SparkKubernetesOperator is only suitable if a Spark cluster has already been set up in Kubernetes. That is not the case here.
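
To make the difference concrete, here is a rough sketch of what starting a job on an EMR virtual cluster looks like with boto3's `emr-containers` client. It is an illustration only, not the code in this PR: the helper names `build_job_run_request` and `start_job_run`, the default job name, the release label, and the region are all placeholders I am assuming for the example.

```python
import uuid


def build_job_run_request(virtual_cluster_id, execution_role_arn, entry_point,
                          release_label="emr-6.3.0-latest", name="example-job"):
    """Build keyword arguments for the EMR on EKS StartJobRun API.

    Hypothetical helper: all argument values are supplied by the caller,
    and the defaults are placeholders.
    """
    return {
        "name": name,
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            # The Spark pods are created only for this job run.
            "sparkSubmitJobDriver": {"entryPoint": entry_point},
        },
        # Idempotency token so a retried request does not start a second run.
        "clientToken": str(uuid.uuid4()),
    }


def start_job_run(request, region_name="us-east-1"):
    """Submit the request and return the job run id (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("emr-containers", region_name=region_name)
    response = client.start_job_run(**request)
    return response["id"]
```

Nothing here talks to a pre-existing Spark cluster or to the Kubernetes API directly; the request only names a virtual cluster, and EMR creates the pods for the duration of the run.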

