Posted to dev@spark.apache.org by Mich Talebzadeh <mi...@gmail.com> on 2022/02/23 18:46:19 UTC

Re: [Fork] ]RE: One click to run Spark on Kubernetes

Hi Janak,

Are you talking about EKS Fargate?
Thanks





   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Wed, 23 Feb 2022 at 17:47, Agarwal, Janak <ja...@amazon.com> wrote:

> [Reducing to thread participants to avoid spamming the entire community’s
> mailboxes]
>
>
>
> Sarath, Bo, Mich,
>
>
>
> Have you read about EMR on EKS <https://aws.amazon.com/emr/features/eks/>?
> We help customers run Spark workloads on EKS. Today, EMR on EKS supports
> running Spark workloads on your EKS cluster. You will need to set up the EKS
> cluster yourself. To achieve one-click, all you really need to do is set up
> the EKS cluster. As mentioned earlier, setting up an EKS cluster is fairly
> simple. We can help you do that if you like. Want to give EMR on EKS a
> spin as you decide your path forward?
>
> <Disclaimer: I’m the Product Manager for EMR on EKS>
>
>
>
> Best,
>
> Janak
>
>
>
> *From:* Sarath Annareddy <sa...@gmail.com>
> *Sent:* Wednesday, February 23, 2022 7:41 AM
> *To:* bo yang <bo...@gmail.com>
> *Cc:* Mich Talebzadeh <mi...@gmail.com>; Spark Dev List <
> dev@spark.apache.org>; user <us...@spark.apache.org>
> *Subject:* RE: [EXTERNAL] One click to run Spark on Kubernetes
>
>
>
> *CAUTION*: This email originated from outside of the organization. Do not
> click links or open attachments unless you can confirm the sender and know
> the content is safe.
>
>
>
> Hi bo
>
>
>
> I am interested in contributing.
>
> But I don’t have free access to any cloud provider, and I am not sure how I
> can get it. I know Google, AWS, and Azure only provide temporary free
> access, which may not be sufficient.
>
>
>
> Guidance is appreciated.
>
>
>
> Sarath
>
> Sent from my iPhone
>
>
>
> On Feb 23, 2022, at 2:01 AM, bo yang <bo...@gmail.com> wrote:
>
>
>
> Right, normally people start with a simple script, then add more to it,
> such as permissions and more components. After some time, people want to run
> the script consistently in different environments, and things become complex.
>
>
>
> That is why we want to see whether people are interested in such a "one
> click" tool to make things easy.
>
>
>
>
>
> On Tue, Feb 22, 2022 at 11:31 PM Mich Talebzadeh <
> mich.talebzadeh@gmail.com> wrote:
>
> Hi,
>
>
>
> There are two distinct actions here, namely Deploy and Run.
>
>
>
> Deployment can be done by a command-line script with autoscaling. In
> newer versions of Kubernetes you don't even need to specify the node
> types; you can leave it to the Kubernetes cluster to scale up and down and
> decide on the node type.
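
As a rough illustration of the scripted deployment described above, the sketch below builds (but does not run) a command line that creates a GKE Autopilot cluster, where the platform chooses and scales the nodes. The choice of GKE Autopilot is an assumption, since the paragraph above does not name a specific service, and the cluster name and region are placeholder values.

```python
import shlex
from typing import List


def autopilot_create_command(cluster_name: str, region: str) -> List[str]:
    """Build the argv for creating a GKE Autopilot cluster.

    Autopilot provisions and scales nodes itself, so no machine type
    or node count is passed here.
    """
    return [
        "gcloud", "container", "clusters", "create-auto",
        cluster_name,
        "--region", region,
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    print(shlex.join(autopilot_create_command("spark-demo", "europe-west2")))
```

Returning the command as a list keeps it easy to inspect or log before handing it to something like subprocess.run.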
>
>
>
> The second point is running the Spark job that you need to submit.
> However, that depends on setting up access permissions, using service
> accounts, and pulling the correct Docker images for the driver and the
> executors. Those details add to the complexity.
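
To make the moving parts of the run step concrete, here is a minimal sketch that assembles a spark-submit command for a Kubernetes master. The configuration keys are standard Spark-on-Kubernetes settings; the API server URL, image name, service account, namespace, and jar path are all hypothetical placeholders, and the function only builds the command without executing it.

```python
import shlex
from typing import List


def spark_submit_command(
    api_server: str,
    image: str,
    service_account: str,
    namespace: str,
    main_class: str,
    app_jar: str,
    executors: int = 2,
) -> List[str]:
    """Build the argv for a cluster-mode spark-submit against Kubernetes."""
    return [
        "spark-submit",
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
        "--class", main_class,
        # Namespace in which the driver and executor pods are created.
        "--conf", f"spark.kubernetes.namespace={namespace}",
        # Container image pulled for both the driver and the executors.
        "--conf", f"spark.kubernetes.container.image={image}",
        # Service account the driver uses to request executor pods.
        "--conf",
        f"spark.kubernetes.authenticate.driver.serviceAccountName={service_account}",
        "--conf", f"spark.executor.instances={executors}",
        app_jar,
    ]


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    cmd = spark_submit_command(
        api_server="https://203.0.113.10:443",
        image="example.com/spark:latest",
        service_account="spark",
        namespace="spark-jobs",
        main_class="org.apache.spark.examples.SparkPi",
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",
    )
    print(shlex.join(cmd))
```

The service account still needs permission to create pods in the chosen namespace, which is part of the access setup the paragraph above refers to.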
>
>
>
> Thanks
>
>
>
>
>
>
>
>
>
>
>
>
> On Wed, 23 Feb 2022 at 04:06, bo yang <bo...@gmail.com> wrote:
>
> Hi Spark Community,
>
>
>
> We built an open source tool to deploy and run Spark on Kubernetes with a
> one-click command. For example, on AWS, it can automatically create an
> EKS cluster, node group, NGINX ingress, and Spark Operator. You can then
> use curl or a CLI tool to submit Spark applications. After the
> deployment, you can also install Uber Remote Shuffle Service to enable
> Dynamic Allocation on Kubernetes.
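
For readers unfamiliar with the operator model, the sketch below builds a SparkApplication resource of the kind a Spark Operator typically consumes. It assumes the operator in question is the Kubernetes Operator for Apache Spark with its v1beta2 API; the message above does not say how the tool's own curl or CLI submission works, so that part is not shown. The names, image, jar path, and Spark version are placeholders.

```python
import json
from typing import Any, Dict


def spark_application_manifest(
    name: str,
    namespace: str,
    image: str,
    service_account: str,
    executors: int = 2,
) -> Dict[str, Any]:
    """Return a SparkApplication manifest as a plain dictionary."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Scala",
            "mode": "cluster",
            "image": image,
            "mainClass": "org.apache.spark.examples.SparkPi",
            # Placeholder path to a jar already present in the image.
            "mainApplicationFile": "local:///opt/spark/examples/jars/spark-examples.jar",
            "sparkVersion": "3.2.1",
            "driver": {
                "cores": 1,
                "memory": "1g",
                "serviceAccount": service_account,
            },
            "executor": {"cores": 1, "instances": executors, "memory": "1g"},
        },
    }


if __name__ == "__main__":
    manifest = spark_application_manifest(
        "spark-pi", "spark-jobs", "example.com/spark:latest", "spark"
    )
    print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied with kubectl apply -f, since kubectl accepts JSON as well as YAML.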
>
>
>
> Anyone interested in using or working together on such a tool?
>
>
>
> Thanks,
>
> Bo
>
>
>
>

RE: [Fork] ]RE: One click to run Spark on Kubernetes

Posted by "Agarwal, Janak" <ja...@amazon.com.INVALID>.
Mich,

I am not sure I follow, since I do not fully understand what GKE conventional is (at first glance, it appears to help customers set up a Kubernetes environment).
EMR on EKS offers a fully managed control plane (among other benefits, such as the Spark UI for completed jobs) that allows customers to focus on running Spark applications on their EKS cluster.

Thanks,
Janak

From: Mich Talebzadeh <mi...@gmail.com>
Sent: Wednesday, February 23, 2022 11:54 AM
To: Agarwal, Janak <ja...@amazon.com>
Cc: Spark dev list <de...@spark.apache.org>
Subject: RE: [EXTERNAL] [Fork] ]RE: One click to run Spark on Kubernetes



Thanks Janak, the same as GKE conventional or GKE Autopilot. <https://cloud.google.com/kubernetes-engine>

Putting conventional aside, why do you think customers should choose a fully managed package for Spark?

thanks




On Wed, 23 Feb 2022 at 19:00, Agarwal, Janak <ja...@amazon.com> wrote:
Hey Mich,

EMR on EKS <https://aws.amazon.com/emr/features/eks/> works on both EKS-Fargate and EKS-managed/self-managed EC2-based node groups.

Thanks,
Janak


Re: [Fork] ]RE: One click to run Spark on Kubernetes

Posted by Mich Talebzadeh <mi...@gmail.com>.
Thanks Janak, the same as GKE conventional or GKE Autopilot.
<https://cloud.google.com/kubernetes-engine>

Putting conventional aside, why do you think customers should choose a
fully managed package *for Spark*?

thanks







On Wed, 23 Feb 2022 at 19:00, Agarwal, Janak <ja...@amazon.com> wrote:

> Hey Mich,
>
>
>
> EMR on EKS <https://aws.amazon.com/emr/features/eks/> works on both
> EKS-Fargate and EKS-managed/self-managed EC2 based node groups.
>
>
>
> Thanks,
>
> Janak
>
>
>

RE: [Fork] ]RE: One click to run Spark on Kubernetes

Posted by "Agarwal, Janak" <ja...@amazon.com.INVALID>.
Hey Mich,

EMR on EKS <https://aws.amazon.com/emr/features/eks/> works on both EKS-Fargate and EKS-managed/self-managed EC2-based node groups.

Thanks,
Janak
