Posted to user@beam.apache.org by Dan Hill <qu...@gmail.com> on 2020/06/10 18:14:10 UTC

Re: How to safely update jobs in-flight using Apache Beam on AWS EMR?

Hi!  I found great docs about updating in-flight Apache Beam pipelines on
Dataflow (which makes sense).  I was not able to find the equivalent for
AWS EMR.

https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline

https://medium.com/google-cloud/restarting-cloud-dataflow-in-flight-9c688c49adfd


Posted by Dan Hill <qu...@gmail.com>.
Sweet.  I have not seen that video.  Cool.  I'm curious about how well
AWS's managed services (like the Kinesis Data Analytics managed Flink
runner) handle the updates.  I'd guess an update is a best-effort restore
from the saved state (if enabled).  If this is all delegated by Beam to
Flink, then this is more of a question for AWS.





Posted by Austin Bennett <wh...@gmail.com>.
Hi Dan,

AWS EMR generally runs Flink and/or Spark, both of which are supported Beam
runners.  For EMR, you might want to check which versions of Beam and Flink
are compatible and can run there, and the status of Beam pipelines using
either of those runners.

On running Beam in AWS, have you seen:
https://www.youtube.com/watch?v=eCgZRJqdt_I



Cheers,
Austin



Posted by Dan Hill <qu...@gmail.com>.
No.  I just sent AWS Support a message.



Posted by Luke Cwik <lc...@google.com>.
The runner needs to support it, and I'm not aware of an EMR runner for
Apache Beam, let alone one that supports pipeline update. Have you tried
reaching out to AWS?
