Posted to issues@flink.apache.org by "Stephan Brosinski (JIRA)" <ji...@apache.org> on 2016/06/21 10:12:57 UTC

[jira] [Commented] (FLINK-1337) Create an Amazon EMR Bootstrap Action

    [ https://issues.apache.org/jira/browse/FLINK-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15341499#comment-15341499 ] 

Stephan Brosinski commented on FLINK-1337:
------------------------------------------

I'm currently doing the following:

- Upload the Flink distributions I want to bootstrap to {{s3://<my-bucket>/flink-dist}}
- Create an EMR cluster with the AWS CLI and the following parameter: {{--bootstrap-actions Path=s3://<my-bucket>/bootstrap-flink.sh,Args=1.0.3}}
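
For illustration, the second step could be wrapped roughly as follows. This is only a sketch: the cluster name, release label, instance type and instance count are placeholder assumptions, not values from my setup, and the function names are made up. The plural {{--bootstrap-actions}} option and the bracketed {{Args=[...]}} list form are what the AWS CLI documents for {{aws emr create-cluster}}.

```shell
#!/usr/bin/env bash

# Build the value passed to --bootstrap-actions.
# $1 = S3 bucket name, $2 = Flink version handed to bootstrap-flink.sh
bootstrap_action_arg() {
  local bucket=$1 flink_version=$2
  printf 'Path=s3://%s/bootstrap-flink.sh,Args=[%s]' "$bucket" "$flink_version"
}

# Create the cluster. Everything except --bootstrap-actions is a
# placeholder assumption; adjust to your account and region.
create_flink_cluster() {
  local bucket=$1 flink_version=$2
  aws emr create-cluster \
    --name "flink-cluster" \
    --release-label emr-4.7.1 \
    --instance-type m3.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --bootstrap-actions "$(bootstrap_action_arg "$bucket" "$flink_version")"
}
```

Calling {{create_flink_cluster <my-bucket> 1.0.3}} would then start a cluster that runs the bootstrap script with Flink 1.0.3.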

{{bootstrap-flink.sh}} just pulls the Flink distribution from my S3 bucket, untars it to the EMR hadoop user's home directory and symlinks it to {{./flink}} so you don't have to worry about the exact Flink version when submitting jobs. It looks like this:

{code}
    #!/usr/bin/env bash

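    # Flink version is passed as the first bootstrap action argument, e.g. 1.0.3
    FLINK_VERSION=$1
    HADOOP_VERSION=hadoop27
    SCALA_VERSION=scala_2.11
    FLINK_DIST="flink-$FLINK_VERSION-bin-$HADOOP_VERSION-$SCALA_VERSION.tgz"
    FLINK_DIST_URL=s3://<my-bucket>/flink-dist

    # Download the tarball, unpack it and point ./flink at the unpacked directory
    aws s3 cp "$FLINK_DIST_URL/$FLINK_DIST" . && tar xzf "$FLINK_DIST" && ln -s "flink-$FLINK_VERSION" flink

    exit 0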
{code}

This could obviously be better, but it works.
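
One possible way to tighten it up, as a sketch only: fail fast on errors so a broken download does not go unnoticed, unpack into {{$HOME}} explicitly, and replace an existing symlink instead of failing on it. The function names {{flink_dist_name}} and {{install_flink}} are made up for this example, and it still assumes the tarball unpacks to {{flink-<version>}}.

```shell
#!/usr/bin/env bash
# Abort on errors, unset variables and failed pipeline stages.
set -euo pipefail

HADOOP_VERSION=hadoop27
SCALA_VERSION=scala_2.11

# Compose the tarball name for a given Flink version.
flink_dist_name() {
  printf 'flink-%s-bin-%s-%s.tgz' "$1" "$HADOOP_VERSION" "$SCALA_VERSION"
}

# Download, unpack and symlink one Flink distribution.
# $1 = Flink version, $2 = S3 prefix that holds the tarballs
install_flink() {
  local version=$1 dist_url=$2 dist
  dist=$(flink_dist_name "$version")
  aws s3 cp "$dist_url/$dist" "$HOME/$dist"
  tar xzf "$HOME/$dist" -C "$HOME"
  # -f replaces an existing link, -n avoids descending into the old target
  ln -sfn "$HOME/flink-$version" "$HOME/flink"
  rm -f "$HOME/$dist"
}
```

The bootstrap script would then end with a call such as {{install_flink "$1" "s3://<my-bucket>/flink-dist"}}.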
The next issue is that EMR's step API has no support for running {{flink run}}, so I need to SSH to the master node to submit jobs.
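
The manual SSH step could be scripted along these lines. Again a sketch under assumptions: the key file, master hostname, jar path and container count are placeholders, the helper names are made up, and {{-m yarn-cluster -yn <n>}} is the per-job YARN mode of the Flink 1.0 command line, which may not be what you want if you keep a long-running YARN session instead.

```shell
#!/usr/bin/env bash

# Build the command line executed on the master node.
# $1 = path to the job jar on the master, $2 = number of YARN containers
flink_run_cmd() {
  local jar=$1 containers=$2
  printf './flink/bin/flink run -m yarn-cluster -yn %s %s' "$containers" "$jar"
}

# Submit a job over SSH as the hadoop user.
# $1 = private key file, $2 = master public DNS name, $3 = jar path, $4 = containers
submit_job() {
  local key=$1 master=$2 jar=$3 containers=$4
  ssh -i "$key" "hadoop@$master" "$(flink_run_cmd "$jar" "$containers")"
}
```

The relative {{./flink}} path works because SSH starts in the hadoop user's home directory, which is where the bootstrap action created the symlink. The jar path is passed unquoted, so it should not contain spaces.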


> Create an Amazon EMR Bootstrap Action
> -------------------------------------
>
>                 Key: FLINK-1337
>                 URL: https://issues.apache.org/jira/browse/FLINK-1337
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Stephan Ewen
>            Assignee: Timur Fayruzov
>            Priority: Minor
>
> EMR offers bootstrap actions that prepare the cluster by installing additional components, etc.
> We can offer a Flink bootstrap action that downloads, unpacks, and configures Flink. It may optionally install libraries that we like to use (such as Python, BLAS/JBLAS, ...)
> http://blogs.aws.amazon.com/bigdata/post/TxO6EHTHQALSIB/Getting-Started-with-Amazon-EMR-Bootstrap-Actions



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)