Posted to user@flink.apache.org by Dustin Jenkins <dj...@gmail.com> on 2017/09/27 16:41:19 UTC

Programmatic configuration

Hello,

I’m running a single Flink Job Manager with a Task Manager in Docker containers with Java 8. They are remotely located (flink.example.com).

I’m submitting a job from my desktop and passing it to the Job Manager with -m flink.example.com:6123, which seems to work well. I’m doing a search on an S3 system located at s3.example.com.

The problem is that in order to access the S3 system, the $HADOOP_CONFIG/core-site.xml and $FLINK_HOME/flink-conf.yaml need to be configured for it at the Job Manager and Task Manager level, which ties them to that particular endpoint (including my access key and secret key). Is there some way I can specify the configuration only in my application, so my Flink cluster can stay mostly generic?

Thank you!
Dustin


Re: Programmatic configuration

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Dustin,

Are you using S3 for a Flink source / sink / streaming state backend? Or is it simply used in one of your operators?

I’m assuming the latter since you mentioned “doing a search on an S3 system”. For this, I think it would make sense to simply pass the job-specific S3 endpoint / credentials as a program argument.
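As a rough sketch of that approach (the class name, argument keys, bucket name, and the use of the AWS SDK for Java v1 here are all assumptions for illustration, not anything prescribed by Flink): the endpoint and credentials are read from the program arguments with ParameterTool and handed to a rich function, which builds its own S3 client in open(), so nothing S3-specific has to live in flink-conf.yaml or core-site.xml on the cluster.

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.api.java.utils.ParameterTool;
    import org.apache.flink.configuration.Configuration;

    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.client.builder.AwsClientBuilder;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;

    public class S3LookupFunction extends RichMapFunction<String, String> {

        private final String endpoint;
        private final String region;
        private final String accessKey;
        private final String secretKey;
        private transient AmazonS3 s3;

        public S3LookupFunction(ParameterTool params) {
            // All S3 settings arrive with the job submission, e.g.
            //   flink run -m flink.example.com:6123 job.jar \
            //     --s3.endpoint https://s3.example.com --s3.access-key ... --s3.secret-key ...
            this.endpoint  = params.getRequired("s3.endpoint");
            this.region    = params.get("s3.region", "us-east-1");
            this.accessKey = params.getRequired("s3.access-key");
            this.secretKey = params.getRequired("s3.secret-key");
        }

        @Override
        public void open(Configuration parameters) {
            // Build the client on each task manager at runtime, so the cluster's
            // flink-conf.yaml / core-site.xml stay free of S3 settings.
            s3 = AmazonS3ClientBuilder.standard()
                    .withEndpointConfiguration(
                            new AwsClientBuilder.EndpointConfiguration(endpoint, region))
                    .withCredentials(new AWSStaticCredentialsProvider(
                            new BasicAWSCredentials(accessKey, secretKey)))
                    .withPathStyleAccessEnabled(true)  // often needed for non-AWS S3 endpoints
                    .build();
        }

        @Override
        public String map(String objectKey) throws Exception {
            // Example lookup: fetch the object's contents from a (hypothetical) bucket.
            return s3.getObjectAsString("search-index", objectKey);
        }
    }

In the driver you would then do something like ParameterTool params = ParameterTool.fromArgs(args); and attach the function with .map(new S3LookupFunction(params)), so the endpoint and keys travel with the job rather than with the cluster configuration.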

As for setting AWS access key / secret keys in the configuration files, that is actually not recommended. The recommended way would be to do that via AWS IAM settings [1].

Cheers,
Gordon

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#identity-and-access-management-iam-recommended
