Posted to user@flink.apache.org by Chakravarthy varaga <ch...@gmail.com> on 2017/03/28 16:47:16 UTC

deploying flink in AWS - some teething issues

Hi Team,

   If the Flink cluster is containerized and managed by a container
orchestrator:

    1. The orchestrator allocates resources for each JM, TM, etc. Say the
container (TM) needs to run with 2G RAM: how should this allocation be
honoured by the TM when its JVM starts? I'm thinking of writing a wrapper
script that determines the resource allocation for the container and writes
the flink-conf.yaml before the TM process starts. Is this the way to go?

    2. The container orchestrator looks at the health of the containers but
is unaware of the health of the job running inside the container/cluster.
How should this be determined?

Best Regards
CVP

Re: deploying flink in AWS - some teething issues

Posted by Patrick Lucas <pa...@data-artisans.com>.

I think Log4j includes a Syslog appender—the log4j config included with
Flink just logs to the logs/ dir, but you should just be able to modify it
(log4j.properties) to suit your needs.
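
A minimal sketch of that idea, assuming the Log4j 1.x properties format that
Flink's bundled log4j.properties uses and a root logger line of the form
"log4j.rootLogger=INFO, file". The function name, the appender name "syslog",
the LOCAL0 facility and the host value are illustrative assumptions, not
anything Flink ships. A container entrypoint could run something like this
before starting the TM:

```python
# Appender definition in Log4j 1.x properties syntax. The appender name
# "syslog" and the LOCAL0 facility are arbitrary choices for this sketch.
SYSLOG_APPENDER = """\
log4j.appender.syslog=org.apache.log4j.net.SyslogAppender
log4j.appender.syslog.SyslogHost={host}
log4j.appender.syslog.Facility=LOCAL0
log4j.appender.syslog.layout=org.apache.log4j.PatternLayout
log4j.appender.syslog.layout.ConversionPattern=%d{{yyyy-MM-dd HH:mm:ss,SSS}} %-5p %-60c %x - %m%n
"""


def add_syslog_appender(properties_text, host):
    """Attach a syslog appender to the root logger of a log4j.properties text."""
    out = []
    for line in properties_text.splitlines():
        # Add "syslog" to the root logger's appender list if it is not there yet.
        if line.startswith("log4j.rootLogger=") and "syslog" not in line:
            line = line.rstrip() + ", syslog"
        out.append(line)
    # Append the appender definition after the existing configuration.
    return "\n".join(out) + "\n" + SYSLOG_APPENDER.format(host=host)
```

Calling add_syslog_appender(text, "rsyslog.example.internal") on the contents
of conf/log4j.properties and writing the result back would keep the existing
file appender and add the remote one next to it. The host name here is a
placeholder.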

--
Patrick Lucas

On Thu, Mar 30, 2017 at 2:39 PM, Chakravarthy varaga <
chakravarthyvp@gmail.com> wrote:

> Hi,
>
>     With regard to logging (both Flink and application-specific logs)
> within the container, are there best practices that you know of for getting
> the logs to a centralized location?
>     For example, the Flink TM's logs are local to the container and I don't
> wish to write to shared/mounted volumes. This means that I have to run a
> separate daemon inside the container alongside the TM to transport these
> logs to another server.
>     Also, I don't see that Flink provides support for syslog, to be able to
> connect to rsyslog etc.
>
>     Can you please advise on a way to go here?
>
> Best Regards
> CVP

Re: deploying flink in AWS - some teething issues

Posted by Chakravarthy varaga <ch...@gmail.com>.

Hi,

    With regard to logging (both Flink and application-specific logs) within
the container, are there best practices that you know of for getting the
logs to a centralized location?
    For example, the Flink TM's logs are local to the container and I don't
wish to write to shared/mounted volumes. This means that I have to run a
separate daemon inside the container alongside the TM to transport these
logs to another server.
    Also, I don't see that Flink provides support for syslog, to be able to
connect to rsyslog etc.

    Can you please advise on a way to go here?

Best Regards
CVP

Re: deploying flink in AWS - some teething issues

Posted by Patrick Lucas <pa...@data-artisans.com>.

For 1., I think the standard approach would be to specify the heap size from
outside the container. If you want an *x* MB heap, you could set your
container memory limit to 1.3 * *x* or so (to account for overhead) and set
taskmanager.heap.mb: *x* in your config.

The other way around—e.g. from inside the container determine its memory
limit and divide it by 1.3—sounds interesting though, so please share if
you have success with that.
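
A rough sketch of that inside-the-container variant, under several
assumptions: the memory limit is readable from one of the usual cgroup files
(the two paths below cover cgroup v1 and v2 but are not guaranteed on every
host), the 1.3 factor is the rule of thumb from this thread rather than a
Flink constant, and all function names are made up for illustration:

```python
import re
from pathlib import Path

# Assumed cgroup locations for the container memory limit (v1 first, then v2).
CGROUP_LIMIT_FILES = (
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
    "/sys/fs/cgroup/memory.max",
)
OVERHEAD_FACTOR = 1.3  # rule of thumb from this thread, not a Flink constant


def read_container_limit_bytes(paths=CGROUP_LIMIT_FILES):
    """Return the container memory limit in bytes, or None if none is found."""
    for path in paths:
        try:
            raw = Path(path).read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next candidate
        # cgroup v2 reports "max" when unlimited; cgroup v1 reports a very
        # large number instead, which a real script should detect and cap.
        if raw.isdigit():
            return int(raw)
    return None


def heap_mb_for_limit(limit_bytes, factor=OVERHEAD_FACTOR):
    """Divide the container limit by the overhead factor, rounding down to MB."""
    return int(limit_bytes / (1024 * 1024) / factor)


def set_heap_in_conf(conf_text, heap_mb):
    """Replace or append the taskmanager.heap.mb entry in flink-conf.yaml text."""
    line = f"taskmanager.heap.mb: {heap_mb}"
    pattern = re.compile(r"^taskmanager\.heap\.mb:.*$", re.MULTILINE)
    if pattern.search(conf_text):
        return pattern.sub(line, conf_text)
    # No existing entry: append one, adding a newline separator if needed.
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + sep + line + "\n"
```

For a 2 GiB container limit, heap_mb_for_limit(2 * 1024**3) gives 1575, so
the wrapper script would write "taskmanager.heap.mb: 1575" into
flink-conf.yaml before launching the TM.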

For 2. I don't think there's really a good way yet to monitor the health of
containerized jobs directly, so probably your best bet is to watch the
job's metrics from outside the Flink cluster.
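
One way to watch from outside might be to poll the JobManager's monitoring
REST API and alert on job status. The sketch below assumes a "/jobs" endpoint
that returns JSON of the form {"jobs": [{"id": ..., "status": ...}]}, which
matches newer Flink releases; the release current at the time of this thread
may lay out the response differently, so check the REST documentation for
your version. The function names and the set of statuses treated as
unhealthy are illustrative choices:

```python
import json
from urllib.request import urlopen

# Statuses treated as unhealthy here; which ones matter is a policy choice.
UNHEALTHY = {"FAILED", "FAILING", "RESTARTING", "SUSPENDED"}


def unhealthy_jobs(payload, unhealthy=UNHEALTHY):
    """Return the ids of jobs whose status is in the unhealthy set."""
    return [
        job["id"]
        for job in payload.get("jobs", [])
        if job.get("status") in unhealthy
    ]


def fetch_jobs(base_url, timeout=5):
    """Fetch the job list from the JobManager (assumed endpoint: /jobs)."""
    with urlopen(f"{base_url}/jobs", timeout=timeout) as resp:
        return json.load(resp)
```

An external checker could call unhealthy_jobs(fetch_jobs(url)) on a schedule,
with url pointing at the JobManager's web port, and raise an alert when the
list is non-empty. An empty job list may also deserve an alert if a job is
expected to be running at all times.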

--
Patrick Lucas

Re: deploying flink in AWS - some teething issues

Posted by Chakravarthy varaga <ch...@gmail.com>.

Hi,

    Any updates here? I'm sure many have faced similar issues, so any help
here is highly appreciated.

Best Regards
CVP
