Posted to user@flink.apache.org by Manu Zhang <ow...@gmail.com> on 2016/12/01 03:16:50 UTC

separation of JVMs for different applications

Hi all,

It seems tasks of different Flink applications can end up in the same JVM
(TaskManager) in standalone mode. Isn't this fragile, since errors in one
application could crash another? I checked FLIP-6
<https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
didn't find any mention of changing it in the future.

Any thoughts, or have I missed anything?

Thanks,
Manu Zhang

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
Created https://issues.apache.org/jira/browse/FLINK-5312.

Thanks,
Manu

On Fri, Dec 9, 2016 at 7:17 PM Till Rohrmann <tr...@apache.org> wrote:

> Hi Manu,
>
> afaik there is no JIRA for standalone v2.0 yet. So feel free to open a
> JIRA for it.
>
> Just a small correction, FLIP-6 is not almost finished yet. But we're
> working on it and are happy for every helping hand :-)
>
> Cheers,
> Till

Re: separation of JVMs for different applications

Posted by Till Rohrmann <tr...@apache.org>.
Hi Manu,

afaik there is no JIRA for standalone v2.0 yet. So feel free to open a
JIRA for it.

Just a small correction, FLIP-6 is not almost finished yet. But we're
working on it and are happy for every helping hand :-)

Cheers,
Till

On Fri, Dec 9, 2016 at 2:27 AM, Manu Zhang <ow...@gmail.com> wrote:

> If there isn't an existing JIRA for standalone v2.0, may I open a new
> one?
>
> Thanks,
> Manu

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
If there isn't an existing JIRA for standalone v2.0, may I open a new
one?

Thanks,
Manu

On Wed, Dec 7, 2016 at 12:39 PM Manu Zhang <ow...@gmail.com> wrote:

> Good to know that.
>
> Is it the "standalone setup v2.0" section? The wiki page has no
> Google-Doc-like change history.
> Are any JIRAs open for that? I'm not sure it will be noticed, given that
> FLIP-6 is almost finished.
>
> Thanks,
> Manu

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
Good to know that.

Is it the "standalone setup v2.0" section? The wiki page has no
Google-Doc-like change history.
Are any JIRAs open for that? I'm not sure it will be noticed, given that
FLIP-6 is almost finished.

Thanks,
Manu

On Tue, Dec 6, 2016 at 11:55 PM Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> We are currently changing the resource and process model quite a bit:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
> As part of that, I think it makes sense to introduce something like that.
>
> What you can do today is to set TaskManagers to use one slot only, and
> then start multiple TaskManagers per machine. That makes sure that JVMs are
> never shared across jobs.
> If you use the "start-cluster.sh" script from Flink, you can enter the
> same hostname multiple times in the workers file, and it will start
> multiple TaskManagers on a machine.
>
> Best,
> Stephan

Re: separation of JVMs for different applications

Posted by Stephan Ewen <se...@apache.org>.
Hi!

We are currently changing the resource and process model quite a bit:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
As part of that, I think it makes sense to introduce something like that.

What you can do today is to set TaskManagers to use one slot only, and then
start multiple TaskManagers per machine. That makes sure that JVMs are
never shared across jobs.
If you use the "start-cluster.sh" script from Flink, you can enter the same
hostname multiple times in the workers file, and it will start multiple
TaskManagers on a machine.
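
A minimal sketch of that setup, written as a small Python helper. The host
names and the function name are made up for illustration, and it assumes that
the workers file takes one hostname per line and that the slot count is set
with the taskmanager.numberOfTaskSlots key in flink-conf.yaml; please check
both against your Flink version.

```python
# Sketch only: generates the text of a workers file in which each hostname is
# repeated once per TaskManager that should run on that machine.
# Assumption: the workers file lists one hostname per line.

# Assumed flink-conf.yaml entry: one slot per TaskManager, so each
# TaskManager JVM only ever runs tasks of a single job at a time.
SLOTS_CONFIG_LINE = "taskmanager.numberOfTaskSlots: 1"


def render_workers_file(taskmanagers_per_host):
    """Return workers-file text for a mapping of hostname -> TaskManager count."""
    lines = []
    for host, count in taskmanagers_per_host.items():
        if count < 1:
            raise ValueError(f"{host}: need at least one TaskManager, got {count}")
        # Repeating the hostname asks the start script for one more
        # TaskManager process on the same machine.
        lines.extend([host] * count)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hosts: three TaskManagers on host-a, two on host-b.
    print(SLOTS_CONFIG_LINE)
    print(render_workers_file({"host-a": 3, "host-b": 2}), end="")
```

Under those assumptions, writing the generated text to the workers file and
running start-cluster.sh should bring up three single-slot TaskManagers on
host-a and two on host-b.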

Best,
Stephan



On Tue, Dec 6, 2016 at 3:51 AM, Manu Zhang <ow...@gmail.com> wrote:

> Thanks Stephan,
>
> They don't use YARN now but I think they will consider it. Do you think
> it would be beneficial to provide an option such as "separate-jvm" in
> stand-alone mode for streaming processors and long-running services? Or do
> you think it would introduce too much complexity?
>
> Manu

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
Thanks Stephan,

They don't use YARN now but I think they will consider it. Do you think it
would be beneficial to provide an option such as "separate-jvm" in
stand-alone mode for streaming processors and long-running services? Or do
you think it would introduce too much complexity?

Manu

On Tue, Dec 6, 2016 at 1:04 AM Stephan Ewen <se...@apache.org> wrote:

> Hi!
>
> Are your customers using YARN? In that case, the default configuration
> will start a new YARN application per Flink job, so no JVMs are shared
> between jobs. By default, even each slot has its own JVM.
>
> Greetings,
> Stephan
>
> PS: I think the "spawning new JVMs" is what Till referred to when saying
> "spinning up a new cluster". Keep in mind that Flink is also a batch
> processor, and it handles sequences of short batch jobs (as issued for
> example by interactive shells) and it pre-allocates and manages a lot of
> memory for batch jobs.
>
>
>
> On Mon, Dec 5, 2016 at 3:48 PM, Manu Zhang <ow...@gmail.com>
> wrote:
>
> The pro for the multi-tenant cluster mode is that you can share data
> between jobs and you don't have to spin up a new cluster for each job.
>
>
> I don't think we have to spin up a new cluster for each job if every job
> gets its own JVMs. For example, Storm will launch a new worker (JVM) for a
> new job when free slots are available. How can we share data between jobs,
> and why?
>
>
>
> On Mon, Dec 5, 2016 at 6:27 PM, Till Rohrmann <tr...@apache.org>
> wrote:
>
> The pro for the multi-tenant cluster mode is that you can share data
> between jobs and you don't have to spin up a new cluster for each job. This
> might be helpful for scenarios where you want to run many short-lived and
> light-weight jobs.
>
> But the important part is that you don't have to use this method. You can
> also start a new Flink cluster per job which will then execute the job
> isolated from any other jobs (given that you don't submit other jobs to
> this cluster).
>
> Cheers,
> Till
>
> On Sat, Dec 3, 2016 at 2:50 PM, Manu Zhang <ow...@gmail.com>
> wrote:
>
> Thanks Fabian and Till.
>
> We have customers who are interested in using Flink but very concerned
> that "multiple jobs share the same set of TMs". I've just joined the
> community recently so I'm not sure whether there has been a discussion over
> the "multi-tenant cluster mode" before.
>
> The con is that one job's or user's failure may crash another, which is
> unacceptable in a multi-tenant scenario.
> What are the pros? Do the pros outweigh the cons?
>
> Manu
>
> On Fri, Dec 2, 2016 at 7:06 PM Till Rohrmann <tr...@apache.org> wrote:
>
> Hi Manu,
>
> with Flip-6 we will be able to support stricter application isolation by
> starting for each job a dedicated JobManager which will execute its tasks
> on TMs reserved solely for this job. But at the same time we will continue
> supporting the multi-tenant cluster mode where tasks belonging to multiple
> jobs share the same set of TMs and, thus, might share information between
> them.
>
> Cheers,
> Till
>
> On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com> wrote:
>
> Hi Manu,
>
> As far as I know, there are no plans to change the stand-alone deployment.
> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
> etc.) which allow starting Flink processes per job.
>
> Till (in CC) is more familiar with the FLIP-6 effort and might be able to
> add more detail.
>
> Best,
> Fabian
>
> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>
> Hi all,
>
> It seems tasks of different Flink applications can end up in the same JVM
> (TaskManager) in standalone mode. Isn't this fragile, since errors in one
> application could crash another? I checked FLIP-6
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
> didn't find any mention of changing it in the future.
>
> Any thoughts, or have I missed anything?
>
> Thanks,
> Manu Zhang

Re: separation of JVMs for different applications

Posted by Stephan Ewen <se...@apache.org>.
Hi!

Are your customers using YARN? In that case, the default configuration will
start a new YARN application per Flink job, so no JVMs are shared between
jobs. By default, even each slot has its own JVM.

Greetings,
Stephan

PS: I think the "spawning new JVMs" is what Till referred to when saying
"spinning up a new cluster". Keep in mind that Flink is also a batch
processor, and it handles sequences of short batch jobs (as issued for
example by interactive shells) and it pre-allocates and manages a lot of
memory for batch jobs.



On Mon, Dec 5, 2016 at 3:48 PM, Manu Zhang <ow...@gmail.com> wrote:

>> The pro for the multi-tenant cluster mode is that you can share data
>> between jobs and you don't have to spin up a new cluster for each job.
>
>
> I don't think we have to spin up a new cluster for each job if every job
> gets its own JVMs. For example, Storm will launch a new worker (JVM) for a
> new job when free slots are available. How can we share data between jobs,
> and why?
>
>
>
> On Mon, Dec 5, 2016 at 6:27 PM, Till Rohrmann <tr...@apache.org>
> wrote:
>
>> The pro for the multi-tenant cluster mode is that you can share data
>> between jobs and you don't have to spin up a new cluster for each job. This
>> might be helpful for scenarios where you want to run many short-lived and
>> light-weight jobs.
>>
>> But the important part is that you don't have to use this method. You can
>> also start a new Flink cluster per job which will then execute the job
>> isolated from any other jobs (given that you don't submit other jobs to
>> this cluster).
>>
>> Cheers,
>> Till
>>
>> On Sat, Dec 3, 2016 at 2:50 PM, Manu Zhang <ow...@gmail.com>
>> wrote:
>>
>>> Thanks Fabian and Till.
>>>
>>> We have customers who are interested in using Flink but are very concerned
>>> that "multiple jobs share the same set of TMs". I've just joined the
>>> community recently so I'm not sure whether there has been a discussion over
>>> the "multi-tenant cluster mode" before.
>>>
>>> The con is that one job's or user's failure may crash another, which is
>>> unacceptable in a multi-tenant scenario.
>>> What are the pros? Do the pros outweigh the cons?
>>>
>>> Manu
>>>
>>> On Fri, Dec 2, 2016 at 7:06 PM Till Rohrmann <tr...@apache.org>
>>> wrote:
>>>
>>>> Hi Manu,
>>>>
>>>> with FLIP-6 we will be able to support stricter application isolation
>>>> by starting for each job a dedicated JobManager which will execute its
>>>> tasks on TMs reserved solely for this job. But at the same time we will
>>>> continue supporting the multi-tenant cluster mode where tasks belonging to
>>>> multiple jobs share the same set of TMs and, thus, might share information
>>>> between them.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Manu,
>>>>
>>>> As far as I know, there are no plans to change the stand-alone
>>>> deployment.
>>>> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
>>>> etc.) which allow starting Flink processes per job.
>>>>
>>>> Till (in CC) is more familiar with the FLIP-6 effort and might be able
>>>> to add more detail.
>>>>
>>>> Best,
>>>> Fabian
>>>>
>>>> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>>>>
>>>> Hi all,
>>>>
>>>> It seems tasks of different Flink applications can end up in the same
>>>> JVM (TaskManager) in standalone mode. Isn't this fragile since errors in
>>>> one application could crash another? I checked FLIP-6
>>>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
>>>> didn't find any mention of changing it in the future.
>>>>
>>>> Any thoughts or have I missed anything?
>>>>
>>>> Thanks,
>>>> Manu Zhang
>>>>

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
>
> The pro for the multi-tenant cluster mode is that you can share data
> between jobs and you don't have to spin up a new cluster for each job.


I don't think we have to spin up a new cluster for each job if every job
gets its own JVMs. For example, Storm will launch a new worker (JVM) for a
new job when free slots are available. How can we share data between jobs,
and why?



On Mon, Dec 5, 2016 at 6:27 PM, Till Rohrmann <tr...@apache.org> wrote:

> The pro for the multi-tenant cluster mode is that you can share data
> between jobs and you don't have to spin up a new cluster for each job. This
> might be helpful for scenarios where you want to run many short-lived and
> light-weight jobs.
>
> But the important part is that you don't have to use this method. You can
> also start a new Flink cluster per job which will then execute the job
> isolated from any other jobs (given that you don't submit other jobs to
> this cluster).
>
> Cheers,
> Till
>
> On Sat, Dec 3, 2016 at 2:50 PM, Manu Zhang <ow...@gmail.com>
> wrote:
>
>> Thanks Fabian and Till.
>>
>> We have customers who are interested in using Flink but are very concerned
>> that "multiple jobs share the same set of TMs". I've just joined the
>> community recently so I'm not sure whether there has been a discussion over
>> the "multi-tenant cluster mode" before.
>>
>> The con is that one job's or user's failure may crash another, which is
>> unacceptable in a multi-tenant scenario.
>> What are the pros? Do the pros outweigh the cons?
>>
>> Manu
>>
>> On Fri, Dec 2, 2016 at 7:06 PM Till Rohrmann <tr...@apache.org>
>> wrote:
>>
>>> Hi Manu,
>>>
>>> with FLIP-6 we will be able to support stricter application isolation by
>>> starting for each job a dedicated JobManager which will execute its tasks
>>> on TMs reserved solely for this job. But at the same time we will continue
>>> supporting the multi-tenant cluster mode where tasks belonging to multiple
>>> jobs share the same set of TMs and, thus, might share information between
>>> them.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com>
>>> wrote:
>>>
>>> Hi Manu,
>>>
>>> As far as I know, there are no plans to change the stand-alone
>>> deployment.
>>> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
>>> etc.) which allow starting Flink processes per job.
>>>
>>> Till (in CC) is more familiar with the FLIP-6 effort and might be able
>>> to add more detail.
>>>
>>> Best,
>>> Fabian
>>>
>>> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>>>
>>> Hi all,
>>>
>>> It seems tasks of different Flink applications can end up in the same
>>> JVM (TaskManager) in standalone mode. Isn't this fragile since errors in
>>> one application could crash another? I checked FLIP-6
>>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
>>> didn't find any mention of changing it in the future.
>>>
>>> Any thoughts or have I missed anything?
>>>
>>> Thanks,
>>> Manu Zhang
>>>

Re: separation of JVMs for different applications

Posted by Till Rohrmann <tr...@apache.org>.
The pro for the multi-tenant cluster mode is that you can share data
between jobs and you don't have to spin up a new cluster for each job. This
might be helpful for scenarios where you want to run many short-lived and
light-weight jobs.

But the important part is that you don't have to use this method. You can
also start a new Flink cluster per job which will then execute the job
isolated from any other jobs (given that you don't submit other jobs to
this cluster).
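
The difference between the two modes can be sketched with a toy model. This
is not Flink code: the round-robin placement and the names place_shared,
place_per_job and jobs_hit_by_crash are assumptions made only for
illustration, and Flink's real scheduler works differently. It only shows
which jobs lose tasks when a single TM (JVM) goes down.

```python
from collections import defaultdict


def place_shared(jobs, num_tms):
    """Multi-tenant mode: tasks of all jobs go round-robin onto one shared TM pool."""
    placement = defaultdict(set)  # TM id -> names of jobs with a task on that TM
    i = 0
    for job, num_tasks in jobs.items():
        for _ in range(num_tasks):
            placement[f"tm-{i % num_tms}"].add(job)
            i += 1
    return dict(placement)


def place_per_job(jobs, tms_per_job):
    """Cluster-per-job mode: every job gets its own TMs, never shared with others."""
    placement = defaultdict(set)
    for job, num_tasks in jobs.items():
        for t in range(num_tasks):
            placement[f"{job}-tm-{t % tms_per_job}"].add(job)
    return dict(placement)


def jobs_hit_by_crash(placement, crashed_tm):
    """Return the jobs that had at least one task on the crashed TM."""
    return placement.get(crashed_tm, set())


jobs = {"job-a": 2, "job-b": 2}  # job name -> number of parallel tasks
shared = place_shared(jobs, num_tms=2)
isolated = place_per_job(jobs, tms_per_job=2)

print(sorted(jobs_hit_by_crash(shared, "tm-0")))          # ['job-a', 'job-b']
print(sorted(jobs_hit_by_crash(isolated, "job-a-tm-0")))  # ['job-a']
```

In the shared pool a single TM failure touches both jobs, while in the
per-job layout it touches only the job that owns that TM.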

Cheers,
Till

On Sat, Dec 3, 2016 at 2:50 PM, Manu Zhang <ow...@gmail.com> wrote:

> Thanks Fabian and Till.
>
> We have customers who are interested in using Flink but are very concerned
> that "multiple jobs share the same set of TMs". I've just joined the
> community recently so I'm not sure whether there has been a discussion over
> the "multi-tenant cluster mode" before.
>
> The con is that one job's or user's failure may crash another, which is
> unacceptable in a multi-tenant scenario.
> What are the pros? Do the pros outweigh the cons?
>
> Manu
>
> On Fri, Dec 2, 2016 at 7:06 PM Till Rohrmann <tr...@apache.org> wrote:
>
>> Hi Manu,
>>
>> with FLIP-6 we will be able to support stricter application isolation by
>> starting for each job a dedicated JobManager which will execute its tasks
>> on TMs reserved solely for this job. But at the same time we will continue
>> supporting the multi-tenant cluster mode where tasks belonging to multiple
>> jobs share the same set of TMs and, thus, might share information between
>> them.
>>
>> Cheers,
>> Till
>>
>> On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com> wrote:
>>
>> Hi Manu,
>>
>> As far as I know, there are no plans to change the stand-alone
>> deployment.
>> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
>> etc.) which allow starting Flink processes per job.
>>
>> Till (in CC) is more familiar with the FLIP-6 effort and might be able to
>> add more detail.
>>
>> Best,
>> Fabian
>>
>> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>>
>> Hi all,
>>
>> It seems tasks of different Flink applications can end up in the same JVM
>> (TaskManager) in standalone mode. Isn't this fragile since errors in one
>> application could crash another? I checked FLIP-6
>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
>> didn't find any mention of changing it in the future.
>>
>> Any thoughts or have I missed anything?
>>
>> Thanks,
>> Manu Zhang
>>

Re: separation of JVMs for different applications

Posted by Manu Zhang <ow...@gmail.com>.
Thanks Fabian and Till.

We have customers who are interested in using Flink but are very concerned
that "multiple jobs share the same set of TMs". I've just joined the
community recently so I'm not sure whether there has been a discussion over
the "multi-tenant cluster mode" before.

The con is that one job's or user's failure may crash another, which is
unacceptable in a multi-tenant scenario.
What are the pros? Do the pros outweigh the cons?

Manu

On Fri, Dec 2, 2016 at 7:06 PM Till Rohrmann <tr...@apache.org> wrote:

> Hi Manu,
>
> with FLIP-6 we will be able to support stricter application isolation by
> starting for each job a dedicated JobManager which will execute its tasks
> on TMs reserved solely for this job. But at the same time we will continue
> supporting the multi-tenant cluster mode where tasks belonging to multiple
> jobs share the same set of TMs and, thus, might share information between
> them.
>
> Cheers,
> Till
>
> On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com> wrote:
>
> Hi Manu,
>
> As far as I know, there are no plans to change the stand-alone deployment.
> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
> etc.) which allow starting Flink processes per job.
>
> Till (in CC) is more familiar with the FLIP-6 effort and might be able to
> add more detail.
>
> Best,
> Fabian
>
> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>
> Hi all,
>
> It seems tasks of different Flink applications can end up in the same JVM
> (TaskManager) in standalone mode. Isn't this fragile since errors in one
> application could crash another? I checked FLIP-6
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
> didn't find any mention of changing it in the future.
>
> Any thoughts or have I missed anything?
>
> Thanks,
> Manu Zhang
>

Re: separation of JVMs for different applications

Posted by Till Rohrmann <tr...@apache.org>.
Hi Manu,

with FLIP-6 we will be able to support stricter application isolation by
starting for each job a dedicated JobManager which will execute its tasks
on TMs reserved solely for this job. But at the same time we will continue
supporting the multi-tenant cluster mode where tasks belonging to multiple
jobs share the same set of TMs and, thus, might share information between
them.

Cheers,
Till

On Fri, Dec 2, 2016 at 11:19 AM, Fabian Hueske <fh...@gmail.com> wrote:

> Hi Manu,
>
> As far as I know, there are no plans to change the stand-alone deployment.
> FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
> etc.) which allow starting Flink processes per job.
>
> Till (in CC) is more familiar with the FLIP-6 effort and might be able to
> add more detail.
>
> Best,
> Fabian
>
> 2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:
>
>> Hi all,
>>
>> It seems tasks of different Flink applications can end up in the same JVM
>> (TaskManager) in standalone mode. Isn't this fragile since errors in one
>> application could crash another? I checked FLIP-6
>> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
>> didn't find any mention of changing it in the future.
>>
>> Any thoughts or have I missed anything?
>>
>> Thanks,
>> Manu Zhang
>>

Re: separation of JVMs for different applications

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Manu,

As far as I know, there are no plans to change the stand-alone deployment.
FLIP-6 is focusing on deployments via resource providers (YARN, Mesos,
etc.) which allow starting Flink processes per job.

Till (in CC) is more familiar with the FLIP-6 effort and might be able to
add more detail.

Best,
Fabian

2016-12-01 4:16 GMT+01:00 Manu Zhang <ow...@gmail.com>:

> Hi all,
>
> It seems tasks of different Flink applications can end up in the same JVM
> (TaskManager) in standalone mode. Isn't this fragile since errors in one
> application could crash another? I checked FLIP-6
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077> but
> didn't find any mention of changing it in the future.
>
> Any thoughts or have I missed anything?
>
> Thanks,
> Manu Zhang
>
