Posted to dev@metron.apache.org by Ali Nazemian <al...@gmail.com> on 2017/11/22 13:12:40 UTC

Using Storm Resource Aware Scheduler

Hi all,


One of the issues we are dealing with is that not all Metron feeds have
the same resource requirements. For example, we have some feeds for which
even a single Storm slot is far more than they need. We thought we could
improve overall utilisation by at least limiting the heap space available
to each feed's parser topology worker. However, since the default Storm
scheduler allocates by available slots, it is nearly impossible to utilise
the cluster well when lots of topologies with different requirements are
running at the same time. As a result, on a daily basis we see, for
example, one of the Storm hosts at 120% utilisation and another at 20%!
I was wondering whether we can address this situation by using the
Storm Resource Aware Scheduler.

P.S.: it would be very nice to have the ability to tune Storm
topology-related parameters per feed in the GUI (for example in the
Management UI).


Regards,
Ali
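To make the scheduling difference concrete: the default Storm scheduler counts slots, so a tiny feed occupies as much of a node as a heavy one, while a resource-aware strategy packs topologies by their requested memory. A toy sketch of that contrast (feed names and all numbers are hypothetical, and this greedy packing is only an illustration, not Storm's actual algorithm):

```python
NODE_MEM_MB = 4096  # hypothetical memory capacity of one supervisor node


def slots_needed(feeds):
    """Slot scheduling: every feed occupies one whole slot regardless of need."""
    return len(feeds)


def nodes_needed_resource_aware(feeds, node_mem=NODE_MEM_MB):
    """First-fit-decreasing packing by each feed's requested heap (MB)."""
    nodes = []  # free memory remaining on each node in use
    for mem in sorted(feeds.values(), reverse=True):
        for i, free in enumerate(nodes):
            if free >= mem:
                nodes[i] -= mem
                break
        else:
            nodes.append(node_mem - mem)  # open a new node for this feed
    return len(nodes)


# Hypothetical per-feed heap requests in MB:
feeds = {"bro": 768, "snort": 256, "yaf": 256, "squid": 128, "asa": 128}
print(slots_needed(feeds))                 # 5 whole slots consumed
print(nodes_needed_resource_aware(feeds))  # all five feeds fit in one 4 GB node
```

With slot scheduling the five feeds pin five slots, even though their combined heap request (1.5 GB) fits comfortably on a single node, which is exactly the imbalance described above.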

Re: Using Storm Resource Aware Scheduler

Posted by Otto Fowler <ot...@gmail.com>.
Hi Ali,

This is a holiday in the US (Thanksgiving) and many people have a 4-day
weekend.  It is also common to travel for this holiday, so some of the
community members who know a bit more about Storm may not be online during
this time.

I do not have experience with the RAS, but from some research I can see
the following:

1. Our parser topologies are built in code, so allowing the CPU and memory
component properties to be set for parser topology components would require
a code change.  It is also not clear to me how we would set those
properties for Ambari-started topologies.
2. Our enrichment and indexing topologies are built with Flux, so I *think*
those configurations could be edited in the field to set the CPU and memory
settings (as well as other RAS configs).  But I have not seen any Flux
examples of how to do so.
3. I am not certain how much of the node-specific configuration in the
YAML files can be done from Ambari, but it may be possible.
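On point 2, a rough sketch of what topology-level RAS settings might look like in a Flux file. The config keys are real Storm 1.x settings, but whether Metron's Flux files accept them this way is untested, and all values here are hypothetical:

```yaml
# Sketch only: topology-level RAS settings via the Flux "config" map.
name: "enrichment"
config:
  topology.worker.max.heap.size.mb: 512               # cap per-worker heap
  topology.component.resources.onheap.memory.mb: 128  # default per-component memory request
  topology.component.cpu.pcore.percent: 10.0          # default per-component CPU request (100 ~= one core)
  topology.priority: 20                               # under RAS, lower number = higher priority
```

Note these topology-level keys only set defaults; per-component values would still need the code-level declarer API mentioned in point 1.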

TL;DR:
We (I) would need to do more research on how we could support this.

Hopefully someone with more Storm know-how will hop on this soon.

I would recommend that you open a Jira on Metron Storm Topologies
Supporting Resource Aware Scheduling
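For reference, enabling the RAS on the cluster side appears to be a storm.yaml change (the scheduler class name is Storm's documented one; the capacity values are hypothetical and set per supervisor):

```yaml
# Sketch: cluster-side storm.yaml settings to enable the Resource Aware Scheduler.
storm.scheduler: "org.apache.storm.scheduler.resource.ResourceAwareScheduler"
supervisor.memory.capacity.mb: 16384.0   # memory this supervisor offers to RAS
supervisor.cpu.capacity: 400.0           # 100 ~= one core, so ~4 cores here
```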



Re: Using Storm Resource Aware Scheduler

Posted by Ali Nazemian <al...@gmail.com>.
Sounds great, Simon. We will work on refactoring our design to align with
the metadata feature. As long as we can use the same parser, there is no
technical reason we cannot use the same feed to handle it. However, I need
to look into the details to understand how complex it would be to merge
different tenants at this point. Hopefully, it shouldn't be too complex.

BTW, I don't have permission to close this ticket, so I have just created
a duplicate link to the main ticket as you mentioned.

Cheers,
Ali


-- 
A.Nazemian

Re: Using Storm Resource Aware Scheduler

Posted by Simon Elliston Ball <si...@simonellistonball.com>.
The multi-tenancy through metadata method mentioned is designed to solve exactly that problem and has been in the project for some time now. The goal would be to have one topology per data schema and use the key to communicate tenant metadata. See https://archive.apache.org/dist/metron/0.4.1/site-book/metron-platform/metron-parsers/index.html#Metadata for details.
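A rough sketch of the key-based pattern Simon describes: the Kafka message key carries a JSON blob of tenant metadata, which gets merged into the parsed message under a prefix. The `metron.metadata.` prefix mirrors the default described in the linked docs, but the tenant field name and this standalone function are purely illustrative; Metron's parsers do this internally when metadata handling is enabled.

```python
import json


def merge_metadata(message: dict, kafka_key: bytes,
                   prefix: str = "metron.metadata.") -> dict:
    """Merge metadata carried on the Kafka key into a parsed message.

    Illustrative only: shows the shape of the key-based multi-tenancy
    pattern, not Metron's actual implementation.
    """
    metadata = json.loads(kafka_key.decode("utf-8"))
    merged = dict(message)
    for field, value in metadata.items():
        merged[prefix + field] = value  # e.g. "tenant" -> "metron.metadata.tenant"
    return merged


msg = merge_metadata({"ip_src_addr": "10.0.0.1"}, b'{"tenant": "acme"}')
print(msg["metron.metadata.tenant"])  # acme
```

With this shape, one topology per data schema can serve many tenants, and downstream enrichment or indexing can route on the `metron.metadata.tenant` field instead of needing a topology per tenant.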

The Storm issue you mention is something for the Storm project to look at, so we can't really comment on their behalf here, but yes, it will be nice to have Storm do some of the tuning for us at some point.

Note that the UI already has the tuning parameters you're talking about in the latest version, so there is no need for the new JIRA (https://issues.apache.org/jira/browse/METRON-1330). It should be closed as a duplicate of https://issues.apache.org/jira/browse/METRON-1161.

Simon



Re: Using Storm Resource Aware Scheduler

Posted by Ali Nazemian <al...@gmail.com>.
Oops, I didn't know that. Happy Thanksgiving.

Thanks, Otto and Simon.

As you are aware from our use cases, with the current limitations of
multi-tenancy support, we are creating a feed per tenant per device.
Sometimes the amount of traffic we receive per tenant and per device is far
less than what justifies dedicating one Storm slot to it. Therefore, I was
hoping to make it at least theoretically possible to tune resources more
wisely, but it is not going to be easy at all. This is probably a use case
where a Storm auto-scaling mechanism would be very nice to have.

https://issues.apache.org/jira/browse/STORM-594

On another note, I recall there was a PR to address multi-tenancy by
adding metadata to the Kafka topic. However, I lost track of that feature,
so maybe this situation can be tackled at another level by merging
different parsers.

I will create a Jira ticket to add the ability to tune Metron parser feeds
at the Storm level from the UI. Right now it is a little hard to maintain
tuning configurations per parser, and as soon as somebody restarts them
from the Management UI or Ambari, they are overwritten.


Cheers,
Ali



-- 
A.Nazemian

Re: Using Storm Resource Aware Scheduler

Posted by Simon Elliston Ball <si...@simonellistonball.com>.
Implementing the Resource Aware Scheduler would be decidedly non-trivial. Every topology would need additional configuration to tune things like memory sizes, which is not going to buy you much. So, at the micro-tuning level of a parser this doesn't make a lot of sense.

However, it may be worth considering separate tuning for parsers in general vs the core enrichment and indexing topologies (potentially also for separate indexing topologies when those come in), and the resource scheduler could provide a theoretical benefit there.

Specifying resource requirements per parser topology might sound like a good idea, but if your parsers are working the way they should, they should use a small amount of memory as their default size, achieve additional resource use by multiplying workers and executors (to get higher usage per slot), and balance the load that way. To be honest, the only difference you're going to get from the RAS is a bunch of tuning parameters which allow slightly different granularity of units for things like memory.

The other RAS feature which might be a good addition is prioritisation of different parser topologies, but again, this is probably not something you want to push hard on unless you are severely limited in resources (in which case, why not just add another node? It will be cheaper than spending all that time micro-tuning the resource requirements for each data feed).

Right now we do allow a lot of micro-tuning of parallelism around things like the count of executor threads, which achieves roughly the equivalent of the CPU-based limits in the RAS.
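For context, later Metron releases expose this kind of parallelism tuning directly in the sensor parser config. The field names below follow those releases, and all values (and the `-Xmx` override) are hypothetical; treat this as a sketch of the knobs rather than a recommended setting:

```json
{
  "parserClassName": "org.apache.metron.parsers.GrokParser",
  "sensorTopic": "squid",
  "numWorkers": 1,
  "numAckers": 1,
  "spoutParallelism": 1,
  "parserParallelism": 2,
  "stormConfig": {
    "topology.worker.childopts": "-Xmx256m"
  }
}
```

Multiplying `parserParallelism` (executors) while capping worker heap is the slot-level equivalent of what the RAS would express as per-component CPU and memory requests.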

TL;DR: 

If you’re not using resource pools for different users, nor the idea that prioritisation can lead to arbitrary kills, all you’re getting is a slightly different way of tuning knobs that already exist, with slightly different granularity. Also, we would have to rewrite all the topology code to add the config endpoints for CPU and memory estimates.

Simon



Re: Using Storm Resource Aware Scheduler

Posted by Ali Nazemian <al...@gmail.com>.
Any help regarding this question would be appreciated.




-- 
A.Nazemian

Re: Using Storm Resource Aware Scheduler

Posted by Ali Nazemian <al...@gmail.com>.
A 30-minute average of CPU load, checked via Ambari.


Re: Using Storm Resource Aware Scheduler

Posted by Otto Fowler <ot...@gmail.com>.
How are you measuring the utilization?


On November 22, 2017 at 08:12:51, Ali Nazemian (alinazemian@gmail.com)
wrote:

Hi all,


One of the issues that we are dealing with is the fact that not all of
the Metron feeds have the same type of resource requirements. For example,
we have some feeds that even a single Strom slot is way more than what it
needs. We thought we could make it more utilised in total by limiting at
least the amount of available heap space per feed to the parser topology
worker. However, since Storm scheduler relies on available slots, it is
very hard and almost impossible to utilise the cluster in the scenario that
there will be lots of different topologies with different requirements
running at the same time. Therefore, on a daily basis, we can see that for
example one of the Storm hosts is 120% utilised and another is 20%
utilised! I was wondering whether we can address this situation by using
Storm Resource Aware scheduler or not.

P.S: it would be very nice to have a functionality to tune Storm
topology-related parameters per feed in the GUI (for example in Management
UI).


Regards,
Ali