Posted to dev@heron.apache.org by Josh Fischer <jo...@joshfischer.io> on 2018/01/21 14:21:29 UTC
Proposing Changes To ECO
All,
While working with Karthik, we discovered that the way ECO handles
topology configuration does not work for all configuration types.
Specifically, setting the RAM of an individual component does not work. We
will also have to keep in mind the size of the container that holds the
components. My proposal is this:
Create a standardized way to configure component RAM size in the "config"
section of the ECO YAML file. This would be a list of key-value pairs that
map the "id" of a component to an allocated RAM size in MB. An example is
below:
config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id": 1234
    - "other-id": 6789
However, the form above leaves it unclear which unit of measurement is
implied or expected. Alternatively, we could do something like the
following:
config:
  topology.workers: 2
  topology.component.rammap:
    - spec:
        id: "component-1"
        size: 1234
        type: MB  # Megabytes
    - spec:
        id: "component-2"
        size: 6789
        type: GB  # Gigabytes
    - spec:
        id: "component-3"
        size: 123456789
        type: B   # Bytes
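A minimal sketch of the conversion step for the size/type form above. The names RamSpec and Unit are hypothetical; in ECO itself this would presumably go through Heron's ByteAmount, which is not used here so the example stays self-contained. The 1024-based multiples are an assumption.

```java
import java.util.Locale;

// Hypothetical stand-in for the ByteAmount-based conversion discussed above.
final class RamSpec {
  // Assumed units; multiples of 1024 are an assumption, not taken from the thread.
  enum Unit {
    B(1L), MB(1024L * 1024L), GB(1024L * 1024L * 1024L);

    final long bytes;

    Unit(long bytes) {
      this.bytes = bytes;
    }
  }

  // Converts a "size" + "type" pair from the YAML into a byte count.
  static long toBytes(long size, String type) {
    if (size < 0) {
      throw new IllegalArgumentException("size must be non-negative: " + size);
    }
    // Unit.valueOf throws IllegalArgumentException for an unknown unit name.
    Unit unit = Unit.valueOf(type.trim().toUpperCase(Locale.ROOT));
    // multiplyExact fails loudly on overflow instead of wrapping silently.
    return Math.multiplyExact(size, unit.bytes);
  }
}
```

Rejecting unknown units up front means a typo such as "MG" fails when the YAML is loaded rather than producing a wrong allocation later.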
If a mapping is not specified for a component, we can just assume Heron's
defaults. For ease of use, we could then dynamically calculate the
container size from the number of components and their allocated
resources, while still allowing users to specify a custom set of resources
for a container, as below:
topology.container.disk: 1234
topology.container.ram: 3456
topology.container.cpu: 2
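A rough sketch of the "dynamically calculate the container size" idea for RAM only. ContainerRamEstimator and the defaultRamBytes parameter are hypothetical; the real Heron default and any per-container overhead or parallelism are deliberately left out.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper: sums per-component RAM, falling back to a default.
final class ContainerRamEstimator {
  static long estimateBytes(List<String> componentIds,
                            Map<String, Long> ramBytesById,
                            long defaultRamBytes) {
    long total = 0L;
    for (String id : componentIds) {
      // Components without an explicit mapping get the assumed default.
      long ram = ramBytesById.getOrDefault(id, defaultRamBytes);
      total = Math.addExact(total, ram);
    }
    return total;
  }
}
```

An explicit topology.container.ram value, when present, would simply take precedence over this estimate.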
It may be best if I reuse the ByteAmount object to calculate resource
sizes, to remain consistent with the other Heron APIs. Are there any
concerns or improvements to this approach that I am missing?
Please Advise,
Josh
Re: Proposing Changes To ECO
Posted by Josh Fischer <jo...@joshfischer.io>.
It seems my proposal wasn't fully thought through. I think your suggestion
is the better way to go.
+1
-Josh
On Mon, Jan 22, 2018 at 11:33 AM, Karthik Ramasamy <ka...@streaml.io>
wrote:
> Josh -
>
> One more feedback - since the resources assigned can be CPU, RAM, DISK -
> instead of calling it
>
> topology.component.rammap
>
> can we call it
>
> topology.component.resourcemap
>
> and allow for CPU and DISK. Furthermore, we append the size type into the
> metric as follows
>
> config:
> topology.workers: 2
> topology.component.resourcemap:
>
> - id: "component-1"
> ram: 1234MB
> cpu: 0.5
> disk: 123MB
>
> - id: "component-2"
> ram: 2345MB
> cpu: 0.75
> disk: 4GB
>
> This will make it easier to read and also flexible, thoughts?
>
> cheers
> /karthik
Re: Proposing Changes To ECO
Posted by Ning Wang <wa...@gmail.com>.
Got it. Thanks! It makes more sense now. :)
On Mon, Jan 22, 2018 at 1:51 PM, Josh Fischer <jo...@joshfischer.io> wrote:
> Ning,
>
> In my email I was thinking specifically of setting the componentRam. This
> is case the value is a comma delimited string value which would be easy to
> incorrectly format the list of values to be appended. An image to
> reference is below. So by passing in a list of values, I could then
> correctly format the value String as we would expect.
>
> public static void setComponentRam(Map<String, Object> conf,
> String component, ByteAmount
> ramInBytes) {
> if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
> String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
> String newEntry = String.format("%s,%s:%d", oldEntry, component,
> ramInBytes.asBytes());
> conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
> } else {
> String newEntry = String.format("%s:%d", component,
> ramInBytes.asBytes());
> conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
> }
> }
>
> I'm glad you sent this email as it got me thinking about the above spec
> that Karthik mentioned. I've copied his spec below
>
>
> config:
> topology.workers: 2
> topology.component.resourcemap:
>
> - id: "component-1"
> ram: 1234MB
> cpu: 0.5
> disk: 123MB
>
> - id: "component-2"
> ram: 2345MB
> cpu: 0.75
> disk: 4GB
>
> I think disk and cpu resources are allocated at a topology level and would
> not be applicable here. Unless there is a way that you specify this
> through the Heron Config class?.. After looking at the docs here
> https://twitter.github.io/heron/docs/developers/tuning/ and looking at the
> Heron Config class, I don't see way to specify these at a component level.
> I do see there is a way to pass any configuration up to Heron, can I set
> this values via a `prepare()` or `open()` call?
>
> One last note while thinking about this. `setComponentJvmOptions()` has a
> similar behavior. I would have this do the same for this field too I
> believe
>
>
> public static void setComponentJvmOptions(
> Map<String, Object> conf,
> String component,
> String jvmOptions) {
> String optsBase64;
> String componentBase64;
>
> optsBase64 = DatatypeConverter.printBase64Binary(
> jvmOptions.getBytes(StandardCharsets.UTF_8));
> componentBase64 = DatatypeConverter.printBase64Binary(
> component.getBytes(StandardCharsets.UTF_8));
>
> String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
> String newEntry;
> if (oldEntry == null) {
> newEntry = String.format("{\"%s\":\"%s\"}", componentBase64,
> optsBase64);
> } else {
> // To remove the '{' at the start and '}' at the end
> oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
> newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
> componentBase64, optsBase64);
> }
> // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
> // {
> // "componentNameAInBase64": "jvmOptionsInBase64",
> // "componentNameBInBase64": "jvmOptionsInBase64"
> // }
> conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);
>
> }
>
>
>
> If I've missed something please let me know.
>
> -Josh
Re: Proposing Changes To ECO
Posted by Josh Fischer <jo...@joshfischer.io>.
Ning,
In my email I was thinking specifically of setting the component RAM. In
this case the value is a comma-delimited string, and it would be easy to
format the list of appended values incorrectly. The method is included
below for reference. By passing in a list of values, I could then format
the value String correctly, as we would expect.
public static void setComponentRam(Map<String, Object> conf,
                                   String component, ByteAmount ramInBytes) {
  if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
    String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
    String newEntry = String.format("%s,%s:%d", oldEntry, component,
        ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  } else {
    String newEntry = String.format("%s:%d", component, ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  }
}
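A minimal sketch of building the same "id:bytes,id:bytes" value in one pass from a list of entries, rather than appending to an existing string. RamMapFormatter is a hypothetical name, and rejecting ids that contain ',' or ':' is an assumption about what would break the format.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that renders a component-to-bytes map as "id:bytes,...".
final class RamMapFormatter {
  static String format(Map<String, Long> ramBytesById) {
    return ramBytesById.entrySet().stream()
        .map(e -> {
          String id = e.getKey();
          // The delimiters are reserved, so an id containing them would corrupt the value.
          if (id.isEmpty() || id.contains(",") || id.contains(":")) {
            throw new IllegalArgumentException("invalid component id: " + id);
          }
          return id + ":" + e.getValue();
        })
        .collect(Collectors.joining(","));
  }
}
```

Passing a LinkedHashMap keeps the entries in the order they appear in the YAML file.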
I'm glad you sent this email as it got me thinking about the above spec
that Karthik mentioned. I've copied his spec below
config:
  topology.workers: 2
  topology.component.resourcemap:
    - id: "component-1"
      ram: 1234MB
      cpu: 0.5
      disk: 123MB
    - id: "component-2"
      ram: 2345MB
      cpu: 0.75
      disk: 4GB
I think disk and CPU resources are allocated at the topology level and
would not be applicable here, unless there is a way to specify them through
the Heron Config class. After looking at the docs here
https://twitter.github.io/heron/docs/developers/tuning/ and at the Heron
Config class, I don't see a way to specify these at the component level.
I do see that there is a way to pass arbitrary configuration up to Heron.
Can I set these values via a `prepare()` or `open()` call?
One last note while thinking about this: `setComponentJvmOptions()` behaves
in a similar way. I believe I would handle that field in the same manner.
public static void setComponentJvmOptions(
    Map<String, Object> conf,
    String component,
    String jvmOptions) {
  String optsBase64;
  String componentBase64;
  optsBase64 = DatatypeConverter.printBase64Binary(
      jvmOptions.getBytes(StandardCharsets.UTF_8));
  componentBase64 = DatatypeConverter.printBase64Binary(
      component.getBytes(StandardCharsets.UTF_8));
  String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
  String newEntry;
  if (oldEntry == null) {
    newEntry = String.format("{\"%s\":\"%s\"}", componentBase64, optsBase64);
  } else {
    // To remove the '{' at the start and '}' at the end
    oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
    newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
        componentBase64, optsBase64);
  }
  // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
  // {
  //   "componentNameAInBase64": "jvmOptionsInBase64",
  //   "componentNameBInBase64": "jvmOptionsInBase64"
  // }
  conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);
}
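A small sketch of the Base64 step in isolation, to show what one entry of that JSON map looks like. JvmOptsCodec is a hypothetical name; it uses java.util.Base64 in place of DatatypeConverter only because javax.xml.bind is no longer bundled with the JDK from Java 11 onward. That swap is my substitution, not what Heron does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper mirroring the encoding used for TOPOLOGY_COMPONENT_JVMOPTS.
final class JvmOptsCodec {
  static String encode(String text) {
    return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
  }

  static String decode(String base64) {
    return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
  }

  // Renders one "componentInBase64":"optionsInBase64" pair, without the outer braces.
  static String entry(String component, String jvmOptions) {
    return String.format("\"%s\":\"%s\"", encode(component), encode(jvmOptions));
  }
}
```

Because the Base64 alphabet contains no quotes or backslashes, the pair can be embedded in the JSON map without further escaping.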
If I've missed something please let me know.
-Josh
On Mon, Jan 22, 2018 at 12:02 PM, Ning Wang <wa...@gmail.com> wrote:
> LGTM. And I like the 123MB more than separating value and unit into two
> settings.
>
> Quick questions:
> This new config will replace the existing topology.component.rammap?
> "the way ECO handles topology configuration will not work for all
> configuration types". Can you give a more specific example?
>
> Thanks.
Re: Proposing Changes To ECO
Posted by Ning Wang <wa...@gmail.com>.
LGTM. And I like the 123MB form more than separating value and unit into
two settings.
Quick questions:
Will this new config replace the existing topology.component.rammap?
"the way ECO handles topology configuration will not work for all
configuration types". Can you give a more specific example?
Thanks.
On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <ka...@streaml.io>
wrote:
> Josh -
>
> One more feedback - since the resources assigned can be CPU, RAM, DISK -
> instead of calling it
>
> topology.component.rammap
>
> can we call it
>
> topology.component.resourcemap
>
> and allow for CPU and DISK. Furthermore, we append the size type into the
> metric as follows
>
> config:
> topology.workers: 2
> topology.component.resourcemap:
>
> - id: "component-1"
> ram: 1234MB
> cpu: 0.5
> disk: 123MB
>
> - id: "component-2"
> ram: 2345MB
> cpu: 0.75
> disk: 4GB
>
> This will make it easier to read and also flexible, thoughts?
>
> cheers
> /karthik
Re: Proposing Changes To ECO
Posted by Karthik Ramasamy <ka...@streaml.io>.
Josh -
One more piece of feedback: since the resources assigned can be CPU, RAM,
and disk, instead of calling it
topology.component.rammap
can we call it
topology.component.resourcemap
and allow for CPU and disk? Furthermore, we could append the unit to the
value, as follows:
config:
  topology.workers: 2
  topology.component.resourcemap:
    - id: "component-1"
      ram: 1234MB
      cpu: 0.5
      disk: 123MB
    - id: "component-2"
      ram: 2345MB
      cpu: 0.75
      disk: 4GB
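A minimal sketch of parsing values written in this suffixed form, such as 1234MB or 4GB. SizeParser is a hypothetical name, and the accepted suffixes (B, KB, MB, GB) and 1024-based multiples are assumptions; the actual implementation would likely hand the result to ByteAmount.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for "<number><unit>" strings such as "123MB".
final class SizeParser {
  private static final Pattern SIZE =
      Pattern.compile("^\\s*(\\d+)\\s*(B|KB|MB|GB)\\s*$", Pattern.CASE_INSENSITIVE);

  static long parseBytes(String text) {
    Matcher m = SIZE.matcher(text);
    if (!m.matches()) {
      throw new IllegalArgumentException("expected <number><B|KB|MB|GB>, got: " + text);
    }
    long value = Long.parseLong(m.group(1));
    long multiplier;
    switch (m.group(2).toUpperCase(Locale.ROOT)) {
      case "KB": multiplier = 1024L; break;
      case "MB": multiplier = 1024L * 1024L; break;
      case "GB": multiplier = 1024L * 1024L * 1024L; break;
      default: multiplier = 1L; // plain bytes
    }
    return Math.multiplyExact(value, multiplier);
  }
}
```

The cpu values (0.5, 0.75) carry no unit and would be read as plain numbers, so they do not go through this parser.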
This would make it easier to read and also more flexible. Thoughts?
cheers
/karthik
On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io> wrote:
> To All,
>
> I think I made a mistake in my previous email
>
> config:
> topology.workers: 2
> topology.component.rammap:
> - "some-id": 1234
> - "other-id": 6789
>
>
> I think the yaml above is incorrect as well as other examples. I think we
> would have to do something like below
>
>
> config:
> topology.workers: 2
> topology.component.rammap:
> - "some-id:1234"
> - "other-id:6789"
>
> Which would then product a list of strings that would match the way the
> topology_component_rammap is set via other apis. The problem with this
> approach is it would be easy for someone to make a mistake within the
> formatting of the strings and would then cause us to have to validate the
> format to fit the specs. I think the approach below would be better. I
> would then just take the input, do some validation and conversion via the
> ByteAmount class and generate a properly formatted string to fit the specs
> of the topology_component_rammap values.
>
> config:
> topology.workers: 2
> topology.component.rammap:
>
> - id: "component-1"
> size: 1234
> type: MB // Megabytes
>
> - id: "component-2"
> size: 6789
> type: GB // GigaBytes
>
> - id: "component-3"
> size: 123456789
> type: B // Bytes
>
>
>
> Hope I was clear with trying to explain things. Of course I will also be
> creating the docs as well to explain usage.
>
> -Josh
Re: Proposing Changes To ECO
Posted by Josh Fischer <jo...@joshfischer.io>.
To All,
I think I made a mistake in my previous email:
config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id": 1234
    - "other-id": 6789
I think the YAML above is incorrect, as are the other examples. I think we
would have to do something like the following:
config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id:1234"
    - "other-id:6789"
This would then produce a list of strings matching the way
topology_component_rammap is set via the other APIs. The problem with this
approach is that it would be easy for someone to make a mistake in the
formatting of the strings, which would force us to validate the format
against the spec. I think the approach below would be better. I would take
the input, do some validation and conversion via the ByteAmount class, and
generate a properly formatted string that fits the spec of the
topology_component_rammap values.
config:
  topology.workers: 2
  topology.component.rammap:
    - id: "component-1"
      size: 1234
      type: MB  # Megabytes
    - id: "component-2"
      size: 6789
      type: GB  # Gigabytes
    - id: "component-3"
      size: 123456789
      type: B   # Bytes
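For comparison, a sketch of the validation that the string-list form ("some-id:1234") would require, which is the extra work the structured form is meant to avoid. RamMapEntryParser is a hypothetical name, and the specific checks (positive whole number, no duplicate ids) are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical validator for entries of the form "id:bytes".
final class RamMapEntryParser {
  static Map<String, Long> parse(List<String> entries) {
    Map<String, Long> result = new LinkedHashMap<>();
    for (String entry : entries) {
      int sep = entry.lastIndexOf(':');
      // Reject a missing separator, an empty id, or an empty value.
      if (sep <= 0 || sep == entry.length() - 1) {
        throw new IllegalArgumentException("expected \"id:bytes\", got: " + entry);
      }
      String id = entry.substring(0, sep).trim();
      long bytes;
      try {
        bytes = Long.parseLong(entry.substring(sep + 1).trim());
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException("RAM is not a whole number in: " + entry, e);
      }
      if (id.isEmpty() || bytes <= 0) {
        throw new IllegalArgumentException("invalid entry: " + entry);
      }
      if (result.put(id, bytes) != null) {
        throw new IllegalArgumentException("duplicate component id: " + id);
      }
    }
    return result;
  }
}
```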
I hope I was clear in explaining things. Of course, I will also create the
docs to explain usage.
-Josh
On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io> wrote:
> All,
>
> While working with Karthik, we have discovered that the way ECO handles
> topology configuration will not work for all configuration types. To be
> specific, setting individual component's ram will not work. We will also
> have to keep in mind container size that contains the components. My
> proposal is this:
>
> Create a standardized way to allow for the configuring of component ram
> size in the "config" section of the eco yaml file. This would be a list
> of key value pairs that mapped the "id" of a component to an allocated ram
> size in MB. An example is below:
>
> config:
> topology.workers: 2
> topology.component.rammap:
> - "some-id": 1234
> - "other-id": 6789
>
> However the above implementation may be unclear when it comes to
> understanding what unit of measurement is implicitly specified and/or
> expected. Or we could do something like below.
>
> config:
> topology.workers: 2
> topology.component.rammap:
> - spec:
> id: "component-1"
> size: 1234
> type: MB // Megabytes
> - spec:
> id: "component-2"
> size: 6789
> type: GB // GigaBytes
> - spec:
> id: "component-3"
> size: 123456789
> type: B // Bytes
>
>
> If a mapping is not specified for a component, we can just assume Heron's
> defaults. We could then dynamically calculate the container size based off
> of the number components and their corresponding allocated resources for
> simplicity of use for the user, but still allow them to specify a custom
> set of resources to a container like below
>
> topology.container.disk: 1234
> topology.container.ram: 3456
> topology.container.cpu: 2
>
>
> It may be best if I reused the ByteAmount object to calculate resource
> size to remain consistent with the other Heron APIs. Any concerns or
> improvements to this approach I am missing?
>
> Please Advise,
>
> Josh
>