You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@heron.apache.org by Josh Fischer <jo...@joshfischer.io> on 2018/01/21 14:21:29 UTC

Proposing Changes To ECO

All,

While working with Karthik, we have discovered that the way ECO handles
topology configuration will not work for all configuration types.  To be
specific, setting individual component's ram will not work.  We will also
have to keep in mind container size that contains the components.  My
proposal is this:

Create a standardized way to allow for the  configuring of  component ram
size in the "config" section of the  eco yaml file.  This would be a list
of key value pairs that mapped the "id" of a component to an allocated ram
size in MB.  An example is below:

config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id": 1234
    - "other-id": 6789

However the  above implementation may be unclear when it comes to
understanding what unit of measurement is implicitly specified and/or
expected.  Or we could do something like below.

config:
  topology.workers: 2
  topology.component.rammap:
    - spec:
       id: "component-1"
       size: 1234
       type: MB  // Megabytes
    - spec:
       id: "component-2"
       size: 6789
       type: GB // GigaBytes
    - spec:
       id: "component-3"
       size: 123456789
       type: B // Bytes


If a mapping is not specified for a component, we can just assume Heron's
defaults.  We could then dynamically calculate the container size based off
of the number components and their corresponding allocated resources for
simplicity of use for the user, but still allow them to specify a custom
set of resources to a container like below

topology.container.disk: 1234
topology.container.ram: 3456
topology.container.cpu: 2


It may be best if I reused the ByteAmount object to calculate resource size
to remain consistent with the other Heron APIs.  Any concerns or
improvements to this approach I am missing?

Please Advise,

Josh

Re: Proposing Changes To ECO

Posted by Josh Fischer <jo...@joshfischer.io>.

It seems my proposal wasn't fully thought through.  I think your suggestion
is the better way to go.

+1

-Josh

On Mon, Jan 22, 2018 at 11:33 AM, Karthik Ramasamy <ka...@streaml.io>
wrote:

> Josh -
>
> One more feedback - since the resources assigned can be CPU, RAM, DISK -
> instead of calling it
>
> topology.component.rammap
>
> can we call it
>
> topology.component.resourcemap
>
> and allow for CPU and DISK. Furthermore, we append the size type into the
> metric as follows
>
> config:
>   topology.workers: 2
>   topology.component.resourcemap:
>
>     - id: "component-1"
>       ram: 1234MB
>       cpu: 0.5
>       disk: 123MB
>
>    - id: "component-2"
>      ram: 2345MB
>      cpu: 0.75
>      disk: 4GB
>
> This will make it easier to read and also flexible, thoughts?
>
> cheers
> /karthik
>
>
>
> cheers
> /karthik
>
> On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io> wrote:
>
> > To All,
> >
> > I think I made  a mistake in my previous email
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - "some-id": 1234
> >     - "other-id": 6789
> >
> >
> > I think the yaml above is incorrect as well as other examples.  I think
> we
> > would have to do something like below
> >
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - "some-id:1234"
> >     - "other-id:6789"
> >
> > Which would then product a list of strings that would match the way the
> > topology_component_rammap is set via other apis.  The problem with this
> > approach is it would be easy for someone to make a mistake within the
> > formatting of the strings and would then cause us to have to validate the
> > format to fit the specs.  I think the approach below would be better.  I
> > would then just take the input, do some validation and conversion via the
> > ByteAmount class and generate a properly formatted string to fit the
> specs
> > of the topology_component_rammap values.
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >
> >     - id: "component-1"
> >       size: 1234
> >       type: MB  // Megabytes
> >
> >     - id: "component-2"
> >       size: 6789
> >       type: GB // GigaBytes
> >
> >     - id: "component-3"
> >       size: 123456789
> >       type: B // Bytes
> >
> >
> >
> > Hope I was clear with trying to explain things.  Of course I will also be
> > creating the docs as well to explain usage.
> >
> > -Josh
> >
> > On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io>
> wrote:
> >
> > > All,
> > >
> > > While working with Karthik, we have discovered that the way ECO handles
> > > topology configuration will not work for all configuration types.  To
> be
> > > specific, setting individual component's ram will not work.  We will
> also
> > > have to keep in mind container size that contains the components.  My
> > > proposal is this:
> > >
> > > Create a standardized way to allow for the  configuring of  component
> ram
> > > size in the "config" section of the  eco yaml file.  This would be a
> list
> > > of key value pairs that mapped the "id" of a component to an allocated
> > ram
> > > size in MB.  An example is below:
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id": 1234
> > >     - "other-id": 6789
> > >
> > > However the  above implementation may be unclear when it comes to
> > > understanding what unit of measurement is implicitly specified and/or
> > > expected.  Or we could do something like below.
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - spec:
> > >        id: "component-1"
> > >        size: 1234
> > >        type: MB  // Megabytes
> > >     - spec:
> > >        id: "component-2"
> > >        size: 6789
> > >        type: GB // GigaBytes
> > >     - spec:
> > >        id: "component-3"
> > >        size: 123456789
> > >        type: B // Bytes
> > >
> > >
> > > If a mapping is not specified for a component, we can just assume
> Heron's
> > > defaults.  We could then dynamically calculate the container size based
> > off
> > > of the number components and their corresponding allocated resources
> for
> > > simplicity of use for the user, but still allow them to specify a
> custom
> > > set of resources to a container like below
> > >
> > > topology.container.disk: 1234
> > > topology.container.ram: 3456
> > > topology.container.cpu: 2
> > >
> > >
> > > It may be best if I reused the ByteAmount object to calculate resource
> > > size to remain consistent with the other Heron APIs.  Any concerns or
> > > improvements to this approach I am missing?
> > >
> > > Please Advise,
> > >
> > > Josh
> > >
> >
>

Re: Proposing Changes To ECO

Posted by Ning Wang <wa...@gmail.com>.

Got it. Thanks! It makes more sense now. :)

On Mon, Jan 22, 2018 at 1:51 PM, Josh Fischer <jo...@joshfischer.io> wrote:

> Ning,
>
> In my email I was thinking specifically of setting the componentRam.  This
> is case the value is a comma delimited string value which would be easy to
> incorrectly format the list of values to be appended.   An image to
> reference is below.  So by passing in a list of values, I could then
> correctly format the value String as we would expect.
>
> public static void setComponentRam(Map<String, Object> conf,
>                                    String component, ByteAmount
> ramInBytes) {
>   if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
>     String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
>     String newEntry = String.format("%s,%s:%d", oldEntry, component,
> ramInBytes.asBytes());
>     conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
>   } else {
>     String newEntry = String.format("%s:%d", component,
> ramInBytes.asBytes());
>     conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
>   }
> }
>
> I'm glad you sent this email as it got me thinking about the above spec
> that Karthik mentioned.  I've copied his spec below
>
>
> config:
>   topology.workers: 2
>   topology.component.resourcemap:
>
>     - id: "component-1"
>       ram: 1234MB
>       cpu: 0.5
>       disk: 123MB
>
>    - id: "component-2"
>      ram: 2345MB
>      cpu: 0.75
>      disk: 4GB
>
> I think disk and cpu resources are allocated at a topology level and would
> not be applicable here.  Unless there is a way that you specify this
> through the Heron Config class?..  After looking at the docs here
> https://twitter.github.io/heron/docs/developers/tuning/ and looking at the
> Heron Config class, I don't see way to specify these at a component level.
> I do see there is a way to pass any configuration up to Heron, can I set
> this values via a `prepare()` or `open()` call?
>
> One last note while thinking about this.  `setComponentJvmOptions()` has a
> similar behavior.  I would have this do the same for this field too I
> believe
>
>
> public static void setComponentJvmOptions(
>     Map<String, Object> conf,
>     String component,
>     String jvmOptions) {
>   String optsBase64;
>   String componentBase64;
>
>   optsBase64 = DatatypeConverter.printBase64Binary(
>       jvmOptions.getBytes(StandardCharsets.UTF_8));
>   componentBase64 = DatatypeConverter.printBase64Binary(
>       component.getBytes(StandardCharsets.UTF_8));
>
>   String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
>   String newEntry;
>   if (oldEntry == null) {
>     newEntry = String.format("{\"%s\":\"%s\"}", componentBase64,
> optsBase64);
>   } else {
>     // To remove the '{' at the start and '}' at the end
>     oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
>     newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
> componentBase64, optsBase64);
>   }
>   // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
>   //  {
>   //     "componentNameAInBase64": "jvmOptionsInBase64",
>   //     "componentNameBInBase64": "jvmOptionsInBase64"
>   //  }
>   conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);
>
> }
>
>
>
> If I've missed something please let me know.
>
> -Josh
>
>
> On Mon, Jan 22, 2018 at 12:02 PM, Ning Wang <wa...@gmail.com> wrote:
>
> > LGTM. And I like the 123MB  more than separating value and unit into two
> > settings.
> >
> > Quick questions:
> > This new config will replace the existing topology.component.rammap?
> > "the way ECO handles topology configuration will not work for all
> > configuration types". Can you give a more specific example?
> >
> > Thanks.
> >
> >
> >
> >
> >
> > On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <ka...@streaml.io>
> > wrote:
> >
> > > Josh -
> > >
> > > One more feedback - since the resources assigned can be CPU, RAM, DISK
> -
> > > instead of calling it
> > >
> > > topology.component.rammap
> > >
> > > can we call it
> > >
> > > topology.component.resourcemap
> > >
> > > and allow for CPU and DISK. Furthermore, we append the size type into
> the
> > > metric as follows
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.resourcemap:
> > >
> > >     - id: "component-1"
> > >       ram: 1234MB
> > >       cpu: 0.5
> > >       disk: 123MB
> > >
> > >    - id: "component-2"
> > >      ram: 2345MB
> > >      cpu: 0.75
> > >      disk: 4GB
> > >
> > > This will make it easier to read and also flexible, thoughts?
> > >
> > > cheers
> > > /karthik
> > >
> > >
> > >
> > > cheers
> > > /karthik
> > >
> > > On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io>
> > wrote:
> > >
> > > > To All,
> > > >
> > > > I think I made  a mistake in my previous email
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - "some-id": 1234
> > > >     - "other-id": 6789
> > > >
> > > >
> > > > I think the yaml above is incorrect as well as other examples.  I
> think
> > > we
> > > > would have to do something like below
> > > >
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - "some-id:1234"
> > > >     - "other-id:6789"
> > > >
> > > > Which would then product a list of strings that would match the way
> the
> > > > topology_component_rammap is set via other apis.  The problem with
> this
> > > > approach is it would be easy for someone to make a mistake within the
> > > > formatting of the strings and would then cause us to have to validate
> > the
> > > > format to fit the specs.  I think the approach below would be better.
> > I
> > > > would then just take the input, do some validation and conversion via
> > the
> > > > ByteAmount class and generate a properly formatted string to fit the
> > > specs
> > > > of the topology_component_rammap values.
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >
> > > >     - id: "component-1"
> > > >       size: 1234
> > > >       type: MB  // Megabytes
> > > >
> > > >     - id: "component-2"
> > > >       size: 6789
> > > >       type: GB // GigaBytes
> > > >
> > > >     - id: "component-3"
> > > >       size: 123456789
> > > >       type: B // Bytes
> > > >
> > > >
> > > >
> > > > Hope I was clear with trying to explain things.  Of course I will
> also
> > be
> > > > creating the docs as well to explain usage.
> > > >
> > > > -Josh
> > > >
> > > > On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io>
> > > wrote:
> > > >
> > > > > All,
> > > > >
> > > > > While working with Karthik, we have discovered that the way ECO
> > handles
> > > > > topology configuration will not work for all configuration types.
> To
> > > be
> > > > > specific, setting individual component's ram will not work.  We
> will
> > > also
> > > > > have to keep in mind container size that contains the components.
> My
> > > > > proposal is this:
> > > > >
> > > > > Create a standardized way to allow for the  configuring of
> component
> > > ram
> > > > > size in the "config" section of the  eco yaml file.  This would be
> a
> > > list
> > > > > of key value pairs that mapped the "id" of a component to an
> > allocated
> > > > ram
> > > > > size in MB.  An example is below:
> > > > >
> > > > > config:
> > > > >   topology.workers: 2
> > > > >   topology.component.rammap:
> > > > >     - "some-id": 1234
> > > > >     - "other-id": 6789
> > > > >
> > > > > However the  above implementation may be unclear when it comes to
> > > > > understanding what unit of measurement is implicitly specified
> and/or
> > > > > expected.  Or we could do something like below.
> > > > >
> > > > > config:
> > > > >   topology.workers: 2
> > > > >   topology.component.rammap:
> > > > >     - spec:
> > > > >        id: "component-1"
> > > > >        size: 1234
> > > > >        type: MB  // Megabytes
> > > > >     - spec:
> > > > >        id: "component-2"
> > > > >        size: 6789
> > > > >        type: GB // GigaBytes
> > > > >     - spec:
> > > > >        id: "component-3"
> > > > >        size: 123456789
> > > > >        type: B // Bytes
> > > > >
> > > > >
> > > > > If a mapping is not specified for a component, we can just assume
> > > Heron's
> > > > > defaults.  We could then dynamically calculate the container size
> > based
> > > > off
> > > > > of the number components and their corresponding allocated
> resources
> > > for
> > > > > simplicity of use for the user, but still allow them to specify a
> > > custom
> > > > > set of resources to a container like below
> > > > >
> > > > > topology.container.disk: 1234
> > > > > topology.container.ram: 3456
> > > > > topology.container.cpu: 2
> > > > >
> > > > >
> > > > > It may be best if I reused the ByteAmount object to calculate
> > resource
> > > > > size to remain consistent with the other Heron APIs.  Any concerns
> or
> > > > > improvements to this approach I am missing?
> > > > >
> > > > > Please Advise,
> > > > >
> > > > > Josh
> > > > >
> > > >
> > >
> >
>

Re: Proposing Changes To ECO

Posted by Josh Fischer <jo...@joshfischer.io>.

Ning,

In my email I was thinking specifically of setting the componentRam.  This
is case the value is a comma delimited string value which would be easy to
incorrectly format the list of values to be appended.   An image to
reference is below.  So by passing in a list of values, I could then
correctly format the value String as we would expect.

public static void setComponentRam(Map<String, Object> conf,
                                   String component, ByteAmount ramInBytes) {
  if (conf.containsKey(Config.TOPOLOGY_COMPONENT_RAMMAP)) {
    String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_RAMMAP);
    String newEntry = String.format("%s,%s:%d", oldEntry, component,
ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  } else {
    String newEntry = String.format("%s:%d", component, ramInBytes.asBytes());
    conf.put(Config.TOPOLOGY_COMPONENT_RAMMAP, newEntry);
  }
}

I'm glad you sent this email as it got me thinking about the above spec
that Karthik mentioned.  I've copied his spec below


config:
  topology.workers: 2
  topology.component.resourcemap:

    - id: "component-1"
      ram: 1234MB
      cpu: 0.5
      disk: 123MB

   - id: "component-2"
     ram: 2345MB
     cpu: 0.75
     disk: 4GB

I think disk and cpu resources are allocated at a topology level and would
not be applicable here.  Unless there is a way that you specify this
through the Heron Config class?..  After looking at the docs here
https://twitter.github.io/heron/docs/developers/tuning/ and looking at the
Heron Config class, I don't see way to specify these at a component level.
I do see there is a way to pass any configuration up to Heron, can I set
this values via a `prepare()` or `open()` call?

One last note while thinking about this.  `setComponentJvmOptions()` has a
similar behavior.  I would have this do the same for this field too I
believe


public static void setComponentJvmOptions(
    Map<String, Object> conf,
    String component,
    String jvmOptions) {
  String optsBase64;
  String componentBase64;

  optsBase64 = DatatypeConverter.printBase64Binary(
      jvmOptions.getBytes(StandardCharsets.UTF_8));
  componentBase64 = DatatypeConverter.printBase64Binary(
      component.getBytes(StandardCharsets.UTF_8));

  String oldEntry = (String) conf.get(Config.TOPOLOGY_COMPONENT_JVMOPTS);
  String newEntry;
  if (oldEntry == null) {
    newEntry = String.format("{\"%s\":\"%s\"}", componentBase64, optsBase64);
  } else {
    // To remove the '{' at the start and '}' at the end
    oldEntry = oldEntry.substring(1, oldEntry.length() - 1);
    newEntry = String.format("{%s,\"%s\":\"%s\"}", oldEntry,
componentBase64, optsBase64);
  }
  // Format for TOPOLOGY_COMPONENT_JVMOPTS would be a json map like this:
  //  {
  //     "componentNameAInBase64": "jvmOptionsInBase64",
  //     "componentNameBInBase64": "jvmOptionsInBase64"
  //  }
  conf.put(Config.TOPOLOGY_COMPONENT_JVMOPTS, newEntry);

}



If I've missed something please let me know.

-Josh


On Mon, Jan 22, 2018 at 12:02 PM, Ning Wang <wa...@gmail.com> wrote:

> LGTM. And I like the 123MB  more than separating value and unit into two
> settings.
>
> Quick questions:
> This new config will replace the existing topology.component.rammap?
> "the way ECO handles topology configuration will not work for all
> configuration types". Can you give a more specific example?
>
> Thanks.
>
>
>
>
>
> On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <ka...@streaml.io>
> wrote:
>
> > Josh -
> >
> > One more feedback - since the resources assigned can be CPU, RAM, DISK -
> > instead of calling it
> >
> > topology.component.rammap
> >
> > can we call it
> >
> > topology.component.resourcemap
> >
> > and allow for CPU and DISK. Furthermore, we append the size type into the
> > metric as follows
> >
> > config:
> >   topology.workers: 2
> >   topology.component.resourcemap:
> >
> >     - id: "component-1"
> >       ram: 1234MB
> >       cpu: 0.5
> >       disk: 123MB
> >
> >    - id: "component-2"
> >      ram: 2345MB
> >      cpu: 0.75
> >      disk: 4GB
> >
> > This will make it easier to read and also flexible, thoughts?
> >
> > cheers
> > /karthik
> >
> >
> >
> > cheers
> > /karthik
> >
> > On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io>
> wrote:
> >
> > > To All,
> > >
> > > I think I made  a mistake in my previous email
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id": 1234
> > >     - "other-id": 6789
> > >
> > >
> > > I think the yaml above is incorrect as well as other examples.  I think
> > we
> > > would have to do something like below
> > >
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id:1234"
> > >     - "other-id:6789"
> > >
> > > Which would then product a list of strings that would match the way the
> > > topology_component_rammap is set via other apis.  The problem with this
> > > approach is it would be easy for someone to make a mistake within the
> > > formatting of the strings and would then cause us to have to validate
> the
> > > format to fit the specs.  I think the approach below would be better.
> I
> > > would then just take the input, do some validation and conversion via
> the
> > > ByteAmount class and generate a properly formatted string to fit the
> > specs
> > > of the topology_component_rammap values.
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >
> > >     - id: "component-1"
> > >       size: 1234
> > >       type: MB  // Megabytes
> > >
> > >     - id: "component-2"
> > >       size: 6789
> > >       type: GB // GigaBytes
> > >
> > >     - id: "component-3"
> > >       size: 123456789
> > >       type: B // Bytes
> > >
> > >
> > >
> > > Hope I was clear with trying to explain things.  Of course I will also
> be
> > > creating the docs as well to explain usage.
> > >
> > > -Josh
> > >
> > > On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io>
> > wrote:
> > >
> > > > All,
> > > >
> > > > While working with Karthik, we have discovered that the way ECO
> handles
> > > > topology configuration will not work for all configuration types.  To
> > be
> > > > specific, setting individual component's ram will not work.  We will
> > also
> > > > have to keep in mind container size that contains the components.  My
> > > > proposal is this:
> > > >
> > > > Create a standardized way to allow for the  configuring of  component
> > ram
> > > > size in the "config" section of the  eco yaml file.  This would be a
> > list
> > > > of key value pairs that mapped the "id" of a component to an
> allocated
> > > ram
> > > > size in MB.  An example is below:
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - "some-id": 1234
> > > >     - "other-id": 6789
> > > >
> > > > However the  above implementation may be unclear when it comes to
> > > > understanding what unit of measurement is implicitly specified and/or
> > > > expected.  Or we could do something like below.
> > > >
> > > > config:
> > > >   topology.workers: 2
> > > >   topology.component.rammap:
> > > >     - spec:
> > > >        id: "component-1"
> > > >        size: 1234
> > > >        type: MB  // Megabytes
> > > >     - spec:
> > > >        id: "component-2"
> > > >        size: 6789
> > > >        type: GB // GigaBytes
> > > >     - spec:
> > > >        id: "component-3"
> > > >        size: 123456789
> > > >        type: B // Bytes
> > > >
> > > >
> > > > If a mapping is not specified for a component, we can just assume
> > Heron's
> > > > defaults.  We could then dynamically calculate the container size
> based
> > > off
> > > > of the number components and their corresponding allocated resources
> > for
> > > > simplicity of use for the user, but still allow them to specify a
> > custom
> > > > set of resources to a container like below
> > > >
> > > > topology.container.disk: 1234
> > > > topology.container.ram: 3456
> > > > topology.container.cpu: 2
> > > >
> > > >
> > > > It may be best if I reused the ByteAmount object to calculate
> resource
> > > > size to remain consistent with the other Heron APIs.  Any concerns or
> > > > improvements to this approach I am missing?
> > > >
> > > > Please Advise,
> > > >
> > > > Josh
> > > >
> > >
> >
>

Re: Proposing Changes To ECO

Posted by Ning Wang <wa...@gmail.com>.

LGTM. And I like the 123MB  more than separating value and unit into two
settings.

Quick questions:
This new config will replace the existing topology.component.rammap?
"the way ECO handles topology configuration will not work for all
configuration types". Can you give a more specific example?

Thanks.





On Mon, Jan 22, 2018 at 9:33 AM, Karthik Ramasamy <ka...@streaml.io>
wrote:

> Josh -
>
> One more feedback - since the resources assigned can be CPU, RAM, DISK -
> instead of calling it
>
> topology.component.rammap
>
> can we call it
>
> topology.component.resourcemap
>
> and allow for CPU and DISK. Furthermore, we append the size type into the
> metric as follows
>
> config:
>   topology.workers: 2
>   topology.component.resourcemap:
>
>     - id: "component-1"
>       ram: 1234MB
>       cpu: 0.5
>       disk: 123MB
>
>    - id: "component-2"
>      ram: 2345MB
>      cpu: 0.75
>      disk: 4GB
>
> This will make it easier to read and also flexible, thoughts?
>
> cheers
> /karthik
>
>
>
> cheers
> /karthik
>
> On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io> wrote:
>
> > To All,
> >
> > I think I made  a mistake in my previous email
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - "some-id": 1234
> >     - "other-id": 6789
> >
> >
> > I think the yaml above is incorrect as well as other examples.  I think
> we
> > would have to do something like below
> >
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - "some-id:1234"
> >     - "other-id:6789"
> >
> > Which would then product a list of strings that would match the way the
> > topology_component_rammap is set via other apis.  The problem with this
> > approach is it would be easy for someone to make a mistake within the
> > formatting of the strings and would then cause us to have to validate the
> > format to fit the specs.  I think the approach below would be better.  I
> > would then just take the input, do some validation and conversion via the
> > ByteAmount class and generate a properly formatted string to fit the
> specs
> > of the topology_component_rammap values.
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >
> >     - id: "component-1"
> >       size: 1234
> >       type: MB  // Megabytes
> >
> >     - id: "component-2"
> >       size: 6789
> >       type: GB // GigaBytes
> >
> >     - id: "component-3"
> >       size: 123456789
> >       type: B // Bytes
> >
> >
> >
> > Hope I was clear with trying to explain things.  Of course I will also be
> > creating the docs as well to explain usage.
> >
> > -Josh
> >
> > On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io>
> wrote:
> >
> > > All,
> > >
> > > While working with Karthik, we have discovered that the way ECO handles
> > > topology configuration will not work for all configuration types.  To
> be
> > > specific, setting individual component's ram will not work.  We will
> also
> > > have to keep in mind container size that contains the components.  My
> > > proposal is this:
> > >
> > > Create a standardized way to allow for the  configuring of  component
> ram
> > > size in the "config" section of the  eco yaml file.  This would be a
> list
> > > of key value pairs that mapped the "id" of a component to an allocated
> > ram
> > > size in MB.  An example is below:
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - "some-id": 1234
> > >     - "other-id": 6789
> > >
> > > However the  above implementation may be unclear when it comes to
> > > understanding what unit of measurement is implicitly specified and/or
> > > expected.  Or we could do something like below.
> > >
> > > config:
> > >   topology.workers: 2
> > >   topology.component.rammap:
> > >     - spec:
> > >        id: "component-1"
> > >        size: 1234
> > >        type: MB  // Megabytes
> > >     - spec:
> > >        id: "component-2"
> > >        size: 6789
> > >        type: GB // GigaBytes
> > >     - spec:
> > >        id: "component-3"
> > >        size: 123456789
> > >        type: B // Bytes
> > >
> > >
> > > If a mapping is not specified for a component, we can just assume
> Heron's
> > > defaults.  We could then dynamically calculate the container size based
> > off
> > > of the number components and their corresponding allocated resources
> for
> > > simplicity of use for the user, but still allow them to specify a
> custom
> > > set of resources to a container like below
> > >
> > > topology.container.disk: 1234
> > > topology.container.ram: 3456
> > > topology.container.cpu: 2
> > >
> > >
> > > It may be best if I reused the ByteAmount object to calculate resource
> > > size to remain consistent with the other Heron APIs.  Any concerns or
> > > improvements to this approach I am missing?
> > >
> > > Please Advise,
> > >
> > > Josh
> > >
> >
>

Re: Proposing Changes To ECO

Posted by Karthik Ramasamy <ka...@streaml.io>.

Josh -

One more feedback - since the resources assigned can be CPU, RAM, DISK -
instead of calling it

topology.component.rammap

can we call it

topology.component.resourcemap

and allow for CPU and DISK. Furthermore, we append the size type into the
metric as follows

config:
  topology.workers: 2
  topology.component.resourcemap:

    - id: "component-1"
      ram: 1234MB
      cpu: 0.5
      disk: 123MB

   - id: "component-2"
     ram: 2345MB
     cpu: 0.75
     disk: 4GB

This will make it easier to read and also flexible, thoughts?

cheers
/karthik



cheers
/karthik

On Sun, Jan 21, 2018 at 6:18 PM, Josh Fischer <jo...@joshfischer.io> wrote:

> To All,
>
> I think I made  a mistake in my previous email
>
> config:
>   topology.workers: 2
>   topology.component.rammap:
>     - "some-id": 1234
>     - "other-id": 6789
>
>
> I think the yaml above is incorrect as well as other examples.  I think we
> would have to do something like below
>
>
> config:
>   topology.workers: 2
>   topology.component.rammap:
>     - "some-id:1234"
>     - "other-id:6789"
>
> Which would then product a list of strings that would match the way the
> topology_component_rammap is set via other apis.  The problem with this
> approach is it would be easy for someone to make a mistake within the
> formatting of the strings and would then cause us to have to validate the
> format to fit the specs.  I think the approach below would be better.  I
> would then just take the input, do some validation and conversion via the
> ByteAmount class and generate a properly formatted string to fit the specs
> of the topology_component_rammap values.
>
> config:
>   topology.workers: 2
>   topology.component.rammap:
>
>     - id: "component-1"
>       size: 1234
>       type: MB  // Megabytes
>
>     - id: "component-2"
>       size: 6789
>       type: GB // GigaBytes
>
>     - id: "component-3"
>       size: 123456789
>       type: B // Bytes
>
>
>
> Hope I was clear with trying to explain things.  Of course I will also be
> creating the docs as well to explain usage.
>
> -Josh
>
> On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io> wrote:
>
> > All,
> >
> > While working with Karthik, we have discovered that the way ECO handles
> > topology configuration will not work for all configuration types.  To be
> > specific, setting individual component's ram will not work.  We will also
> > have to keep in mind container size that contains the components.  My
> > proposal is this:
> >
> > Create a standardized way to allow for the  configuring of  component ram
> > size in the "config" section of the  eco yaml file.  This would be a list
> > of key value pairs that mapped the "id" of a component to an allocated
> ram
> > size in MB.  An example is below:
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - "some-id": 1234
> >     - "other-id": 6789
> >
> > However the  above implementation may be unclear when it comes to
> > understanding what unit of measurement is implicitly specified and/or
> > expected.  Or we could do something like below.
> >
> > config:
> >   topology.workers: 2
> >   topology.component.rammap:
> >     - spec:
> >        id: "component-1"
> >        size: 1234
> >        type: MB  // Megabytes
> >     - spec:
> >        id: "component-2"
> >        size: 6789
> >        type: GB // GigaBytes
> >     - spec:
> >        id: "component-3"
> >        size: 123456789
> >        type: B // Bytes
> >
> >
> > If a mapping is not specified for a component, we can just assume Heron's
> > defaults.  We could then dynamically calculate the container size based
> off
> > of the number components and their corresponding allocated resources for
> > simplicity of use for the user, but still allow them to specify a custom
> > set of resources to a container like below
> >
> > topology.container.disk: 1234
> > topology.container.ram: 3456
> > topology.container.cpu: 2
> >
> >
> > It may be best if I reused the ByteAmount object to calculate resource
> > size to remain consistent with the other Heron APIs.  Any concerns or
> > improvements to this approach I am missing?
> >
> > Please Advise,
> >
> > Josh
> >
>

Re: Proposing Changes To ECO

Posted by Josh Fischer <jo...@joshfischer.io>.

To All,

I think I made  a mistake in my previous email

config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id": 1234
    - "other-id": 6789


I think the yaml above is incorrect as well as other examples.  I think we
would have to do something like below


config:
  topology.workers: 2
  topology.component.rammap:
    - "some-id:1234"
    - "other-id:6789"

Which would then product a list of strings that would match the way the
topology_component_rammap is set via other apis.  The problem with this
approach is it would be easy for someone to make a mistake within the
formatting of the strings and would then cause us to have to validate the
format to fit the specs.  I think the approach below would be better.  I
would then just take the input, do some validation and conversion via the
ByteAmount class and generate a properly formatted string to fit the specs
of the topology_component_rammap values.

config:
  topology.workers: 2
  topology.component.rammap:

    - id: "component-1"
      size: 1234
      type: MB  // Megabytes

    - id: "component-2"
      size: 6789
      type: GB // GigaBytes

    - id: "component-3"
      size: 123456789
      type: B // Bytes



Hope I was clear with trying to explain things.  Of course I will also be
creating the docs as well to explain usage.

-Josh

On Sun, Jan 21, 2018 at 8:21 AM, Josh Fischer <jo...@joshfischer.io> wrote:

> All,
>
> While working with Karthik, we have discovered that the way ECO handles
> topology configuration will not work for all configuration types.  To be
> specific, setting individual component's ram will not work.  We will also
> have to keep in mind container size that contains the components.  My
> proposal is this:
>
> Create a standardized way to allow for the  configuring of  component ram
> size in the "config" section of the  eco yaml file.  This would be a list
> of key value pairs that mapped the "id" of a component to an allocated ram
> size in MB.  An example is below:
>
> config:
>   topology.workers: 2
>   topology.component.rammap:
>     - "some-id": 1234
>     - "other-id": 6789
>
> However the  above implementation may be unclear when it comes to
> understanding what unit of measurement is implicitly specified and/or
> expected.  Or we could do something like below.
>
> config:
>   topology.workers: 2
>   topology.component.rammap:
>     - spec:
>        id: "component-1"
>        size: 1234
>        type: MB  // Megabytes
>     - spec:
>        id: "component-2"
>        size: 6789
>        type: GB // GigaBytes
>     - spec:
>        id: "component-3"
>        size: 123456789
>        type: B // Bytes
>
>
> If a mapping is not specified for a component, we can just assume Heron's
> defaults.  We could then dynamically calculate the container size based off
> of the number components and their corresponding allocated resources for
> simplicity of use for the user, but still allow them to specify a custom
> set of resources to a container like below
>
> topology.container.disk: 1234
> topology.container.ram: 3456
> topology.container.cpu: 2
>
>
> It may be best if I reused the ByteAmount object to calculate resource
> size to remain consistent with the other Heron APIs.  Any concerns or
> improvements to this approach I am missing?
>
> Please Advise,
>
> Josh
>