Posted to dev@apex.apache.org by Pramod Immaneni <pr...@datatorrent.com> on 2017/12/19 16:32:40 UTC

[Proposal] Simulate setting for application launch

I have a mini proposal. The command get-app-package-info runs the
populateDAG method of an application to construct the DAG but does not
actually launch the DAG. An application developer does not know in which
context populateDAG is being called. For example, if they are recording
application starts in an external system from populateDAG, they will have
false entries there. This can be solved in different ways, such as
introducing another method in StreamingApplication or adding more parameters
to populateDAG, but a non-disruptive option would be to add a property to
the configuration object that is passed to populateDAG to indicate whether
it is being called in simulate/test mode or for a real launch. An
application developer can use this property to take the appropriate actions.
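
A minimal sketch of how an application might use such a property, under
stated assumptions: java.util.Properties stands in for the configuration
object, a list of names stands in for the DAG, and the key
"example.simulate" is hypothetical (no name had been chosen at this point
in the thread). It is an illustration of the idea, not the Apex API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Self-contained stand-in for an application; not the real Apex API.
class SimulateAwareApp {
    // Hypothetical property key, chosen only for this illustration.
    static final String SIMULATE_KEY = "example.simulate";

    // Stand-in for the external system that records application starts.
    static final List<String> externalStartLog = new ArrayList<>();

    // Stand-in for populateDAG: always builds the same operators, but
    // records an application start only when the call is not simulated.
    static List<String> populateDag(Properties conf) {
        List<String> operators = new ArrayList<>();
        operators.add("input");
        operators.add("output");

        // An absent property is treated as a real launch.
        boolean simulate =
            Boolean.parseBoolean(conf.getProperty(SIMULATE_KEY, "false"));
        if (!simulate) {
            externalStartLog.add("application started");
        }
        return operators;
    }

    public static void main(String[] args) {
        Properties infoCall = new Properties();
        infoCall.setProperty(SIMULATE_KEY, "true");
        populateDag(infoCall);          // e.g. a get-app-package-info call
        populateDag(new Properties());  // a real launch, flag absent
        System.out.println("recorded starts: " + externalStartLog.size());
    }
}
```

In this sketch the DAG is the same in both modes; only the side effect is
skipped when the flag is set.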

Thanks

Re: [Proposal] Simulate setting for application launch

Posted by Vlad Rozov <vr...@apache.org>.

I still have a concern about introducing an assumption that populateDAG is
used for anything other than constructing an application DAG. The only
use case provided does not justify changing a primary API (even though
it is changed without breaking semantic versioning). I would prefer an
alternate solution, so -0.5.

Thank you,

Vlad

On 12/22/17 11:50, Pramod Immaneni wrote:
> On Fri, Dec 22, 2017 at 8:19 AM, Vlad Rozov <vr...@apache.org> wrote:
>
>> I don't see more complexity in implementing a plugin compared to
>> implementing an application. Additionally, for the use case you mention,
>> plugin is a better option, as likely the behavior applies not to a single
>> application, but all applications in that environment.
>>
> It would be developing a plugin in addition to an application as opposed to
> doing something directly in the application. Also what they may want to do
> may not be general enough or have enough reuse to justify developing a
> plugin. Our users typically build applications and not plugins so I would
> say most who have this need would not build a plugin but would do this
> directly in the application.
>
>
>> Websocket gateway address is a configuration parameter. A DAG may change
>> depending on configuration parameters (presence of a gateway, hadoop
>> version/vendor, security being enabled or disabled), but it should not
>> change depending whether DAG is populated for a launch or to get an info.
>>
> The configuration does not prompt the user to return a different DAG and
> even with plugins some kind of configuration hint is needed. There is no
> formal definition of what the method should and shouldn't do and attempt to
> define the method to only construct a DAG and not do anything else is not
> only retrospective but restrictive, for example should I not be able to
> connect to a syslog server and log something. What you are saying on the
> behavior w.r.t the population of the DAG with varying properties and the
> like is good practice. Like I mentioned earlier, users have always had full
> control of what they want to do in populateDAG method and what DAG they
> want to return and the platform does not particularly care what DAG is
> returned. It does not enforce nor rely that DAGs returned by multiple calls
> to populateDAG be the same DAG.

Re: [Proposal] Simulate setting for application launch

Posted by Pramod Immaneni <pr...@datatorrent.com>.

On Fri, Dec 22, 2017 at 8:19 AM, Vlad Rozov <vr...@apache.org> wrote:

> I don't see more complexity in implementing a plugin compared to
> implementing an application. Additionally, for the use case you mention,
> plugin is a better option, as likely the behavior applies not to a single
> application, but all applications in that environment.
>

It would mean developing a plugin in addition to an application, as opposed
to doing something directly in the application. Also, what they may want to
do may not be general enough or have enough reuse to justify developing a
plugin. Our users typically build applications and not plugins, so I would
say most who have this need would not build a plugin but would do this
directly in the application.


> Websocket gateway address is a configuration parameter. A DAG may change
> depending on configuration parameters (presence of a gateway, hadoop
> version/vendor, security being enabled or disabled), but it should not
> change depending whether DAG is populated for a launch or to get an info.
>

The configuration does not prompt the user to return a different DAG, and
even with plugins some kind of configuration hint is needed. There is no
formal definition of what the method should and shouldn't do, and an attempt
to define the method to only construct a DAG and not do anything else is not
only retrospective but restrictive. For example, should I not be able to
connect to a syslog server and log something? What you are saying about the
behavior w.r.t. the population of the DAG with varying properties and the
like is good practice. As I mentioned earlier, users have always had full
control of what they want to do in the populateDAG method and what DAG they
want to return, and the platform does not particularly care what DAG is
returned. It neither enforces nor relies on the DAGs returned by multiple
calls to populateDAG being the same DAG.



Re: [Proposal] Simulate setting for application launch

Posted by Vlad Rozov <vr...@apache.org>.

I don't see more complexity in implementing a plugin compared to
implementing an application. Additionally, for the use case you mention, a
plugin is a better option, as the behavior likely applies not to a
single application but to all applications in that environment.

The websocket gateway address is a configuration parameter. A DAG may change
depending on configuration parameters (presence of a gateway, hadoop
version/vendor, security being enabled or disabled), but it should not
change depending on whether the DAG is populated for a launch or to get info.
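
The invariant described above could be checked with something like the
following sketch. Everything in it is an assumption made for illustration:
java.util.Properties stands in for the configuration, a list of operator
names stands in for the DAG, and "example.simulate" is a hypothetical flag
name. It only compares what a builder returns for the two modes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.function.Function;

// Illustration only: compares the DAG description built in both modes.
class DagConsistencyCheck {
    // Hypothetical flag name used only in this sketch.
    static final String MODE_KEY = "example.simulate";

    // Copies the base configuration and sets the mode flag on the copy.
    static Properties withMode(Properties base, boolean simulate) {
        Properties conf = new Properties();
        conf.putAll(base);
        conf.setProperty(MODE_KEY, Boolean.toString(simulate));
        return conf;
    }

    // True when the builder returns an equal DAG description for a launch
    // call and for an info-only call made with otherwise equal config.
    static boolean sameDagInBothModes(
            Function<Properties, List<String>> builder, Properties base) {
        List<String> forLaunch = builder.apply(withMode(base, false));
        List<String> forInfo = builder.apply(withMode(base, true));
        return forLaunch.equals(forInfo);
    }

    public static void main(String[] args) {
        Function<Properties, List<String>> stable =
            conf -> Arrays.asList("input", "output");
        Function<Properties, List<String>> modeDependent = conf ->
            Boolean.parseBoolean(conf.getProperty(MODE_KEY))
                ? Arrays.asList("input")
                : Arrays.asList("input", "output");

        System.out.println(sameDagInBothModes(stable, new Properties()));
        System.out.println(sameDagInBothModes(modeDependent, new Properties()));
    }
}
```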

Thank you,

Vlad

On 12/21/17 10:05, Pramod Immaneni wrote:
> Asking users to create plugins for something they want to do in their
> application logic is to do things in an indirect and cumbersome way with an
> added level of complexity. I don't think users will elect to do that. There
> is a reason populateDAG and the operators give users the flexibility they
> do today to have any custom logic they want. populateDAG isn't only for
> returning a constant DAG for an application, the configuration that is
> passed today to populateDAG, apart from hadoop environmental properties
> that could be considered constant also includes a variable component, which
> is the user customizable configuration properties. There are already
> examples of applications that have used these properties to do something
> different. Apart from the properties, some attributes are also injected
> into the DAG in a deliberate fashion by the platform to provide user with
> these so they can create the dag accordingly. One example is a websocket
> gateway address. If this is present applications create a websocket output
> operator else they end up create a console or some other output operator.

Re: [Proposal] Simulate setting for application launch

Posted by Pramod Immaneni <pr...@datatorrent.com>.

Asking users to create plugins for something they want to do in their
application logic is to do things in an indirect and cumbersome way with an
added level of complexity. I don't think users will elect to do that. There
is a reason populateDAG and the operators give users the flexibility they
do today to have any custom logic they want. populateDAG isn't only for
returning a constant DAG for an application. The configuration that is
passed today to populateDAG, apart from hadoop environmental properties
that could be considered constant, also includes a variable component, which
is the user-customizable configuration properties. There are already
examples of applications that have used these properties to do something
different. Apart from the properties, some attributes are also injected
into the DAG in a deliberate fashion by the platform so that users can
create the DAG accordingly. One example is a websocket gateway address. If
it is present, applications create a websocket output operator; otherwise
they end up creating a console or some other output operator.
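
The gateway example can be sketched roughly as follows. This is a
stand-alone illustration: the Optional<String> argument stands in for the
gateway address attribute, and the returned labels are hypothetical
placeholders for the operators an application would add, not real Apex
classes.

```java
import java.util.Optional;

// Illustration only: picks an output operator based on whether a
// gateway address was made available to the application.
class OutputOperatorChoice {
    // Returns a label for the output operator an application might add.
    static String chooseOutput(Optional<String> gatewayAddress) {
        return gatewayAddress
            .filter(address -> !address.trim().isEmpty())
            .map(address -> "websocket-output(" + address + ")")
            .orElse("console-output");
    }

    public static void main(String[] args) {
        // The address below is a made-up example value.
        System.out.println(chooseOutput(Optional.of("gateway.example.org:9090")));
        System.out.println(chooseOutput(Optional.empty()));
    }
}
```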

On Thu, Dec 21, 2017 at 8:27 AM, Vlad Rozov <vr...@apache.org> wrote:

> "Sometimes" is not a use case. Config is not a context.
>
> Without concrete use cases the proposed change is not well justified.
> populateDAG() is supposed to populate DAG, not to record anything in an
> external system. It was a design goal for plugins.
>
> Thank you,
>
> Vlad
>
>
> On 12/20/17 02:23, Priyanka Gugale wrote:
>
>> +1
>> Sometimes this context is required. We shouldn't change any default
>> behaviour other than making this config available.
>>
>> -Priyanka
>>
>>
>> On Wed, Dec 20, 2017 at 5:32 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> The external system recording was just an example, not a specific use
>>> case. The idea is to provide comprehensive information to populateDAG
>>> as to the context it is being called under. It is akin to the test mode
>>> or simulate flag that you see with various utilities. The platform
>>> cannot control what populateDAG does, even without this information, in
>>> multiple calls that you mention the application can return different
>>> DAGs by depending on any external factor such as time of day or some
>>> external variable. This is to merely provide more context information
>>> in the config. It is upto the application to do what it wishes with it.
>>>
>>> On Tue, Dec 19, 2017 at 2:28 PM, Vlad Rozov <vr...@apache.org> wrote:
>>>
>>>> -0.5: populateDAG() may be called by the platform as many times as it
>>>> needs (even in case it calls it only once now to launch an
>>>> application). Passing different parameters to populateDAG() in
>>>> simulate launch mode and actual launch may lead to different DAG being
>>>> constructed for those two modes. Can't the use case you described be
>>>> handled by a plugin?
>>>>
>>>> Thank you,
>>>>
>>>> Vlad
>>>>
>>>>
>>>> On 12/19/17 10:06, Sanjay Pujare wrote:
>>>>
>>>>> +1 although I prefer something that is more enforceable. So I like
>>>>> the idea of another method but that introduces incompatibility so may
>>>>> be in 4.0?
>>>>>
>>>>> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
>>>>> amberarrow@yahoo.com.invalid> wrote:
>>>>>
>>>>>> +1
>>>>>>
>>>>>> Ram
>>>>>>
>>>>>> On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
>>>>>> pramod@datatorrent.com> wrote:
>>>>>>
>>>>>> I have a mini proposal. The command get-app-package-info runs the
>>>>>> populateDAG method of an application to construct the DAG but does
>>>>>> not actually launch the DAG. An application developer does not know
>>>>>> in which context the populateDAG is being called. For example, if
>>>>>> they are recording application starts in an external system from
>>>>>> populateDAG, they will have false entries there. This can be solved
>>>>>> in different ways such as introducing another method in
>>>>>> StreamingApplication or more parameters to populateDAG but a non
>>>>>> disruptive option would be to add a property in the configuration
>>>>>> object that is passed to populateDAG to indicate if it is
>>>>>> simulate/test mode or real launch. An application developer can use
>>>>>> this property to take the appropriate actions.
>>>>>>
>>>>>> Thanks

Re: [Proposal] Simulate setting for application launch

Posted by Pramod Immaneni <pr...@datatorrent.com>.

I am thinking of a boolean property named "apex.run.simulate"
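
A rough sketch of how the flag could be set and read, assuming the
platform sets it on a copy of the user configuration before calling
populateDAG. Only the property name comes from this message; the class,
the helper names and the use of java.util.Properties in place of the
Hadoop configuration object are assumptions for illustration. An absent
property is read as false, so existing applications would see no change
in default behaviour.

```java
import java.util.Properties;

// Illustration only; the helpers here are hypothetical, not Apex API.
class SimulateFlag {
    // Property name as proposed in this message.
    static final String SIMULATE_PROPERTY = "apex.run.simulate";

    // Returns a copy of the user configuration with the flag set for the
    // kind of call being made; the caller's configuration is not modified.
    static Properties forCall(Properties userConf, boolean simulate) {
        Properties conf = new Properties();
        conf.putAll(userConf);
        conf.setProperty(SIMULATE_PROPERTY, Boolean.toString(simulate));
        return conf;
    }

    // Reads the flag; a missing property means a real launch.
    static boolean isSimulate(Properties conf) {
        return Boolean.parseBoolean(
            conf.getProperty(SIMULATE_PROPERTY, "false"));
    }

    public static void main(String[] args) {
        Properties userConf = new Properties();
        userConf.setProperty("my.app.setting", "42"); // made-up user property

        System.out.println(isSimulate(userConf));                 // flag absent
        System.out.println(isSimulate(forCall(userConf, true)));  // info call
        System.out.println(isSimulate(forCall(userConf, false))); // launch
    }
}
```

With the Hadoop configuration object the read would presumably be a
getBoolean call with a default of false, but that is not settled in this
thread.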

On Thu, Dec 21, 2017 at 9:05 AM, Sanjay Pujare <sa...@datatorrent.com>
wrote:

> It is relatively easy to describe the justification for this change without
> getting into the weeds and hairsplitting words.
>
> A DAG is built not only to launch an application but also to let a user
> visualize and configure it. Currently "populateDAG" is the only method we
> require application writers to implement and they implement it with the
> goal of running the application. So it can use properties, configuration
> and code that is really only needed if you want to "run" the DAG.
>
> As mentioned above a perfectly valid use case is that a platform allows a
> user to construct a DAG, visualize it and then attach configuration values
> to various components in the DAG, save these values as some kind of a
> "configuration package" and then at a future date run the DAG with this
> setup. This is consistent with the view that construction of a pipeline and
> execution of the pipeline are 2 separate phases and should be delineated as
> such.
>
> If you understand and agree with the justification we can work on improving
> the original proposal.
>
> Sanjay

Re: [Proposal] Simulate setting for application launch

Posted by Sanjay Pujare <sa...@datatorrent.com>.

It is relatively easy to describe the justification for this change without
getting into the weeds and splitting hairs over words.

A DAG is built not only to launch an application but also to let a user
visualize and configure it. Currently "populateDAG" is the only method we
require application writers to implement, and they implement it with the
goal of running the application. So it can use properties, configuration
and code that are really only needed if you want to "run" the DAG.

As mentioned above, a perfectly valid use case is that a platform allows a
user to construct a DAG, visualize it, attach configuration values
to various components in the DAG, save these values as some kind of a
"configuration package" and then at a future date run the DAG with this
setup. This is consistent with the view that construction of a pipeline and
execution of the pipeline are two separate phases and should be delineated
as such.

If you understand and agree with the justification we can work on improving
the original proposal.

Sanjay


On Thu, Dec 21, 2017 at 8:27 AM, Vlad Rozov <vr...@apache.org> wrote:

> "Sometimes" is not a use case. Config is not a context.
>
> Without concrete use cases the proposed change is not well justified.
> populateDAG() is supposed to populate DAG, not to record anything in an
> external system. It was a design goal for plugins.
>
> Thank you,
>
> Vlad
>
>
> On 12/20/17 02:23, Priyanka Gugale wrote:
>
>> +1
>> Sometimes this context is required. We shouldn't change any default
>> behaviour other than making this config available.
>>
>> -Priyanka
>>
>>
>>
>> On Wed, Dec 20, 2017 at 5:32 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> The external system recording was just an example, not a specific use case.
>>> The idea is to provide comprehensive information to populateDAG as to the
>>> context it is being called under. It is akin to the test mode or simulate
>>> flag that you see with various utilities. The platform cannot control what
>>> populateDAG does, even without this information, in multiple calls that you
>>> mention the application can return different DAGs by depending on
>>> any external factor such as time of day or some external variable. This is
>>> to merely provide more context information in the config. It is up to the
>>> application to do what it wishes with it.
>>>
>>> On Tue, Dec 19, 2017 at 2:28 PM, Vlad Rozov <vr...@apache.org> wrote:
>>>
>>> -0.5: populateDAG() may be called by the platform as many times as it
>>>> needs (even in case it calls it only once now to launch an application).
>>>> Passing different parameters to populateDAG() in simulate launch mode
>>>> and
>>>> actual launch may lead to different DAG being constructed for those two
>>>> modes. Can't the use case you described be handled by a plugin?
>>>>
>>>> Thank you,
>>>>
>>>> Vlad
>>>>
>>>>
>>>> On 12/19/17 10:06, Sanjay Pujare wrote:
>>>>
>>>> +1 although I prefer something that is more enforceable. So I like the
>>>>> idea
>>>>> of another method but that introduces incompatibility so may be in 4.0?
>>>>>
>>>>> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
>>>>> amberarrow@yahoo.com.invalid> wrote:
>>>>>
>>>>>    +1
>>>>>
>>>>>> Ram
>>>>>>       On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
>>>>>> pramod@datatorrent.com> wrote:
>>>>>>
>>>>>> I have a mini proposal. The command get-app-package-info runs the
>>>>>> populateDAG method of an application to construct the DAG but does not
>>>>>> actually launch the DAG. An application developer does not know in which
>>>>>> context the populateDAG is being called. For example, if they are recording
>>>>>> application starts in an external system from populateDAG, they will have
>>>>>> false entries there. This can be solved in different ways such as
>>>>>> introducing another method in StreamingApplication or more parameters
>>>>>> to populateDAG but a non disruptive option would be to add a property in
>>>>>> the configuration object that is passed to populateDAG to indicate if it is
>>>>>> simulate/test mode or real launch. An application developer can use this
>>>>>> property to take the appropriate actions.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>

Re: [Proposal] Simulate setting for application launch

Posted by Vlad Rozov <vr...@apache.org>.

"Sometimes" is not a use case. Config is not a context.

Without concrete use cases, the proposed change is not well justified.
populateDAG() is supposed to populate the DAG, not to record anything in an
external system. Recording of that kind was a design goal for plugins.

Thank you,

Vlad

On 12/20/17 02:23, Priyanka Gugale wrote:
> +1
> Sometimes this context is required. We shouldn't change any default
> behaviour other than making this config available.
>
> -Priyanka
>
>
>
> On Wed, Dec 20, 2017 at 5:32 AM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> The external system recording was just an example, not a specific use case.
>> The idea is to provide comprehensive information to populateDAG as to the
>> context it is being called under. It is akin to the test mode or simulate
>> flag that you see with various utilities. The platform cannot control what
>> populateDAG does, even without this information, in multiple calls that you
>> mention the application can return different DAGs by depending on
>> any external factor such as time of day or some external variable. This is
>> to merely provide more context information in the config. It is up to the
>> application to do what it wishes with it.
>>
>> On Tue, Dec 19, 2017 at 2:28 PM, Vlad Rozov <vr...@apache.org> wrote:
>>
>>> -0.5: populateDAG() may be called by the platform as many times as it
>>> needs (even in case it calls it only once now to launch an application).
>>> Passing different parameters to populateDAG() in simulate launch mode and
>>> actual launch may lead to different DAG being constructed for those two
>>> modes. Can't the use case you described be handled by a plugin?
>>>
>>> Thank you,
>>>
>>> Vlad
>>>
>>>
>>> On 12/19/17 10:06, Sanjay Pujare wrote:
>>>
>>>> +1 although I prefer something that is more enforceable. So I like the
>>>> idea
>>>> of another method but that introduces incompatibility so may be in 4.0?
>>>>
>>>> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
>>>> amberarrow@yahoo.com.invalid> wrote:
>>>>
>>>>    +1
>>>>> Ram
>>>>>       On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
>>>>> pramod@datatorrent.com> wrote:
>>>>>
>>>>> I have a mini proposal. The command get-app-package-info runs the
>>>>> populateDAG method of an application to construct the DAG but does not
>>>>> actually launch the DAG. An application developer does not know in which
>>>>> context the populateDAG is being called. For example, if they are recording
>>>>> application starts in an external system from populateDAG, they will have
>>>>> false entries there. This can be solved in different ways such as
>>>>> introducing another method in StreamingApplication or more parameters
>>>>> to populateDAG but a non disruptive option would be to add a property in
>>>>> the configuration object that is passed to populateDAG to indicate if it is
>>>>> simulate/test mode or real launch. An application developer can use this
>>>>> property to take the appropriate actions.
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>>

Re: [Proposal] Simulate setting for application launch

Posted by Priyanka Gugale <pr...@apache.org>.

+1
Sometimes this context is required. We shouldn't change any default
behaviour other than making this config available.

-Priyanka



On Wed, Dec 20, 2017 at 5:32 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> The external system recording was just an example, not a specific use case.
> The idea is to provide comprehensive information to populateDAG as to the
> context it is being called under. It is akin to the test mode or simulate
> flag that you see with various utilities. The platform cannot control what
> populateDAG does, even without this information, in multiple calls that you
> mention the application can return different DAGs by depending on
> any external factor such as time of day or some external variable. This is
> to merely provide more context information in the config. It is up to the
> application to do what it wishes with it.
>
> On Tue, Dec 19, 2017 at 2:28 PM, Vlad Rozov <vr...@apache.org> wrote:
>
> > -0.5: populateDAG() may be called by the platform as many times as it
> > needs (even in case it calls it only once now to launch an application).
> > Passing different parameters to populateDAG() in simulate launch mode and
> > actual launch may lead to different DAG being constructed for those two
> > modes. Can't the use case you described be handled by a plugin?
> >
> > Thank you,
> >
> > Vlad
> >
> >
> > On 12/19/17 10:06, Sanjay Pujare wrote:
> >
> >> +1 although I prefer something that is more enforceable. So I like the
> >> idea
> >> of another method but that introduces incompatibility so may be in 4.0?
> >>
> >> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
> >> amberarrow@yahoo.com.invalid> wrote:
> >>
> >>   +1
> >>> Ram
> >>>      On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
> >>> pramod@datatorrent.com> wrote:
> >>>
> >>> I have a mini proposal. The command get-app-package-info runs the
> >>> populateDAG method of an application to construct the DAG but does not
> >>> actually launch the DAG. An application developer does not know in which
> >>> context the populateDAG is being called. For example, if they are recording
> >>> application starts in an external system from populateDAG, they will have
> >>> false entries there. This can be solved in different ways such as
> >>> introducing another method in StreamingApplication or more parameters
> >>> to populateDAG but a non disruptive option would be to add a property in
> >>> the configuration object that is passed to populateDAG to indicate if it is
> >>> simulate/test mode or real launch. An application developer can use this
> >>> property to take the appropriate actions.
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >
>

Re: [Proposal] Simulate setting for application launch

Posted by Pramod Immaneni <pr...@datatorrent.com>.

The external system recording was just an example, not a specific use case.
The idea is to provide comprehensive information to populateDAG as to the
context it is being called under. It is akin to the test mode or simulate
flag that you see with various utilities. The platform cannot control what
populateDAG does. Even without this information, across the multiple calls
that you mention, the application can return different DAGs by depending on
any external factor such as time of day or some external variable. This is
merely to provide more context information in the config. It is up to the
application to do what it wishes with it.
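
As a rough sketch of how an application might consume such a flag: the code
below is self-contained and uses stand-ins, so a Map plays the role of the
Hadoop Configuration that populateDAG receives and SketchDag plays the role
of the Apex DAG. The property name "example.launch.simulate" is purely
hypothetical; no name has been agreed on in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for the Apex DAG so that the sketch compiles on its own.
final class SketchDag {
    final List<String> operators = new ArrayList<>();

    void addOperator(String name) {
        operators.add(name);
    }
}

final class SimulateAwareApp {
    // Hypothetical property name; an assumption made for this sketch.
    static final String SIMULATE_KEY = "example.launch.simulate";

    // Stands in for the external system from the example in the proposal.
    final List<String> externalLog = new ArrayList<>();

    // Mirrors the shape of populateDAG(dag, conf), with a Map as the config.
    void populateDAG(SketchDag dag, Map<String, String> conf) {
        // The DAG itself is built identically in both modes.
        dag.addOperator("input");
        dag.addOperator("output");

        // A missing property is treated as a real launch.
        boolean simulate = Boolean.parseBoolean(conf.getOrDefault(SIMULATE_KEY, "false"));
        if (!simulate) {
            externalLog.add("application start recorded");
        }
    }

    public static void main(String[] args) {
        SimulateAwareApp app = new SimulateAwareApp();

        Map<String, String> simulateConf = new HashMap<>();
        simulateConf.put(SIMULATE_KEY, "true");
        app.populateDAG(new SketchDag(), simulateConf);
        System.out.println("entries after simulate call: " + app.externalLog.size());

        app.populateDAG(new SketchDag(), new HashMap<>());
        System.out.println("entries after real launch: " + app.externalLog.size());
    }
}
```

Only the side effect is guarded by the flag here; the operators added to the
DAG do not depend on it.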

On Tue, Dec 19, 2017 at 2:28 PM, Vlad Rozov <vr...@apache.org> wrote:

> -0.5: populateDAG() may be called by the platform as many times as it
> needs (even in case it calls it only once now to launch an application).
> Passing different parameters to populateDAG() in simulate launch mode and
> actual launch may lead to different DAG being constructed for those two
> modes. Can't the use case you described be handled by a plugin?
>
> Thank you,
>
> Vlad
>
>
> On 12/19/17 10:06, Sanjay Pujare wrote:
>
>> +1 although I prefer something that is more enforceable. So I like the
>> idea
>> of another method but that introduces incompatibility so may be in 4.0?
>>
>> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
>> amberarrow@yahoo.com.invalid> wrote:
>>
>>   +1
>>> Ram
>>>      On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
>>> pramod@datatorrent.com> wrote:
>>>
>>>   I have a mini proposal. The command get-app-package-info runs the
>>> populateDAG method of an application to construct the DAG but does not
>>> actually launch the DAG. An application developer does not know in which
>>> context the populateDAG is being called. For example, if they are
>>> recording
>>> application starts in an external system from populateDAG, they will have
>>> false entries there. This can be solved in different ways such as
>>> introducing another method in StreamingApplication or more parameters
>>> to populateDAG but a non disruptive option would be to add a property in
>>> the configuration object that is passed to populateDAG to indicate if it
>>> is
>>> simulate/test mode or real launch. An application developer can use this
>>> property to take the appropriate actions.
>>>
>>> Thanks
>>>
>>>
>>>
>

Re: [Proposal] Simulate setting for application launch

Posted by Vlad Rozov <vr...@apache.org>.

-0.5: populateDAG() may be called by the platform as many times as it
needs (even if it currently calls it only once to launch an application).
Passing different parameters to populateDAG() in simulate launch mode
and actual launch may lead to a different DAG being constructed for those
two modes. Can't the use case you described be handled by a plugin?
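
To make the concern concrete, here is a small hypothetical check that an
application's own tests could run: call the populate step once per mode and
compare what was built. It is only a sketch under stated assumptions. A DAG
is modelled as a list of operator names, the mode as a boolean, and
DagModeCheck and sameInBothModes are invented names, not part of Apex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical helper; a DAG is modelled as a plain list of operator names.
final class DagModeCheck {

    // Runs the populate step once per mode and reports whether both
    // runs produced the same operators in the same order.
    static boolean sameInBothModes(BiConsumer<List<String>, Boolean> populate) {
        List<String> simulated = new ArrayList<>();
        List<String> launched = new ArrayList<>();
        populate.accept(simulated, Boolean.TRUE);
        populate.accept(launched, Boolean.FALSE);
        return simulated.equals(launched);
    }

    public static void main(String[] args) {
        // Builds the same operators regardless of the flag.
        BiConsumer<List<String>, Boolean> stable = (ops, simulate) -> {
            ops.add("input");
            ops.add("output");
        };
        // Adds an extra operator only on a real launch, so the DAGs diverge.
        BiConsumer<List<String>, Boolean> divergent = (ops, simulate) -> {
            ops.add("input");
            if (!simulate) {
                ops.add("audit");
            }
            ops.add("output");
        };
        System.out.println(sameInBothModes(stable));
        System.out.println(sameInBothModes(divergent));
    }
}
```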

Thank you,

Vlad

On 12/19/17 10:06, Sanjay Pujare wrote:
> +1 although I prefer something that is more enforceable. So I like the idea
> of another method but that introduces incompatibility so may be in 4.0?
>
> On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
> amberarrow@yahoo.com.invalid> wrote:
>
>>   +1
>> Ram
>>      On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
>> pramod@datatorrent.com> wrote:
>>
>>   I have a mini proposal. The command get-app-package-info runs the
>> populateDAG method of an application to construct the DAG but does not
>> actually launch the DAG. An application developer does not know in which
>> context the populateDAG is being called. For example, if they are recording
>> application starts in an external system from populateDAG, they will have
>> false entries there. This can be solved in different ways such as
>> introducing another method in StreamingApplication or more parameters
>> to populateDAG but a non disruptive option would be to add a property in
>> the configuration object that is passed to populateDAG to indicate if it is
>> simulate/test mode or real launch. An application developer can use this
>> property to take the appropriate actions.
>>
>> Thanks
>>
>>

Re: [Proposal] Simulate setting for application launch

Posted by Sanjay Pujare <sa...@datatorrent.com>.

+1 although I prefer something that is more enforceable. So I like the idea
of another method but that introduces incompatibility so may be in 4.0?

On Tue, Dec 19, 2017 at 9:40 AM, Munagala Ramanath <
amberarrow@yahoo.com.invalid> wrote:

>  +1
> Ram
>     On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <
> pramod@datatorrent.com> wrote:
>
>  I have a mini proposal. The command get-app-package-info runs the
> populateDAG method of an application to construct the DAG but does not
> actually launch the DAG. An application developer does not know in which
> context the populateDAG is being called. For example, if they are recording
> application starts in an external system from populateDAG, they will have
> false entries there. This can be solved in different ways such as
> introducing another method in StreamingApplication or more parameters
> to populateDAG but a non disruptive option would be to add a property in
> the configuration object that is passed to populateDAG to indicate if it is
> simulate/test mode or real launch. An application developer can use this
> property to take the appropriate actions.
>
> Thanks
>
>

Re: [Proposal] Simulate setting for application launch

Posted by Munagala Ramanath <am...@yahoo.com.INVALID>.

 +1
Ram
    On Tuesday, December 19, 2017, 8:33:21 AM PST, Pramod Immaneni <pr...@datatorrent.com> wrote:  
 
 I have a mini proposal. The command get-app-package-info runs the
populateDAG method of an application to construct the DAG but does not
actually launch the DAG. An application developer does not know in which
context the populateDAG is being called. For example, if they are recording
application starts in an external system from populateDAG, they will have
false entries there. This can be solved in different ways such as
introducing another method in StreamingApplication or more parameters
to populateDAG but a non disruptive option would be to add a property in
the configuration object that is passed to populateDAG to indicate if it is
simulate/test mode or real launch. An application developer can use this
property to take the appropriate actions.

Thanks