
Posted to dev@airavata.apache.org by Saminda Wijeratne <sa...@gmail.com> on 2014/06/24 01:30:27 UTC

Enabling Workflow Support through Orchestrator

With a few updates to the Orchestrator CPI, we are going ahead with
updating the workflow interpreter to support workflow executions in
Airavata for the 0.13 release, as shown in the attached diagram.


[image: Inline image 1]

Re: Enabling Workflow Support through Orchestrator

Posted by Saminda Wijeratne <sa...@gmail.com>.

hi Marlon,

Your understanding of the choices is correct.

Choice 2: Once the implementation figures out from the id whether it is
a workflow or an application, the "ExecType" is inferred automatically,
without the user having to provide it as in Choice 1.

Choice 4: I'm a little hesitant to introduce a new structure because, other
than the fact that we need to specify either the workflow template id or
the application id, everything else remains the same under the current data
model; i.e., users do not need to query twice to get experiment data or
summary data for workflows and applications. Ideally it would be nice to
have ApplicationExperiment and WorkflowExperiment, both extending
Experiment, but Thrift doesn't support inheritance. wdyt?
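
To make the Choice 2 inference concrete, here is a minimal sketch in Python
rather than Thrift. It assumes the implementation can look the id up in an
application registry and a workflow template registry; the two sets below
and the function name infer_exec_type are hypothetical stand-ins, not
existing Airavata code.

```python
from dataclasses import dataclass
from enum import Enum


class ExecType(Enum):
    SINGLE_APP = "SINGLE_APP"
    WORKFLOW = "WORKFLOW"


@dataclass(frozen=True)
class Experiment:
    name: str
    executable_id: str  # application id or workflow template id


# Hypothetical stand-ins for the application and workflow template registries.
APPLICATION_IDS = {"app-echo"}
WORKFLOW_TEMPLATE_IDS = {"wf-echo-chain"}


def infer_exec_type(experiment: Experiment) -> ExecType:
    """Infer the execution type from the id alone, as Choice 2 proposes."""
    is_app = experiment.executable_id in APPLICATION_IDS
    is_workflow = experiment.executable_id in WORKFLOW_TEMPLATE_IDS
    if is_app and is_workflow:
        # Inference only works if the two id spaces never collide.
        raise ValueError(f"ambiguous id: {experiment.executable_id}")
    if is_app:
        return ExecType.SINGLE_APP
    if is_workflow:
        return ExecType.WORKFLOW
    raise LookupError(f"unknown id: {experiment.executable_id}")
```

The sketch also shows the cost of this choice: the inference is only safe
if application ids and workflow template ids can never collide.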




On Tue, Jun 24, 2014 at 10:00 AM, Suresh Marru <sm...@apache.org> wrote:

> This is a good summary. I am inclined on Choice 3. The experiment data
> structure is similar for both single application and workflows, but the API
> calls are explicit. From a user stand point, both application and workflows
> have inputs, outputs and QoS configurations. But the level of details
> exposed in workflows is more granular. So the data structure can be re-used.
>
> I also worry about using the magic parameters, the more we stay away from
> XOR like situations, it may be unambiguous.
>
> Suresh
>
> On Jun 24, 2014, at 9:36 AM, Marlon Pierce <ma...@iu.edu> wrote:
>
> > Hi Saminda--
> >
> > Can you say more about why you have three options and what the tradeoffs
> are?  Below is my understanding.
> >
> > * Choice 2: one launch method and one executableId for both workflows
> and single applications, so they are treated in the API as fundamentally
> the same. Beautiful uniformity but may be more contorted to implement.  I
> like this as an API call but implementing it may have unintended
> consequences.
> >
> > * Choice 1: still has a universal launch method and execID but makes the
> execution type explicit with ExecType.  This makes the API user responsible
> for making this choice.  Increases the chance that the API user will make a
> mistake.  I don't like it for that reason.
> >
> > * Choice 3: different methods for launching single apps and workflows,
> but the Experiment structure is the same as Choice 2. Not as beautiful as
> Choice 2 but may have a cleaner implementation. API user probably knows
> they need a workflow, but what happens if they send an Experiment object of
> the wrong type to one of the methods (workflow Experiment to
> launchApplication)?
> >
> > * Choice 4 (not shown): like Choice 3 but with WorkflowExperiment struct
> for workflows and launchWorkflow(WorkflowExperiment).
> >
> >
> > I like Choice 2 (keep the API simple) and then Choice 4 (if you can't
> make it simple, make it unambiguous).  Choice 1 is my least favorite (API
> user must supply the right magic parameter).
> >
> >
> > Marlon
> >
> >
> >
> > On 6/23/14, 7:56 PM, Saminda Wijeratne wrote:
> >> In order to distinguish single application vs workflow execution in the
> API
> >> we thought of few choices (trivial parameters not shown here and
> proposed
> >> parameter/property names not well thought out yet).
> >>
> >> *Choice 1*
> >> launchExperiment(Experiment experiment)
> >> struct Experiment{
> >>    ...
> >>    string executableId // application id for single app or workflow
> >> template id for workflow
> >>    ExecType type // SINGLE_APP/WORKFLOW
> >>    ...
> >> }
> >>
> >> *Choice 2*
> >> launchExperiment(Experiment experiment)
> >> struct Experiment{
> >>    ...
> >>    string executableId // unique id for application id for single app or
> >> workflow template id for workflow
> >>    ...
> >> }
> >>
> >> *Choice 3*
> >> launchApplication(Experiment experiment)
> >> launchWorkflow(Experiment experiment)
> >>
> >> launchExperiment(Experiment experiment)
> >> struct Experiment{
> >>    ...
> >>    string executableId // application id for single app or workflow
> >> template id for workflow
> >>    ...
> >> }
> >>
> >> Any thoughts?
> >>
> >>
> >> On Mon, Jun 23, 2014 at 7:30 PM, Saminda Wijeratne <sa...@gmail.com>
> >> wrote:
> >>
> >>> With a few updates to the Orchestrator CPI we are carrying ahead the
> >>> updating the workflow interpreter to support workflow executions in
> >>> Airavata for 0.13 release as the attached diagram.
> >>>
> >>>
> >>> [image: Inline image 1]
>
>

Re: Enabling Workflow Support through Orchestrator

Posted by Suresh Marru <sm...@apache.org>.

I am sensing that the lesson learnt from the current data models is that Experiment is too coarse. We need to break it down, probably after the 0.13 release in mid-July.

If we take Choice 3, I think we should have launchApplication and launchWorkflow. The method signatures, once broken down, can take pieces like ExperimentMetadata and ExperimentConfiguration. For an application launch call we can specify an ApplicationId, for a workflow launch call a WorkflowID, and so forth.

So the big Experiment data model will have common things like metadata and configuration, but will need to fork off with specifics in the middle. I think Saminda had some suggestions on how to disambiguate application vs. workflow. We can have an enum whose values are simpleApplication and workflow for now. In the future it can expand to DataAnalysis and so forth.
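
A rough sketch of that breakdown, written in Python instead of Thrift for brevity. Only the names ExperimentMetadata, ExperimentConfiguration, launchApplication, launchWorkflow and the two enum values come from the text above; the fields, the LaunchRequest wrapper and the return values are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class ExperimentType(Enum):
    SIMPLE_APPLICATION = "simpleApplication"
    WORKFLOW = "workflow"  # could later grow, e.g. a DataAnalysis value


@dataclass(frozen=True)
class ExperimentMetadata:
    name: str
    owner: str  # hypothetical field


@dataclass(frozen=True)
class ExperimentConfiguration:
    # Hypothetical fields: both kinds of experiment have inputs and QoS.
    inputs: Dict[str, str] = field(default_factory=dict)
    qos: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class LaunchRequest:
    """Hypothetical record of what a launch call hands to the orchestrator."""
    experiment_type: ExperimentType
    target_id: str
    metadata: ExperimentMetadata
    configuration: ExperimentConfiguration


def launch_application(metadata, configuration, application_id):
    # The method name fixes the type; the caller never supplies it.
    return LaunchRequest(ExperimentType.SIMPLE_APPLICATION, application_id,
                         metadata, configuration)


def launch_workflow(metadata, configuration, workflow_id):
    return LaunchRequest(ExperimentType.WORKFLOW, workflow_id,
                         metadata, configuration)
```

Both calls reuse the same metadata and configuration objects; only the id argument and the method name differ.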

Suresh

On Jun 24, 2014, at 11:17 AM, IU-Gmail <ra...@gmail.com> wrote:

> In choice 3 what does "launchExperiment(Experiment experiment)" mean?  This will add confusion for the API user and developer. I like choice 3 for clarity and we already have workflow related elements in the experiment Struct. 
> 
> Thanks
> Raminder

Re: Enabling Workflow Support through Orchestrator

Posted by IU-Gmail <ra...@gmail.com>.

In Choice 3, what does "launchExperiment(Experiment experiment)" mean? It will add confusion for the API user and developer. I like Choice 3 for its clarity, and we already have workflow-related elements in the Experiment struct.

Thanks
Raminder

Re: Enabling Workflow Support through Orchestrator

Posted by Suresh Marru <sm...@apache.org>.

This is a good summary. I am inclined toward Choice 3. The experiment data structure is similar for both single applications and workflows, but the API calls are explicit. From a user standpoint, both applications and workflows have inputs, outputs and QoS configurations, but the level of detail exposed in workflows is more granular. So the data structure can be re-used.

I also worry about using magic parameters: the more we stay away from XOR-like situations, the less ambiguous the API will be.

Suresh


Re: Enabling Workflow Support through Orchestrator

Posted by Marlon Pierce <ma...@iu.edu>.

Hi Saminda--

Can you say more about why you have three options and what the tradeoffs 
are?  Below is my understanding.

* Choice 2: one launch method and one executableId for both workflows 
and single applications, so they are treated in the API as fundamentally 
the same.  Beautiful uniformity but may be more contorted to implement.  
I like this as an API call but implementing it may have unintended 
consequences.

* Choice 1: still has a universal launch method and executableId but makes
the execution type explicit with ExecType.  This makes the API user
responsible for making this choice, which increases the chance that the
API user will make a mistake.  I don't like it for that reason.

* Choice 3: different methods for launching single apps and workflows, 
but the Experiment structure is the same as Choice 2. Not as beautiful 
as Choice 2 but may have a cleaner implementation. API user probably 
knows they need a workflow, but what happens if they send an Experiment 
object of the wrong type to one of the methods (workflow Experiment to 
launchApplication)?

* Choice 4 (not shown): like Choice 3 but with WorkflowExperiment struct 
for workflows and launchWorkflow(WorkflowExperiment).
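
One possible answer to the wrong-type question under Choice 3, as a Python
sketch: each launch method checks the executableId against its own registry
and rejects a mismatch up front. The registries and the exception name are
hypothetical; nothing here is existing Airavata code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    executable_id: str  # application id or workflow template id


# Hypothetical stand-ins for the application and workflow template registries.
APPLICATION_IDS = {"app-echo"}
WORKFLOW_TEMPLATE_IDS = {"wf-echo-chain"}


class WrongExperimentKindError(ValueError):
    """The executableId does not match the launch method that was called."""


def launch_application(experiment: Experiment) -> str:
    if experiment.executable_id not in APPLICATION_IDS:
        raise WrongExperimentKindError(
            f"{experiment.executable_id} is not a known application id")
    return f"application:{experiment.executable_id}"


def launch_workflow(experiment: Experiment) -> str:
    if experiment.executable_id not in WORKFLOW_TEMPLATE_IDS:
        raise WrongExperimentKindError(
            f"{experiment.executable_id} is not a known workflow template id")
    return f"workflow:{experiment.executable_id}"
```

With a separate WorkflowExperiment struct (Choice 4) the same mistake would
be caught by the type system instead of by a run-time check.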

I like Choice 2 (keep the API simple) and then Choice 4 (if you can't 
make it simple, make it unambiguous).  Choice 1 is my least favorite 
(API user must supply the right magic parameter).

Marlon


Re: Enabling Workflow Support through Orchestrator

Posted by Saminda Wijeratne <sa...@gmail.com>.

In order to distinguish single application vs. workflow execution in the
API, we thought of a few choices (trivial parameters are not shown here, and
the proposed parameter/property names are not well thought out yet).

*Choice 1*
launchExperiment(Experiment experiment)
struct Experiment{
   ...
   // application id for a single app, or workflow template id for a workflow
   string executableId
   ExecType type // SINGLE_APP/WORKFLOW
   ...
}

*Choice 2*
launchExperiment(Experiment experiment)
struct Experiment{
   ...
   // unique id: application id for a single app, or workflow template id
   // for a workflow
   string executableId
   ...
}

*Choice 3*
launchApplication(Experiment experiment)
launchWorkflow(Experiment experiment)

launchExperiment(Experiment experiment)
struct Experiment{
   ...
   // application id for a single app, or workflow template id for a workflow
   string executableId
   ...
}
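
For comparison, a minimal Python sketch of how Choice 1 might dispatch on the
explicit type. The handler functions and the snake_case names are
hypothetical; the point is only that the implementation has to trust
whatever ExecType the caller supplies.

```python
from dataclasses import dataclass
from enum import Enum


class ExecType(Enum):
    SINGLE_APP = "SINGLE_APP"
    WORKFLOW = "WORKFLOW"


@dataclass(frozen=True)
class Experiment:
    executable_id: str   # application id or workflow template id
    exec_type: ExecType  # supplied by the API user


def run_application(application_id: str) -> str:
    return f"ran application {application_id}"


def run_workflow(template_id: str) -> str:
    return f"ran workflow {template_id}"


def launch_experiment(experiment: Experiment) -> str:
    """Dispatch purely on the caller-supplied type, with no cross-check."""
    handlers = {
        ExecType.SINGLE_APP: run_application,
        ExecType.WORKFLOW: run_workflow,
    }
    return handlers[experiment.exec_type](experiment.executable_id)
```

Unless the implementation also validates the id, passing a workflow template
id together with SINGLE_APP is routed to the application handler without any
complaint.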

Any thoughts?


Re: Enabling Workflow Support through Orchestrator

Posted by Suresh Marru <sm...@apache.org>.

Hi Saminda,

This looks like a good plan. The way I am interpreting Node and Task in an experiment, Node is more like an abstract object and Task is more like an interface object. More precisely, a gateway might request to run an experiment, and the Orchestrator might decide to use brute-force scheduling and create 3 tasks on three machines. If this is correct:

* I think we need to rename the Node to something more descriptive - (this has been inherited from workflow terminology which may not apply)

* I think the node in a workflow should also call the same abstract method (currently launch experiment). 
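
A small Python sketch of that reading of Node and Task. It assumes "brute-force scheduling" means creating one task per candidate machine; the class fields, the host names and the function name are hypothetical and do not reflect the actual data model in the diagram.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    task_id: str
    host: str  # machine this concrete task is submitted to


@dataclass
class Node:
    """Abstract unit of work requested by the gateway."""
    node_id: str
    executable_id: str
    tasks: List[Task] = field(default_factory=list)


def brute_force_schedule(node: Node, hosts: List[str]) -> List[Task]:
    """Create one task per candidate host for the same node."""
    node.tasks = [
        Task(task_id=f"{node.node_id}-task-{i}", host=host)
        for i, host in enumerate(hosts, start=1)
    ]
    return node.tasks
```

For example, brute_force_schedule(Node("node-1", "app-echo"), ["hostA", "hostB", "hostC"]) attaches three tasks to the single node, one per machine.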

Suresh

On Jun 23, 2014, at 7:30 PM, Saminda Wijeratne <sa...@gmail.com> wrote:

> With a few updates to the Orchestrator CPI we are carrying ahead the updating the workflow interpreter to support workflow executions in Airavata for 0.13 release as the attached diagram. 
> 
> 
> <wi-support-in-orchestrator.png>