Posted to users@airavata.apache.org by Udayanga Wickramasinghe <ma...@gmail.com> on 2013/08/13 02:56:53 UTC

How to design a workflow in several stages/steps on a remote environment

Hi,
I have a workflow (a simple one at the moment) that executes on a remote
environment in several stages. I have registered a command-line application
on Airavata that uses the SSH provider to run the stages as a pipeline. A
single step works fine, but I am having trouble connecting the steps so
they execute one after the other.

My requirement is as follows: the first step of the workflow takes several
inputs and generates an output file (using a remote LoadLeveler script
and/or an MPI application), and we need to block/wait until the result is
generated; we then take that result plus some additional inputs and execute
the second step. This process continues for several stages on a remote grid
environment (i.e. Big Red), and the final result is copied back to our
staging servers. How can we achieve a blocking wait in a workflow with
Airavata, especially when LoadLeveler/MPI job submission is asynchronous?
Is there a special construct that supports this out of the box, or do we
have to write extensions? Even though we can easily create multiple
command-line applications with Airavata, I don't see a way to link them in
a pipeline. I would very much appreciate your thoughts on this and possible
ways to approach it.

Thanks
Udayanga


-- 
http://www.udayangawiki.blogspot.com
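
The blocking wait described above can be sketched in plain Python, outside of Airavata: each stage submits its job remotely, then polls until the stage's output exists before the next stage starts. This is only an illustration of the pattern, not Airavata code; the host name, the `stage1.cmd`/`stage2.cmd` scripts, and the `stage1.out` file check are hypothetical placeholders.

```python
import subprocess
import time


def run_remote(host, command):
    """Run a single command on the remote host over SSH and return stdout."""
    result = subprocess.run(["ssh", host, command],
                            capture_output=True, text=True, check=True)
    return result.stdout


def wait_until(condition, interval=60, max_polls=1440):
    """Block until condition() is true, polling at a fixed interval."""
    for _ in range(max_polls):
        if condition():
            return
        time.sleep(interval)
    raise TimeoutError("condition not met within the polling budget")


def pipeline(host):
    """Hypothetical two-stage pipeline: stage 2 starts only after
    stage 1's output file appears on the remote side."""
    run_remote(host, "llsubmit stage1.cmd")                      # asynchronous submit
    wait_until(lambda: "stage1.out" in run_remote(host, "ls"))   # blocking wait
    run_remote(host, "llsubmit stage2.cmd")                      # next stage
```

The same `wait_until` loop can be reused between every pair of stages, which is essentially the "blocking wait" the workflow needs.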

Re: How to design a workflow in several stages/steps on a remote environment

Posted by Udayanga Wickramasinghe <ma...@gmail.com>.
Hi Suresh,
That would be great, because we will eventually migrate all our tools and
data to Big Red II (possibly before the end of this year).

Thanks,
Udayanga


On Tue, Aug 13, 2013 at 1:34 PM, Suresh Marru <sm...@apache.org> wrote:



-- 
http://www.udayangawiki.blogspot.com

Re: How to design a workflow in several stages/steps on a remote environment

Posted by Suresh Marru <sm...@apache.org>.
Hi Udayanga,

In addition to the enhancements you mention, note that we will have support for Big Red II in Airavata one way or another. We will keep you posted.

Suresh

On Aug 13, 2013, at 1:26 PM, Udayanga Wickramasinghe <ma...@gmail.com> wrote:



Re: How to design a workflow in several stages/steps on a remote environment

Posted by Udayanga Wickramasinghe <ma...@gmail.com>.
Hi Suresh,
Thanks a lot for your thoughts on this. I see that the control-flow
strategy would work well for my case; however, citing the problems you
describe with GRAM support on Big Red, I may need to implement a
synchronous call on the Big Red side (i.e. a polling mechanism).
Unfortunately we don't have accounts on the Quarry cluster either, and we
are currently migrating to Big Red II. I think the future Airavata support
you mentioned would be perfect for a scenario like ours (we have hundreds
of molecular dynamics scripts that we execute remotely over SSH), and it
would be a great addition to the Airavata tool set.

Regards
Udayanga
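
The "synchronous call" (polling mechanism) mentioned above could be a small wrapper on the submission side: submit with `llsubmit`, parse the job id out of its confirmation message, then poll `llq` until the job leaves the queue. `llsubmit` and `llq` are real LoadLeveler commands, but the exact message strings below are assumptions and may differ by LoadLeveler version; the command runner is injectable so the logic can be exercised without a cluster.

```python
import re
import subprocess
import time


def parse_llsubmit_output(output):
    """Extract the job id from llsubmit's confirmation message.

    Assumes the usual form (an assumption, check your installation):
    llsubmit: The job "host.123" has been submitted.
    """
    match = re.search(r'The job "([^"]+)" has been submitted', output)
    if match is None:
        raise ValueError("could not find a job id in: " + output)
    return match.group(1)


def submit_and_wait(script,
                    run=lambda cmd: subprocess.check_output(cmd, text=True),
                    interval=30):
    """Submit a LoadLeveler script and block until the job finishes.

    Treats the job as finished once llq no longer reports it; the exact
    'no job status' phrase is an assumption about llq's output.
    """
    job_id = parse_llsubmit_output(run(["llsubmit", script]))
    while True:
        status = run(["llq", "-j", job_id])
        if "no job status to report" in status:
            return job_id  # job has left the queue, i.e. finished
        time.sleep(interval)
```

Wrapping the existing scripts in something like `submit_and_wait` would make each pipeline stage look synchronous to the SSH provider.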


On Mon, Aug 12, 2013 at 9:45 PM, Suresh Marru <sm...@apache.org> wrote:



-- 
http://www.udayangawiki.blogspot.com

Re: How to design a workflow in several stages/steps on a remote environment

Posted by Suresh Marru <sm...@apache.org>.
Hi Udayanga,

Airavata workflows support both data-flow and control-flow execution. That means if you connect the output of the first step to the input of the second step, the second step will wait until the output from step 1 is available. For control flow, you can connect the top right corner of step 1 to the bottom left corner of step 2. This indicates that step 2 will wait until step 1 has executed, independent of any data dependencies.

Your workflow is not working because you are using the SSH provider which, as you described, executes the LoadLeveler script as a synchronous command. There are plans in the upcoming 0.9 release to support asynchronous local and SSH providers; these should help you. Alternatively, if you have Grid middleware like GRAM, it supports asynchronous batch submissions by nature. Unfortunately, the LoadLeveler version on Big Red is only a partial implementation of the LoadLeveler specification, and only the web services version of GRAM (GRAM 4) used to work with it. The latest pre-WS GRAM (GRAM 5) does not support the LoadLeveler version on Big Red. Do you have accounts on the Quarry cluster? It has a GRAM5 installation which should work with Airavata.

In the near future, we plan to develop native local resource manager integration so that Airavata can interact directly with batch systems through the SSH and GSISSH protocols.

Suresh
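
As an illustration of the dependency semantics described above (a sketch of the general idea, not Airavata's actual workflow engine), a data-flow connection amounts to running steps in topological order: each step is blocked until everything it depends on has finished.

```python
def run_in_dependency_order(steps, deps):
    """Run each named step (a zero-argument callable) only after all of
    its dependencies have run; returns the execution order."""
    finished, order = set(), []

    def run(name):
        if name in finished:
            return
        for upstream in deps.get(name, ()):  # block on dependencies first
            run(upstream)
        steps[name]()
        finished.add(name)
        order.append(name)

    for name in steps:
        run(name)
    return order
```

A control-flow edge is then simply a dependency with no data attached: listing `"step2": ["step1"]` in `deps` delays step 2 until step 1 completes, without wiring any output to any input.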

On Aug 12, 2013, at 8:56 PM, Udayanga Wickramasinghe <ma...@gmail.com> wrote:
