Posted to dev@oodt.apache.org by "Cayanan, Michael D (388J)" <mi...@jpl.nasa.gov> on 2013/03/07 16:41:36 UTC

Workflow2 - State Persistence

Hi Chris and/or Brian,

I'm currently on a project where PDS and AMMOS are joining forces to create a pipeline service and we'd like to use the latest Workflow2 to do that. They asked a question at last week's meeting about whether the latest and greatest OODT Workflow will be able to do state persistence. I can let Paul Ramirez chime in with any further details since he was at the meeting as well, but can the latest and greatest Workflow do that? And if so, under which engine? I'm currently using the following engine:


workflow.engine.factory = org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory


Thanks,

Mike



Re: Workflow2 - State Persistence

Posted by Cameron Goodale <si...@gmail.com>.
Hey Mike,

I don't have an answer to your question, but I wanted to ask about your use
case to better understand your question.

Do you intend to pass metadata from one completed workflow into the next?
Or are you looking at different levels (e.g., Tasks, Events, etc.)? Can
you provide a simple example so we can kick around some ideas?

In the past when I have had 2 workflows, where the outputs of WF1 are the
inputs to WF2, I have used the Filemanager as the persistence layer between
the two. Meaning we ingest the outputs of WF1, then ask WF2 to query the
Filemanager for its inputs. But I think you are asking for something
different here.
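
A rough sketch of that hand-off pattern, in case it helps the discussion. This
does not use the real Filemanager client API; the Catalog class, the "Level1"
product type, and the file names are all made up for illustration, and an
in-memory map stands in for the catalog.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HandoffSketch {
    /** Hypothetical stand-in for the Filemanager catalog: product type -> file names. */
    static class Catalog {
        private final Map<String, List<String>> byType = new HashMap<>();

        void ingest(String productType, String fileName) {
            byType.computeIfAbsent(productType, k -> new ArrayList<>()).add(fileName);
        }

        List<String> query(String productType) {
            return byType.getOrDefault(productType, Collections.emptyList());
        }
    }

    /** WF1: produces outputs and ingests them into the catalog. */
    static void runWorkflow1(Catalog catalog) {
        catalog.ingest("Level1", "granule-001.dat");
        catalog.ingest("Level1", "granule-002.dat");
    }

    /** WF2: asks the catalog for its inputs instead of receiving them from WF1 directly. */
    static List<String> runWorkflow2(Catalog catalog) {
        List<String> outputs = new ArrayList<>();
        for (String input : catalog.query("Level1")) {
            outputs.add(input + ".processed");
        }
        return outputs;
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        runWorkflow1(catalog);
        System.out.println(runWorkflow2(catalog));
    }
}
```

The point is only that WF2 never talks to WF1; whatever WF1 ingested is what
WF2 finds when it queries.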

Good luck,


-Cameron


On Thu, Mar 7, 2013 at 7:41 AM, Cayanan, Michael D (388J) <
michael.d.cayanan@jpl.nasa.gov> wrote:

>  Hi Chris and/or Brian,
>
>  I'm currently on a project where PDS and AMMOS are joining forces to
> create a pipeline service and we'd like to use the latest Workflow2 to do
> that. They asked a question at last week's meeting regarding if the latest
> and greatest OODT Workflow will be able to do state persistence. I can let
> Paul Ramirez chime in any further details since he was at the meeting as
> well, but can the latest and greatest Workflow do that? And if so, under
> which engine? I'm currently using the following engine:
>
>   workflow.engine.factory =
> org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory
>
>
>  Thanks,
>
> Mike
>
>
>
>


-- 

Sent from a Tin Can attached to a String

Re: Workflow2 - State Persistence

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Mike,



From: <Cayanan>, "Michael D (388J)" <mi...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Wednesday, March 13, 2013 10:40 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Re: Workflow2 - State Persistence

Hey Chris,

Sounds awesome. +1 to the additional features. I'm pretty sure I'll be able to help out as we go further along in developing this pipeline service for AMMOS. They eventually would like this feature and are asking me how much effort is required to do this.

I think from their end, what they want, from what I understand, is the following use-case scenario:

1) Run a command that will take a snapshot of the Workflow.
2) Bring the system down. Don't wait for anything to finish.
3) Upon restart, the Workflow should resume from where you last took the snapshot.


>> So, as of today, snapshots are taken at WorkflowLifecycleStates, as the Instance moves through its lifecycle. The only thing that doesn't happen yet is for the Engine to pick up unfinished workflows and restart them (#1 on my list below).
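
>> A minimal sketch of the idea, not the actual engine code. The State enum, the InMemoryRepository class, and the instance ids are hypothetical stand-ins; a real WorkflowInstanceRepository would write to durable storage so the last recorded state survives a shutdown.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResumeSketch {
    // Hypothetical lifecycle states; the real state names may differ.
    enum State { QUEUED, RUNNING, FINISHED }

    /** Stand-in for an instance repository: remembers the last state per instance id. */
    static class InMemoryRepository {
        private final Map<String, State> lastState = new LinkedHashMap<>();

        /** Called on every lifecycle transition, which is the "snapshot". */
        void persist(String instanceId, State state) {
            lastState.put(instanceId, state);
        }

        /** What the engine would ask for on startup: everything not yet finished. */
        List<String> unfinished() {
            List<String> ids = new ArrayList<>();
            for (Map.Entry<String, State> e : lastState.entrySet()) {
                if (e.getValue() != State.FINISHED) {
                    ids.add(e.getKey());
                }
            }
            return ids;
        }
    }

    public static void main(String[] args) {
        InMemoryRepository repo = new InMemoryRepository();
        repo.persist("wf-1", State.QUEUED);
        repo.persist("wf-1", State.RUNNING);
        repo.persist("wf-1", State.FINISHED);
        repo.persist("wf-2", State.QUEUED);
        repo.persist("wf-2", State.RUNNING);
        // Pretend the system went down here; on restart the engine would resume these.
        System.out.println("to resume: " + repo.unfinished());
    }
}
```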

The AMMOS folks are worried that consistently doing state persistence will be a performance hog and so by having the ability to save the state on demand, they'll be able to have things run more efficiently.

>> We have pretty clear evidence from the last 7 years of doing data processing with Apache OODT that this is not the case. Performance does matter here, but persistence can be scaled out horizontally with minimal overhead and complexity.

Hopefully we can tag up soon to go over this so that I can let them know how much effort this implementation would take.

>> Yep hopefully this week!

:)

Cheers,
Chris


Cheers,
Mike

From: <Mattmann>, Chris Mattmann <Ch...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Tuesday, March 12, 2013 8:32 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Re: Workflow2 - State Persistence

Hey Mike,

Workflow information (state, etc.) is persisted to the WorkflowInstanceRepository. In Wengine/Workflow2 support in trunk it makes heavy use of this, but the support is evolving.

I think for 0.6 I'll add some functionality and features (and would appreciate any help) to allow users to:

  1.  Specify in workflow.properties whether or not prior not completed (or not "finished" category state) workflows should be cleared out and/or should be restarted and run on startup. Maybe a property like org.apache.oodt.cas.workflow.engine.unfinishedWorkflows.complete=true or false
  2.  Finish support in WorkflowInstanceRepository for persisting all new Wengine state information

Those are the 2 items that should be done for 0.6 to support this feature. It's on the roadmap just not fully done yet.

Cheers,
Chris


From: <Cayanan>, "Michael D (388J)" <mi...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Thursday, March 7, 2013 8:41 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Workflow2 - State Persistence

Hi Chris and/or Brian,

I'm currently on a project where PDS and AMMOS are joining forces to create a pipeline service and we'd like to use the latest Workflow2 to do that. They asked a question at last week's meeting regarding if the latest and greatest OODT Workflow will be able to do state persistence. I can let Paul Ramirez chime in any further details since he was at the meeting as well, but can the latest and greatest Workflow do that? And if so, under which engine? I'm currently using the following engine:


workflow.engine.factory = org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory


Thanks,

Mike



Re: Workflow2 - State Persistence

Posted by "Cayanan, Michael D (388J)" <mi...@jpl.nasa.gov>.
Hey Chris,

Sounds awesome. +1 to the additional features. I'm pretty sure I'll be able to help out as we go further along in developing this pipeline service for AMMOS. They eventually would like this feature and are asking me how much effort is required to do this.

I think from their end, what they want, from what I understand, is the following use-case scenario:

1) Run a command that will take a snapshot of the Workflow.
2) Bring the system down. Don't wait for anything to finish.
3) Upon restart, the Workflow should resume from where you last took the snapshot.

The AMMOS folks are worried that consistently doing state persistence will be a performance hog and so by having the ability to save the state on demand, they'll be able to have things run more efficiently.

Hopefully we can tag up soon to go over this so that I can let them know how much effort this implementation would take.

Cheers,
Mike

From: <Mattmann>, Chris Mattmann <Ch...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Tuesday, March 12, 2013 8:32 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Re: Workflow2 - State Persistence

Hey Mike,

Workflow information (state, etc.) is persisted to the WorkflowInstanceRepository. In Wengine/Workflow2 support in trunk it makes heavy use of this, but the support is evolving.

I think for 0.6 I'll add some functionality and features (and would appreciate any help) to allow users to:

  1.  Specify in workflow.properties whether or not prior not completed (or not "finished" category state) workflows should be cleared out and/or should be restarted and run on startup. Maybe a property like org.apache.oodt.cas.workflow.engine.unfinishedWorkflows.complete=true or false
  2.  Finish support in WorkflowInstanceRepository for persisting all new Wengine state information

Those are the 2 items that should be done for 0.6 to support this feature. It's on the roadmap just not fully done yet.

Cheers,
Chris


From: <Cayanan>, "Michael D (388J)" <mi...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Thursday, March 7, 2013 8:41 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Workflow2 - State Persistence

Hi Chris and/or Brian,

I'm currently on a project where PDS and AMMOS are joining forces to create a pipeline service and we'd like to use the latest Workflow2 to do that. They asked a question at last week's meeting regarding if the latest and greatest OODT Workflow will be able to do state persistence. I can let Paul Ramirez chime in any further details since he was at the meeting as well, but can the latest and greatest Workflow do that? And if so, under which engine? I'm currently using the following engine:


workflow.engine.factory = org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory


Thanks,

Mike



Re: Workflow2 - State Persistence

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Mike,

Workflow information (state, etc.) is persisted to the WorkflowInstanceRepository. In Wengine/Workflow2 support in trunk it makes heavy use of this, but the support is evolving.

I think for 0.6 I'll add some functionality and features (and would appreciate any help) to allow users to:

  1.  Specify in workflow.properties whether prior workflows that did not complete (i.e., are not in a "finished" category state) should be cleared out and/or restarted and run on startup. Maybe a property like org.apache.oodt.cas.workflow.engine.unfinishedWorkflows.complete=true or false
  2.  Finish support in WorkflowInstanceRepository for persisting all new Wengine state information

Those are the 2 items that should be done for 0.6 to support this feature. It's on the roadmap just not fully done yet.
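
To make item 1 concrete, here is a small sketch of how an engine might read such a switch. The property name is only the "maybe" floated above and is not an existing OODT setting; the class name, the helper methods, and the default of false are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class UnfinishedWorkflowPolicy {
    // Proposed property name from this thread; hypothetical, not a shipped setting.
    static final String KEY =
        "org.apache.oodt.cas.workflow.engine.unfinishedWorkflows.complete";

    /** True if unfinished workflows should be restarted on startup; assumed default is false. */
    static boolean restartUnfinished(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "false"));
    }

    /** Parses properties-file text, standing in for loading workflow.properties from disk. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse(KEY + "=true\n");
        System.out.println("restart unfinished workflows: " + restartUnfinished(props));
    }
}
```

On startup the engine would check this flag and then either clear out or re-queue whatever the WorkflowInstanceRepository reports as not finished.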

Cheers,
Chris


From: <Cayanan>, "Michael D (388J)" <mi...@jpl.nasa.gov>>
Reply-To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Date: Thursday, March 7, 2013 8:41 AM
To: "user@oodt.apache.org<ma...@oodt.apache.org>" <us...@oodt.apache.org>>
Subject: Workflow2 - State Persistence

Hi Chris and/or Brian,

I'm currently on a project where PDS and AMMOS are joining forces to create a pipeline service and we'd like to use the latest Workflow2 to do that. They asked a question at last week's meeting regarding if the latest and greatest OODT Workflow will be able to do state persistence. I can let Paul Ramirez chime in any further details since he was at the meeting as well, but can the latest and greatest Workflow do that? And if so, under which engine? I'm currently using the following engine:


workflow.engine.factory = org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory


Thanks,

Mike


