Posted to dev@airavata.apache.org by Saminda Wijeratne <sa...@gmail.com> on 2014/01/17 16:32:49 UTC

Orchestration Component implementation review

Following are a few thoughts I had during my review of the component.

*Multi-threaded vs single threaded*
If we are going to have multi-threaded job submission, the implementation
should handle race conditions. Essentially, a JobSubmitter should be able to
"lock" an experiment request before continuing to process it, so that other
JobSubmitters accessing the experiment requests at the same time would skip
it.
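
One lightweight way to get that "lock" is to make it an atomic status
transition in the registry rather than an in-memory lock, so it also holds
when JobSubmitters run in separate JVMs. A rough sketch is below; the table
and column names (EXPERIMENT_REQUEST, STATUS, LOCKED_BY) are illustrative
only, not the actual registry schema.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class ExperimentClaim {

    // Attempts to claim an experiment request for one submitter. Only the
    // thread (or JVM) that flips the status sees 1 updated row; a concurrent
    // JobSubmitter that loses the race sees 0 rows and simply skips it.
    public static boolean tryClaim(Connection conn, String experimentId,
                                   String submitterId) throws SQLException {
        String sql = "UPDATE EXPERIMENT_REQUEST "
                   + "SET STATUS = 'LOCKED', LOCKED_BY = ? "
                   + "WHERE EXPERIMENT_ID = ? AND STATUS = 'ACCEPTED'";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, submitterId);
            ps.setString(2, experimentId);
            return ps.executeUpdate() == 1; // atomic check-and-set in the DB
        }
    }
}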

*Orchestrator service*
We might want to think about the possibility that in the future we will have
multiple deployments of an Airavata service. This could particularly be true
for SciGaP. We may have to think about how some of the internal data
structures/SPIs should be updated to accommodate such requirements.

*Orchestrator Component configurations*
I see a lot of places where the orchestrator can have configurations. I
think it's too early to finalize them, but we can start refactoring them out,
perhaps to airavata-server.properties. I'm also seeing that the orchestrator
is currently hardcoded to use the default/admin gateway and username. I think
these should come from the request itself.
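
As a rough illustration of pulling these out of the code, the orchestrator
could read its tunables from airavata-server.properties and take the gateway
and username from the incoming request instead of from configuration. The
property key used below (orchestrator.submitter.pool.size) is just an
example, not an agreed-upon name.

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class OrchestratorConfig {

    private final Properties props = new Properties();

    public OrchestratorConfig() throws IOException {
        // Load the shared server properties instead of hardcoding values.
        try (InputStream in = getClass().getClassLoader()
                .getResourceAsStream("airavata-server.properties")) {
            if (in != null) {
                props.load(in);
            }
        }
    }

    // Example tunable; the key name is illustrative only.
    public int getSubmitterThreadPoolSize() {
        return Integer.parseInt(
                props.getProperty("orchestrator.submitter.pool.size", "1"));
    }

    // Gateway and username deliberately have no getters here: they should be
    // read from the experiment request itself, not from configuration.
}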

*Visibility of API functions*
I think the initialize(), shutdown() and startJobSubmitter() functions should
not be part of the API, because I don't see a scenario where the gateway
developer would be responsible for calling them. They serve the more internal
purpose of managing the orchestrator component, IMO. As Amila pointed out
long ago (wink), functions that do not concern outside parties should not be
exposed as part of the API.

*Return values of Orchestrator API*
IMO, unless it is specifically required, the functions do not need to return
anything; they should just throw exceptions when needed. For example,
launchExperiment can simply return void if all is successful and throw an
exception if something fails. Handling issues with a try/catch is not only
simpler, but the explanations are also readily available to the user.
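
A minimal sketch of what that could look like is below. The experimentId
parameter and the exception type are assumptions for illustration; only the
launchExperiment name comes from the current discussion.

// Success is silence; failure is an exception carrying the explanation.
public interface Orchestrator {
    void launchExperiment(String experimentId) throws OrchestratorException;
}

// Caller side: nothing to inspect on success, just handle the failure.
class GatewayClient {
    void submit(Orchestrator orchestrator, String experimentId) {
        try {
            orchestrator.launchExperiment(experimentId);
            // reaching this line means the launch was accepted
        } catch (OrchestratorException e) {
            System.err.println("Launch failed: " + e.getMessage());
        }
    }
}

// Placeholder exception type for the sketch.
class OrchestratorException extends Exception {
    OrchestratorException(String message) {
        super(message);
    }
}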

*Data persisted in registry*
ExperimentRequest.getUsername(): I think we should clarify what this
username denotes. In the current API, experiment submission involves two
types of users: the submission user (the user who submits the experiment to
the Airavata Server - this is inferred from the request itself) and the
execution user (the user who correlates to the application executions of the
gateway - thus this user can differ from gateway to gateway, e.g. a community
user or a gateway user).
I think we should persist the date/time of the experiment request as well.
Also, when retrying API functions after a failure in a previous attempt,
there should be a way to avoid repeating already-performed steps, or to
gracefully roll back and redo the required steps as necessary. While such
actions could be transparent to the user, sometimes it might make sense to
notify the user of the success/failure of a retry. However, this might mean
keeping additional records at the registry level.
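
To make the retry idea concrete, the registry could keep, alongside the
request timestamp, a record of which steps have already completed so a retry
can skip or roll back work. The sketch below is illustrative only; none of
these field or method names come from the actual registry.

import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bookkeeping for retries: when the request arrived and which
// steps already finished. In practice both would be persisted in the registry.
class ExperimentRequestRecord {

    final String experimentId;
    final Instant requestedAt = Instant.now();
    final Map<String, Boolean> completedSteps = new LinkedHashMap<>();

    ExperimentRequestRecord(String experimentId) {
        this.experimentId = experimentId;
    }

    boolean isDone(String step) {
        return completedSteps.getOrDefault(step, false);
    }

    void markDone(String step) {
        completedSteps.put(step, true); // written back to the registry
    }
}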

Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
Thanks for the review, Saminda.  There are a lot of good points here that
should go into Jira.  Since the multi-threaded version is needed for this to
be really useful, I'd like to see more discussion on it now (a hangout is
probably coming soon).

What are the other high priority items for the orchestrator?


Marlon



Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Sun, Jan 19, 2014 at 8:48 PM, Amila Jayasekara
<th...@gmail.com> wrote:

>
>
>
> On Sun, Jan 19, 2014 at 5:33 PM, Lahiru Gunathilake <gl...@gmail.com> wrote:
>
>> Hi Chathuri,
>>
>>
>> On Fri, Jan 17, 2014 at 11:40 AM, Chathuri Wimalasena <
>> kamalasini@gmail.com> wrote:
>>
>>> Orchestrator table has only the current state (updated state). Previous
>>> statuses should be saved in the GFac_Job_Status table.
>>>
>> Since the order of the steps is defined, do we need to store the
>> previous states?
>>
>
> I am also curious to know why we need to store previous states. We have a
> define state diagram for a job. Also we have log files if we want to debug
> a specific issue related to job statuses. So I am not sure why we need to
> store previous job states. Also who and when we will access previous job
> states ?
>
It is not meant for debugging, but it could easily be used for that if
needed, unlike a log file. This feature gets enabled only if
"enable.application.job.status.history" is set in the properties file. Once
set, the history table gets updated automatically without the intervention
of GFac. Thus this feature doesn't create any overhead on the system when
it's not enabled.
Why store previous statuses? Collecting statistics. Airavata itself is not
yet advanced enough to use such data to optimize its scheduling, but the
gateway could use the data if it needs it.
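
To illustrate the "no overhead when disabled" point: one way to realize this
without GFac's involvement is to append the history row inside the registry
layer (or via a database trigger) whenever the current status is written. The
Java sketch below shows the registry-layer variant; the class and method
names are illustrative, and only the property key comes from the actual
configuration.

import java.util.Properties;

// History rows are written only when the flag is on, so nothing extra
// happens when the feature is disabled.
class JobStatusUpdater {

    private final boolean keepHistory;

    JobStatusUpdater(Properties serverProperties) {
        this.keepHistory = Boolean.parseBoolean(serverProperties.getProperty(
                "enable.application.job.status.history", "false"));
    }

    void updateStatus(String jobId, String newStatus) {
        if (keepHistory) {
            appendHistoryRow(jobId, newStatus); // extra row kept for statistics
        }
        writeCurrentStatus(jobId, newStatus);   // the only update GFac relies on
    }

    private void appendHistoryRow(String jobId, String status) { /* insert into history table */ }

    private void writeCurrentStatus(String jobId, String status) { /* update current-state row */ }
}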


Re: Orchestration Component implementation review

Posted by Amila Jayasekara <th...@gmail.com>.
On Sun, Jan 19, 2014 at 5:33 PM, Lahiru Gunathilake <gl...@gmail.com> wrote:

> Hi Chathuri,
>
>
> On Fri, Jan 17, 2014 at 11:40 AM, Chathuri Wimalasena <
> kamalasini@gmail.com> wrote:
>
>> Orchestrator table has only the current state (updated state). Previous
>> statuses should be saved in the GFac_Job_Status table.
>>
> Since the order of the steps is defined, do we need to store the previous
> states?
>

I am also curious to know why we need to store previous states. We have a
defined state diagram for a job. Also, we have log files if we want to debug
a specific issue related to job statuses. So I am not sure why we need to
store previous job states. Also, who will access previous job states, and
when?

Thanks
Amila



Re: Orchestration Component implementation review

Posted by Lahiru Gunathilake <gl...@gmail.com>.
Hi Chathuri,


On Fri, Jan 17, 2014 at 11:40 AM, Chathuri Wimalasena
<ka...@gmail.com> wrote:

> Orchestrator table has only the current state (updated state). Previous
> statuses should be saved in the GFac_Job_Status table.
>
Since the order of the steps is defined, do we need to store the previous
states?

Regards
Lahiru



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Orchestration Component implementation review

Posted by Chathuri Wimalasena <ka...@gmail.com>.
Orchestrator table has only the current state (updated state). Previous
statuses should be saved in the GFac_Job_Status table.

Regards,
Chathuri


>

Re: Orchestration Component implementation review

Posted by Sachith Withana <sw...@gmail.com>.
Thanks Saminda for this informative review.

In the case of multi-threaded vs. single-threaded, where should we have
the synchronization enforced?
To my knowledge, the NewJobWorkers (getting new jobs and submitting them)
and the HangedJobWorkers are accessing the Orchestrator table to select the
new and hanged jobs.
Right now, the NewJobWorkers are getting all the accepted jobs at once;
they are not focused on one experiment.

We need to reflect the changes in the GFac job statuses in the Orchestrator
table as well. So every time the status of a job changes through GFac,
it will be accessing the Orchestrator table as well. (I've sent an email
previously describing the scenario.)
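
For discussion, one possible shape for such a worker is a polling loop that
claims jobs one at a time, so NewJobWorkers and HangedJobWorkers can share
the Orchestrator table without stepping on each other. The interface below is
hypothetical, not the actual Airavata SPI.

import java.util.List;

// Illustrative NewJobWorker: rather than taking every accepted job at once,
// it claims each job atomically and skips the ones another worker won.
class NewJobWorker implements Runnable {

    interface OrchestratorTable {
        List<String> findAcceptedExperimentIds(int limit);
        boolean tryClaim(String experimentId, String workerId); // atomic status flip
    }

    private final OrchestratorTable table;
    private final String workerId;

    NewJobWorker(OrchestratorTable table, String workerId) {
        this.table = table;
        this.workerId = workerId;
    }

    @Override
    public void run() {
        for (String experimentId : table.findAcceptedExperimentIds(10)) {
            if (table.tryClaim(experimentId, workerId)) {
                submit(experimentId); // only the claiming worker submits
            }
        }
    }

    private void submit(String experimentId) { /* hand the job to GFac */ }
}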







-- 
Thanks,
Sachith Withana

Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Sun, Jan 19, 2014 at 2:31 PM, Lahiru Gunathilake <gl...@gmail.com> wrote:

>
> Hi Saminda,
>
> First thanks for reviewing the Orchestrator component.
>
>
>>
>> *Visibility of API functions*
>> I think initialize(), shutdown() and startJobSubmitter() functions should
>> not be part of the API because I don't see a scenario where the gateway
>> developer would be responsible for using them. They serve a more internal
>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>> long ago (wink) functions that do not concern outside parties should not be
>> used as part of the API.
>>
> +1. In orchestrator-core, are we going to focus on the API, or focus on the
> more internal functionality and design the interface methods so that the
> orchestrator-service component can be used to wrap them up into meaningful
> API methods? (I was thinking we may be able to wrap a few methods of
> orchestrator-core into one meaningful orchestrator-service operation.)
>
> I am not sure we need to have an exactly one-to-one mapping from
> orchestrator-core to the functions we are going to expose to the gateway
> developer.
>
I doubt it needs to be. But for the sake of Airavata devs who develop
Airavata or extend it with SPI implementations, we should maintain an
intuitive correlation between the two.

>
>> *Return values of Orchestrator API*
>> IMO unless it is specifically required to do so I think the functions
>> does not necessarily need to return anything other than throw exceptions
>> when needed. For example the launchExperiment can simply return void if all
>> is successful and throw an exception if something fails. Handling issues
>> with a try catch is not only simpler but also the explanations are readily
>> available for the user.
>>
> For testing of the component, return values will be useful, and when we are
> wrapping these up into real service operations (which will be exposed to
> the gateway developer as an SPI) these values will be useful.
> WDYT?
>
We need to test both, right?


Re: Orchestration Component implementation review

Posted by Lahiru Gunathilake <gl...@gmail.com>.
Hi Saminda,

First thanks for reviewing the Orchestrator component.

On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com> wrote:

> Following are few thoughts I had during my review of the component,
>
> *Multi-threaded vs single threaded*
> If we are going to have multi-threaded job submission the implementation
> should work on handling race conditions. Essentially JobSubmitter should be
> able to "lock" an experiment request before continuing processing that
> request so that other JobSubmitters accessing the experiment requests at the
> same time would skip it.
>
+1

>
> *Orchestrator service*
> We might want to think of the possibility in future where we will be
> having multiple deployments of an Airavata service. This could particularly
> be true for SciGaP. We may have to think how some of the internal data
> structures/SPIs should be updated to accommodate such requirements in future.
>
> *Orchestrator Component configurations*
> I see a lot of places where the orchestrator can have configurations. I
> think it's too early to finalize them, but I think we can start refactoring
> them out perhaps to the airavata-server.properties. I'm also seeing the
> orchestrator is now hardcoded to use default/admin gateway and username. I
> think it should come from the request itself.
>
I think having a separate file for each component makes it clear and allows
each to be deployed separately. As an example, if we are just deploying a
lightweight orchestrator and deploying GFac as a separate JVM, we do not need
all the complex configuration from airavata-server.properties.

I think to come to a conclusion about the configuration, we need to think
about the production deployment scenarios and find an easy and clear way to
do the configuration. If we are planning to deploy some components
separately, we need to provide a configuration file for each.

>
> *Visibility of API functions*
> I think initialize(), shutdown() and startJobSubmitter() functions should
> not be part of the API because I don't see a scenario where the gateway
> developer would be responsible for using them. They serve a more internal
> purpose of managing the orchestrator component IMO. As Amila pointed out so
> long ago (wink) functions that do not concern outside parties should not be
> used as part of the API.
>
+1. In orchestrator-core, are we going to focus on the API, or focus on the
more internal functionality and design the interface methods so that the
orchestrator-service component can be used to wrap them up into meaningful
API methods? (I was thinking we may be able to wrap a few methods of
orchestrator-core into one meaningful orchestrator-service operation.)

I am not sure we need to have an exactly one-to-one mapping from
orchestrator-core to the functions we are going to expose to the gateway
developer.

>
> *Return values of Orchestrator API*
> IMO unless it is specifically required to do so I think the functions does
> not necessarily need to return anything other than throw exceptions when
> needed. For example the launchExperiment can simply return void if all is
> succesful and return an exception if something fails. Handling issues with
> a try catch is not only simpler but also the explanations are readily
> available for the user.
>
For testing of the component, return values will be useful, and when we are
wrapping these up into real service operations (which will be exposed to
the gateway developer as an SPI) these values will be useful.
WDYT?



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Orchestration Component implementation review

Posted by Suresh Marru <sm...@apache.org>.
On Jan 19, 2014, at 12:48 PM, Saminda Wijeratne <sa...@gmail.com> wrote:

> > Also, when retrying API functions after a failure in a previous attempt there should be a way to avoid repeating already performed steps or to gracefully roll back and redo the required steps as necessary. While such actions could be transparent to the user, sometimes it might make sense to notify the user of the success/failure of a retry. However, this might mean keeping additional records at the registry level.
> >
> > In addition we should also have a way of cleaning up unsubmitted experiment ids. (But not sure whether you want to address this right now). The way I see this is to have a periodic thread which goes through the table and clears up experiments which have not been submitted for a defined time.
> > +1. Something else we may have to think of later is the data archiving capabilities. We keep running into performance issues when the database grows with experiment results. Unless we become experts in distributed database management, we should have a better way to manage our db performance issues.
> >
> 
> -1 on this. I may want to go back a year later and submit a previously created experiment. I think it's wrong to put a temporal bound on these; moreover, these provide a good source of analytics to improve usability. As for database performance, not in 2014 - there should be many solutions to handle zillions of experiments (at least that's what the social networking world claims).
> I didn't mean that the experiments should be removed from the user's grasp by archiving them. It's more like the idea of a memory hierarchy: the data which is most likely to be used should be available for quick querying. Of course, such data distribution should be transparent to the users.

Sure, that makes sense, and I also agree that such garbage collection is a system-level implementation detail and a way of managing a high-speed access cache.

Suresh


Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Sun, Jan 19, 2014 at 8:36 AM, Suresh Marru <sm...@apache.org> wrote:

> Great thoughts, Saminda and Amila. Agreed that real-world use cases and
> integration will help prioritize. I will embed my feedback below:
>
> On Jan 17, 2014, at 2:57 PM, Saminda Wijeratne <sa...@gmail.com> wrote:
>
> > Marlon, I think until we put this to real use we won't get much feedback
> on what aspects we should focus on more and which features we should
> expand or prioritize. So how about having a test plan for the
> Orchestrator? Expose it to real use cases and see how it will survive. WDYT?
> >
> > It might be a little confusing to return a "JobRequest" object from the
> Orchestrator (since it's a response). Or perhaps it should be renamed?
> >
> > Sachith, I think we should have a google hangout or a separate mail
> thread (or both) to discuss multi-threaded support. Could you organize this
> please?
> >
> > On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara <
> thejaka.amila@gmail.com> wrote:
> >
> > On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>
> wrote:
> > Following are few thoughts I had during my review of the component,
> >
> > Multi-threaded vs single threaded
> > If we are going to have multi-threaded job submission the implementation
> should work on handling race conditions. Essentially JobSubmitter should be
> able to "lock" an experiment request before continuing processing that
> request so that other JobSubmitters accessing the experiment requests at the
> same time would skip it.
> >
> > +1. These are implementation details.
>
> Agreed. For the implementation, I see this as a solved problem in the
> operating systems and distributed systems worlds. Hopefully, we do not have
> to re-invent it and can instead leverage some libraries.
>
+1


>
> > Orchestrator service
> > We might want to think of the possibility in future where we will be
> having multiple deployments of an Airavata service. This could particularly
> be true for SciGaP. We may have to think how some of the internal data
> structures/SPIs should be updated to accommodate such requirements in future.
> >
> > +1.
> >
> + 1.
> >
> > Orchestrator Component configurations
> > I see a lot of places where the orchestrator can have configurations. I
> think it's too early to finalize them, but I think we can start refactoring
> them out perhaps to the airavata-server.properties. I'm also seeing the
> orchestrator is now hardcoded to use default/admin gateway and username. I
> think it should come from the request itself.
> >
> > +1. But overall we may need to change the way we handle
> configurations within Airavata. Currently we have multiple configuration
> files and multiple places where we read configurations. IMO we should have
> a separate module to handle configurations. Only this module should be
> aware of how to interpret configurations in the file and provide a component
> interface to access those configuration values.
> > +1 we tried this once with "ServerSettings" and "ApplicationSettings",
> but apparently again more configuration files seem to have spawned. So far,
> however, they seem to be localized to their components now.
>
> Fully agreed. I think we need to go back to a single configuration for
> all Airavata Server needs and a single one for the Client SDKs.
>
> > Visibility of API functions
> > I think initialize(), shutdown() and startJobSubmitter() functions
> should not be part of the API because I don't see a scenario where the
> gateway developer would be responsible for using them. They serve a more
> internal purpose of managing the orchestrator component IMO. As Amila
> pointed out so long ago (wink) functions that do not concern outside
> parties should not be used as part of the API.
> >
> > +1
>
> + 1. These should be within the Orchestrator SPI but not exposed through the
> API, as the clients should not be able to control this server behavior.
>
> > Return values of Orchestrator API
> > IMO unless it is specifically required to do so I think the functions
> does not necessarily need to return anything other than throw exceptions
> when needed. For example the launchExperiment can simply return void if all
> is successful and throw an exception if something fails. Handling issues
> with a try catch is not only simpler but also the explanations are readily
> available for the user.
> >
> > +1. Also try to have different exceptions for different scenarios. For
> example if persistence (hypothetical) fails, DatabasePersistenceException,
> if validation fails, ValidationFailedException etc ... Then the developer
> who uses the API can catch these different exceptions and act on them
> appropriately.
> > +1. What needs to be understood here is that the exception should be a
> gateway-friendly exception, i.e. it should not expose internal details of
> Airavata in the top-level exception, and the exception message should be
> self-explanatory enough for the gateway developer not to be left scratching
> his/her head after reading the exception. Feedback from Sudhakar some time
> back was to provide suggestions in the exception message on how to resolve
> the issue.
>
> I have drafted some of these in the thrift files, will update the JIRA to
> brainstorm more.
>
> > Data persisted in registry
> > ExperimentRequest.getUsername() : I think we should clarify what this
> username denotes. In current API, in experiment submission we consider two
> types of users. Submission user (the user who submits the experiment to the
> Airavata Server - this is inferred by the request itself) and the execution
> user (the user who correlates to the application executions of the gateway -
> thus this user can be a different user for different gateway, eg: community
> user, gateway user).
> > I think we should persist the date/time of the experiment request as
> well.
> > +1
>
> The user naming is getting more confusing. I will start a separate
> discussion on this.
>
> > Also, when retrying API functions in the case of a failure in a
> previous attempt, there should be a way to avoid repeating already performed
> steps or to gracefully roll back and redo the required steps as necessary.
> While such actions could be transparent to the user, sometimes it might make
> sense to allow the user to be notified of success/failure of a retry. However,
> this might mean keeping additional records at the registry level.
> >
> > In addition we should also have a way of cleaning up unsubmitted
> experiment ids. (But not sure whether you want to address this right now).
> The way I see this is to have a periodic thread which goes through the
> table and clears up experiments which have not been submitted for a defined time.
> > +1. Something else we may have to think of later is the data archiving
> capabilities. We keep running into performance issues when the database
> grows with experiment results. Unless we become experts in distributed
> database management, we should have a better way to manage our db
> performance issues.
> >
>
> -1 on this. I may want to go back a year later and submit a previously
> created experiment. I think it's wrong to put a temporal bound on these;
> moreover, these provide a good source of analytics to improve
> usability. As for database performance, not in 2014 - there should be many
> solutions to handle zillions of experiments (at least that's what the social
> networking world claims).
>
I didn't mean that the experiments should be removed from the user's grasp by
archiving them. It's more like the idea of a memory hierarchy: the data which
is most likely to be used should be available for quick querying. Of course,
such data distribution should be transparent to the users.


Re: Orchestration Component implementation review

Posted by Suresh Marru <sm...@apache.org>.
Great thoughts, Saminda and Amila. Agreed that real-world use cases and integration will help prioritize. I will embed my feedback below:

On Jan 17, 2014, at 2:57 PM, Saminda Wijeratne <sa...@gmail.com> wrote:

> Marlon, I think until we put this to real use we won't get much feedback on what aspects we should focus on more and which features we should expand or prioritize. So how about having a test plan for the Orchestrator? Expose it to real use cases and see how it will survive. WDYT?
>
> It might be a little confusing to return a "JobRequest" object from the Orchestrator (since it's a response). Or perhaps it should be renamed?
>
> Sachith, I think we should have a google hangout or a separate mail thread (or both) to discuss multi-threaded support. Could you organize this please?
> 
> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara <th...@gmail.com> wrote:
> 
> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com> wrote:
> Following are few thoughts I had during my review of the component,
> 
> Multi-threaded vs single threaded
> If we are going to have multi-threaded job submission the implementation should work on handling race conditions. Essentially JobSubmitter should be able to "lock" an experiment request before continuing processing that request so that other JobSubmitters accessing the experiment requests at the same time would skip it.
> 
> +1. These are implementation details.

Agreed. For the implementation, I see this as a solved problem in the operating systems and distributed systems worlds. Hopefully, we do not have to re-invent it and can instead leverage some libraries.

> Orchestrator service
> We might want to think of the possibility in future where we will be having multiple deployments of an Airavata service. This could particularly be true for SciGaP. We may have to think how some of the internal data structures/SPIs should be updated to accommodate such requirements in future.
> 
> +1.
>  
+ 1.
> 
> Orchestrator Component configurations
> I see a lot of places where the orchestrator can have configurations. I think it's too early to finalize them, but I think we can start refactoring them out perhaps to the airavata-server.properties. I'm also seeing the orchestrator is now hardcoded to use default/admin gateway and username. I think it should come from the request itself.
>
> +1. But overall we may need to change the way we handle configurations within Airavata. Currently we have multiple configuration files and multiple places where we read configurations. IMO we should have a separate module to handle configurations. Only this module should be aware of how to interpret configurations in the file and provide a component interface to access those configuration values.
> +1 we tried this once with "ServerSettings" and "ApplicationSettings", but apparently again more configuration files seem to have spawned. So far, however, they seem to be localized to their components now.

Fully agreed. I think we need to go back to a single configuration for all Airavata Server needs and a single one for the Client SDKs.

> Visibility of API functions
> I think initialize(), shutdown() and startJobSubmitter() functions should not be part of the API because I don't see a scenario where the gateway developer would be responsible for using them. They serve a more internal purpose of managing the orchestrator component IMO. As Amila pointed out so long ago (wink) functions that do not concern outside parties should not be used as part of the API.
> 
> +1

+ 1. These should be within the Orchestrator SPI but not exposed through the API, as the clients should not be able to control this server behavior.

> Return values of Orchestrator API
> IMO unless it is specifically required to do so I think the functions do not necessarily need to return anything other than throw exceptions when needed. For example the launchExperiment can simply return void if all is successful and throw an exception if something fails. Handling issues with a try catch is not only simpler but also the explanations are readily available for the user.
>
> +1. Also try to have different exceptions for different scenarios. For example if persistence (hypothetical) fails, DatabasePersistenceException, if validation fails, ValidationFailedException etc ... Then the developer who uses the API can catch these different exceptions and act on them appropriately.
> +1. What needs to be understood here is that the exception should be a gateway-friendly exception, i.e. it should not expose internal details of Airavata in the top-level exception, and the exception message should be self-explanatory enough for the gateway developer not to be left scratching his/her head after reading the exception. Feedback from Sudhakar some time back was to provide suggestions in the exception message on how to resolve the issue.

I have drafted some of these in the thrift files, will update the JIRA to brainstorm more. 
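
To make the exception taxonomy concrete, a sketch along the lines Amila and
Saminda describe is below. The subclass names follow Amila's examples; the
base class and its "suggestion" field are purely illustrative ways of
carrying a gateway-friendly resolution hint without exposing Airavata
internals.

// Illustrative hierarchy only; the real API may define different types.
public class OrchestratorException extends Exception {
    private final String suggestion;

    public OrchestratorException(String message, String suggestion) {
        super(message);
        this.suggestion = suggestion;
    }

    public String getSuggestion() {
        return suggestion; // a hint on how the gateway can resolve the problem
    }
}

class DatabasePersistenceException extends OrchestratorException {
    DatabasePersistenceException(String message) {
        super(message, "Check that the registry database is reachable and retry the request.");
    }
}

class ValidationFailedException extends OrchestratorException {
    ValidationFailedException(String message) {
        super(message, "Review the experiment inputs named in this message and resubmit.");
    }
}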

> Data persisted in registry
> ExperimentRequest.getUsername(): I think we should clarify what this username denotes. In the current API, in experiment submission we consider two types of users: the submission user (the user who submits the experiment to the Airavata Server - this is inferred from the request itself) and the execution user (the user who correlates to the application executions of the gateway - thus this user can be a different user for different gateways, e.g. a community user or a gateway user).
> I think we should persist the date/time of the experiment request as well. 
> +1 

The user naming is getting more confusing. I will start a separate discussion on this.

> Also, when retrying API functions in the case of a failure in a previous attempt there should be a way to avoid repeating already performed steps or to gracefully roll back and redo the required steps as necessary. While such actions could be transparent to the user, sometimes it might make sense to notify the user of the success/failure of a retry. However, this might mean keeping additional records at the registry level.
>
> In addition we should also have a way of cleaning up unsubmitted experiment ids. (But not sure whether you want to address this right now). The way I see this is to have a periodic thread which goes through the table and clears up experiments which have not been submitted for a defined time.
> +1. Something else we may have to think of later is the data archiving capabilities. We keep running into performance issues when the database grows with experiment results. Unless we become experts in distributed database management, we should have a better way to manage our db performance issues.
> 

-1 on this. I may want to go back a year later and submit a previously created experiment. I think it's wrong to put a temporal bound on these; moreover, they provide a good source of analytics for improving usability. As for database performance, that should not be a worry in 2014; there should be plenty of solutions for handling zillions of experiments (at least that's what the social networking world claims).
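
On the retry point above, a rough sketch of how already-performed steps could be skipped if we persist a per-step status in the registry (all names here are hypothetical):

// Hypothetical names -- only to illustrate skipping completed steps on a retry.
enum StepStatus { PENDING, COMPLETED, FAILED }

interface StepStatusStore {
    StepStatus getStatus(String experimentId, String step);
    void setStatus(String experimentId, String step, StepStatus status);
}

class RetryingLauncher {

    private final StepStatusStore store;

    RetryingLauncher(StepStatusStore store) {
        this.store = store;
    }

    void runStep(String experimentId, String step, Runnable action) {
        if (store.getStatus(experimentId, step) == StepStatus.COMPLETED) {
            return;  // done in a previous attempt -- do not repeat it
        }
        try {
            action.run();
            store.setStatus(experimentId, step, StepStatus.COMPLETED);
        } catch (RuntimeException e) {
            store.setStatus(experimentId, step, StepStatus.FAILED);
            throw e;  // surface the failure so the user can be told the retry result
        }
    }
}

Those status rows are exactly the "additional records at the registry level" Saminda mentions, and they would also make it straightforward to notify the user whether a retry succeeded.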

> 
> BTW, nice review notes, Saminda.

+1. And also +1 to Amila’s attention to detail.

Suresh

> 
> Thanks
> Amila
>  
> 
> 


Re: Orchestration Component implementation review

Posted by Sachith Withana <sw...@gmail.com>.
I'm planning on explaining what's in the presentation [attached] in detail. I
will also include the overview which shows where and how the orchestrator
fits into Airavata.

Will update you all on the progress.


On Mon, Jan 20, 2014 at 12:30 PM, Saminda Wijeratne <sa...@gmail.com>wrote:

>
>
>
> On Mon, Jan 20, 2014 at 8:38 AM, Sachith Withana <sw...@gmail.com>wrote:
>
>> Okay. Will do. Will send you a draft asap.
>>
>>
>>  On Mon, Jan 20, 2014 at 11:17 AM, Marlon Pierce <ma...@iu.edu> wrote:
>>
>>> I think we want to make everything explicit.  The Airavata API is
>>> intended for external clients, not communications between Airavata
>>> components (the SPI).  Your figure at
>>>
>>> https://cwiki.apache.org/confluence/display/AIRAVATA/Simple+Gateway+Developer+Guide
>>> summarizes this nicely.  You'll need to explain both the API (what a
>>> gateway does) and the SPI (how Airavata components work together) to do
>>> this.
>>>
>> +1. Like you did for the gateway developer guide an initial draft atleast
> with a bullet point wiki would give you some quick feedback. Try and use
> some flow diagrams to explain activities.
>
>>
>>>
>>> Marlon
>>>
>>> On 1/20/14 11:09 AM, Sachith Withana wrote:
>>> > We are using internal SPIs which are not reflected in the Airavata API.
>>> > should it be explained or just make it a higher level diagram which
>>> won't
>>> > show the SPIs?
>>> >
>>> >
>>> > On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <ma...@iu.edu>
>>> wrote:
>>> >
>>> >> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
>>> >> little more?
>>> >>
>>> >>
>>> >> Marlon
>>> >>
>>> >> On 1/20/14 10:53 AM, Sachith Withana wrote:
>>> >>> Hi All,
>>> >>>
>>> >>> I will go ahead and create the Wiki on the Orchestrator. Will send
>>> you
>>> >> all
>>> >>> a draft as soon as I can.
>>> >>>
>>> >>> One question though, Do we have to explicitly show the SPIs and APIs
>>> >> both?
>>> >>>
>>> >>> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu>
>>> wrote:
>>> >>>
>>> >>>> +1 for real use cases first. We have at least 3.  But I'm sure we
>>> will
>>> >>>> want to make it as easy as possible for developers to pass back the
>>> >>>> correct, created experimentID when invoking launchExperiment.
>>> >>>>
>>> >>>>
>>> >>>> Marlon
>>> >>>>
>>> >>>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>>> >>>>> Marlon, I think until we put this to real use we wont get much
>>> feedback
>>> >>>> on
>>> >>>>> what aspects we should focus on more and in what features we should
>>> >>>> expand
>>> >>>>> or prioritize on. So how about having a test plan for the
>>> Orchestrator.
>>> >>>>> Expose it to real usecases and see how it will survive. WDYT?
>>> >>>>>
>>> >>>>> It might be a little confusing to return a "JobRequest" object
>>> from the
>>> >>>>> Orchestrator (since its a response). Or perhaps it should be
>>> renamed?
>>> >>>>>
>>> >>>>> Sachith, I think we should have a google hangout or a separate mail
>>> >>>> thread
>>> >>>>> (or both) to discuss muti-threaded support. Could you organize this
>>> >>>> please?
>>> >>>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>>> >>>>> <th...@gmail.com>wrote:
>>> >>>>>
>>> >>>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
>>> >> samindaw@gmail.com
>>> >>>>> wrote:
>>> >>>>>>> Following are few thoughts I had during my review of the
>>> component,
>>> >>>>>>>
>>> >>>>>>> *Multi-threaded vs single threaded*
>>> >>>>>>> If we are going to have multi-threaded job submission the
>>> >>>> implementation
>>> >>>>>>> should work on handling race conditions. Essentially JobSubmitter
>>> >>>> should be
>>> >>>>>>> able to "lock" an experiment request before continuing processing
>>> >> that
>>> >>>>>>> request so that other JobSubmitters accessing the experiment
>>> requests
>>> >>>> a the
>>> >>>>>>> same time would skip it.
>>> >>>>>>>
>>> >>>>>> +1. These are implementation details.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>> *Orchestrator service*
>>> >>>>>>> We might want to think of the possibility in future where we
>>> will be
>>> >>>>>>> having multiple deployments of an Airavata service. This could
>>> >>>> particularly
>>> >>>>>>> be true for SciGaP. We may have to think how some of the internal
>>> >> data
>>> >>>>>>> structures/SPIs should be updated to accomodate such
>>> requirements in
>>> >>>> future.
>>> >>>>>> +1.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>> *Orchestrator Component configurations*
>>> >>>>>>> I see alot of places where the orchestrator can have
>>> configurations.
>>> >> I
>>> >>>>>>> think its too early finalize them, but I think we can start
>>> >> refactoring
>>> >>>>>>> them out perhaps to the airavata-server.properties. I'm also
>>> seeing
>>> >> the
>>> >>>>>>> orchestrator is now hardcoded to use default/admin gateway and
>>> >>>> username. I
>>> >>>>>>> think it should come from the request itself.
>>> >>>>>>>
>>> >>>>>> +1. But in overall we may need to change the way we handle
>>> >>>> configurations
>>> >>>>>> within Airavata. Currently we have multiple configuration files
>>> and
>>> >>>>>> multiple places where we read configurations. IMO we should have a
>>> >>>> separate
>>> >>>>>> module to handle configurations. Only this module should be aware
>>> how
>>> >> to
>>> >>>>>> intepret configurations in the file and provide a component
>>> interface
>>> >> to
>>> >>>>>> access those configuration values.
>>> >>>>>>
>>> >>>>> +1 we tried this once with "ServerSettings" and
>>> "ApplicationSettings",
>>> >>>> but
>>> >>>>> apparently again more configuration files seems to have spawned.
>>> So far
>>> >>>>> however they seemed to be localized for their component now.
>>> >>>>>
>>> >>>>>>> *Visibility of API functions*
>>> >>>>>>> I think initialize(), shutdown() and startJobSubmitter()
>>> functions
>>> >>>> should
>>> >>>>>>> not be part of the API because I don't see a scenario where the
>>> >> gateway
>>> >>>>>>> developer would be responsible for using them. They serve a more
>>> >>>> internal
>>> >>>>>>> purpose of managing the orchestrator component IMO. As Amila
>>> pointed
>>> >>>> out so
>>> >>>>>>> long ago (wink) functions that do not concern outside parties
>>> should
>>> >>>> not be
>>> >>>>>>> used as part of the API.
>>> >>>>>>>
>>> >>>>>> +1
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>> *Return values of Orchestrator API*
>>> >>>>>>> IMO unless it is specifically required to do so I think the
>>> functions
>>> >>>>>>> does not necessarily need to return anything other than throw
>>> >>>> exceptions
>>> >>>>>>> when needed. For example the launchExperiment can simply return
>>> void
>>> >>>> if all
>>> >>>>>>> is succesful and return an exception if something fails. Handling
>>> >>>> issues
>>> >>>>>>> with a try catch is not only simpler but also the explanations
>>> are
>>> >>>> readily
>>> >>>>>>> available for the user.
>>> >>>>>>>
>>> >>>>>> +1. Also try to have different exception for different scenarios.
>>> For
>>> >>>>>> example if persistence (hypothetical) fails,
>>> >>>> DatabasePersistenceException,
>>> >>>>>> if validation fails, ValidationFailedException etc ... Then the
>>> >>>> developer
>>> >>>>>> who uses the API can catch these different exceptions and act on
>>> them
>>> >>>>>> appropriately.
>>> >>>>>>
>>> >>>>> +1. What needs to be understood here is that the Exception should
>>> be a
>>> >>>>> Gateway friendly exception. i.e. it should not expose internal
>>> details
>>> >> of
>>> >>>>> Airavata at the top-level exception and exception message should be
>>> >> self
>>> >>>>> explanatory enough for the gateway developer not to remain
>>> scratching
>>> >>>>> his/her head after reading the exception. A feedback from Sudhakar
>>> >>>> sometime
>>> >>>>> back was to provide suggestions in the exception message on how to
>>> >>>> resolve
>>> >>>>> the issue.
>>> >>>>>
>>> >>>>>>> *Data persisted in registry*
>>> >>>>>>> ExperimentRequest.getUsername() : I think we should clarify what
>>> this
>>> >>>>>>> username denotes. In current API, in experiment submission we
>>> >> consider
>>> >>>> two
>>> >>>>>>> types of users. Submission user (the user who submits the
>>> experiment
>>> >>>> to the
>>> >>>>>>> Airavata Server - this is inferred by the request itself) and the
>>> >>>> execution
>>> >>>>>>> user (the user who corelates to the application executions of the
>>> >>>> gateway -
>>> >>>>>>> thus this user can be a different user for different gateway, eg:
>>> >>>> community
>>> >>>>>>> user, gateway user).
>>> >>>>>>> I think we should persist the date/time of the experiment
>>> request as
>>> >>>>>>> well.
>>> >>>>>>>
>>> >>>>>> +1
>>> >>>>>>
>>> >>>>>>>  Also when retrying of API functions in the case of a failure in
>>> an
>>> >>>>>>> previous attempt there should be a way to not to repeat already
>>> >>>> performed
>>> >>>>>>> steps or gracefully roleback and redo those required steps as
>>> >>>> necessary.
>>> >>>>>>> While such actions could be transparent to the user sometimes it
>>> >> might
>>> >>>> make
>>> >>>>>>> sense to allow user to be notified of success/failure of a retry.
>>> >>>> However
>>> >>>>>>> this might mean keeping additional records at the registry level.
>>> >>>>>>>
>>> >>>>>> In addition we should also have a way of cleaning up unsubmitted
>>> >>>>>> experiment ids. (But not sure whether you want to address this
>>> right
>>> >>>> now).
>>> >>>>>> The way I see this is to have a periodic thread which goes
>>> through the
>>> >>>>>> table and clear up experiments which are not submitted for a
>>> defined
>>> >>>> time.
>>> >>>>> +1. Something else we may have to think of later is the data
>>> archiving
>>> >>>>> capabilities. We keep running in to performance issues when the
>>> >> database
>>> >>>>> grows with experiment results. Unless we become experts of
>>> distributed
>>> >>>>> database management we should have a way better way to manage our
>>> db
>>> >>>>> performance issues.
>>> >>>>>
>>> >>>>>
>>> >>>>>> BTW, nice review notes, Saminda.
>>> >>>>>>
>>> >>>>>> Thanks
>>> >>>>>> Amila
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>
>>> >>
>>> >
>>>
>>>
>>
>>
>> --
>> Thanks,
>> Sachith Withana
>>
>>
>


-- 
Thanks,
Sachith Withana

Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Mon, Jan 20, 2014 at 8:38 AM, Sachith Withana <sw...@gmail.com>wrote:

> Okay. Will do. Will send you a draft asap.
>
>
> On Mon, Jan 20, 2014 at 11:17 AM, Marlon Pierce <ma...@iu.edu> wrote:
>
>> I think we want to make everything explicit.  The Airavata API is
>> intended for external clients, not communications between Airavata
>> components (the SPI).  Your figure at
>>
>> https://cwiki.apache.org/confluence/display/AIRAVATA/Simple+Gateway+Developer+Guide
>> summarizes this nicely.  You'll need to explain both the API (what a
>> gateway does) and the SPI (how Airavata components work together) to do
>> this.
>>
> +1. Like you did for the gateway developer guide, an initial draft, at least
with a bullet-point wiki, would give you some quick feedback. Try and use
some flow diagrams to explain activities.

>
>>
>> Marlon
>>
>> On 1/20/14 11:09 AM, Sachith Withana wrote:
>> > We are using internal SPIs which are not reflected in the Airavata API.
>> > should it be explained or just make it a higher level diagram which
>> won't
>> > show the SPIs?
>> >
>> >
>> > On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <ma...@iu.edu>
>> wrote:
>> >
>> >> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
>> >> little more?
>> >>
>> >>
>> >> Marlon
>> >>
>> >> On 1/20/14 10:53 AM, Sachith Withana wrote:
>> >>> Hi All,
>> >>>
>> >>> I will go ahead and create the Wiki on the Orchestrator. Will send you
>> >> all
>> >>> a draft as soon as I can.
>> >>>
>> >>> One question though, Do we have to explicitly show the SPIs and APIs
>> >> both?
>> >>>
>> >>> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu>
>> wrote:
>> >>>
>> >>>> +1 for real use cases first. We have at least 3.  But I'm sure we
>> will
>> >>>> want to make it as easy as possible for developers to pass back the
>> >>>> correct, created experimentID when invoking launchExperiment.
>> >>>>
>> >>>>
>> >>>> Marlon
>> >>>>
>> >>>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>> >>>>> Marlon, I think until we put this to real use we wont get much
>> feedback
>> >>>> on
>> >>>>> what aspects we should focus on more and in what features we should
>> >>>> expand
>> >>>>> or prioritize on. So how about having a test plan for the
>> Orchestrator.
>> >>>>> Expose it to real usecases and see how it will survive. WDYT?
>> >>>>>
>> >>>>> It might be a little confusing to return a "JobRequest" object from
>> the
>> >>>>> Orchestrator (since its a response). Or perhaps it should be
>> renamed?
>> >>>>>
>> >>>>> Sachith, I think we should have a google hangout or a separate mail
>> >>>> thread
>> >>>>> (or both) to discuss muti-threaded support. Could you organize this
>> >>>> please?
>> >>>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>> >>>>> <th...@gmail.com>wrote:
>> >>>>>
>> >>>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
>> >> samindaw@gmail.com
>> >>>>> wrote:
>> >>>>>>> Following are few thoughts I had during my review of the
>> component,
>> >>>>>>>
>> >>>>>>> *Multi-threaded vs single threaded*
>> >>>>>>> If we are going to have multi-threaded job submission the
>> >>>> implementation
>> >>>>>>> should work on handling race conditions. Essentially JobSubmitter
>> >>>> should be
>> >>>>>>> able to "lock" an experiment request before continuing processing
>> >> that
>> >>>>>>> request so that other JobSubmitters accessing the experiment
>> requests
>> >>>> a the
>> >>>>>>> same time would skip it.
>> >>>>>>>
>> >>>>>> +1. These are implementation details.
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Orchestrator service*
>> >>>>>>> We might want to think of the possibility in future where we will
>> be
>> >>>>>>> having multiple deployments of an Airavata service. This could
>> >>>> particularly
>> >>>>>>> be true for SciGaP. We may have to think how some of the internal
>> >> data
>> >>>>>>> structures/SPIs should be updated to accomodate such requirements
>> in
>> >>>> future.
>> >>>>>> +1.
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Orchestrator Component configurations*
>> >>>>>>> I see alot of places where the orchestrator can have
>> configurations.
>> >> I
>> >>>>>>> think its too early finalize them, but I think we can start
>> >> refactoring
>> >>>>>>> them out perhaps to the airavata-server.properties. I'm also
>> seeing
>> >> the
>> >>>>>>> orchestrator is now hardcoded to use default/admin gateway and
>> >>>> username. I
>> >>>>>>> think it should come from the request itself.
>> >>>>>>>
>> >>>>>> +1. But in overall we may need to change the way we handle
>> >>>> configurations
>> >>>>>> within Airavata. Currently we have multiple configuration files and
>> >>>>>> multiple places where we read configurations. IMO we should have a
>> >>>> separate
>> >>>>>> module to handle configurations. Only this module should be aware
>> how
>> >> to
>> >>>>>> intepret configurations in the file and provide a component
>> interface
>> >> to
>> >>>>>> access those configuration values.
>> >>>>>>
>> >>>>> +1 we tried this once with "ServerSettings" and
>> "ApplicationSettings",
>> >>>> but
>> >>>>> apparently again more configuration files seems to have spawned. So
>> far
>> >>>>> however they seemed to be localized for their component now.
>> >>>>>
>> >>>>>>> *Visibility of API functions*
>> >>>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>> >>>> should
>> >>>>>>> not be part of the API because I don't see a scenario where the
>> >> gateway
>> >>>>>>> developer would be responsible for using them. They serve a more
>> >>>> internal
>> >>>>>>> purpose of managing the orchestrator component IMO. As Amila
>> pointed
>> >>>> out so
>> >>>>>>> long ago (wink) functions that do not concern outside parties
>> should
>> >>>> not be
>> >>>>>>> used as part of the API.
>> >>>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>>
>> >>>>>>> *Return values of Orchestrator API*
>> >>>>>>> IMO unless it is specifically required to do so I think the
>> functions
>> >>>>>>> does not necessarily need to return anything other than throw
>> >>>> exceptions
>> >>>>>>> when needed. For example the launchExperiment can simply return
>> void
>> >>>> if all
>> >>>>>>> is succesful and return an exception if something fails. Handling
>> >>>> issues
>> >>>>>>> with a try catch is not only simpler but also the explanations are
>> >>>> readily
>> >>>>>>> available for the user.
>> >>>>>>>
>> >>>>>> +1. Also try to have different exception for different scenarios.
>> For
>> >>>>>> example if persistence (hypothetical) fails,
>> >>>> DatabasePersistenceException,
>> >>>>>> if validation fails, ValidationFailedException etc ... Then the
>> >>>> developer
>> >>>>>> who uses the API can catch these different exceptions and act on
>> them
>> >>>>>> appropriately.
>> >>>>>>
>> >>>>> +1. What needs to be understood here is that the Exception should
>> be a
>> >>>>> Gateway friendly exception. i.e. it should not expose internal
>> details
>> >> of
>> >>>>> Airavata at the top-level exception and exception message should be
>> >> self
>> >>>>> explanatory enough for the gateway developer not to remain
>> scratching
>> >>>>> his/her head after reading the exception. A feedback from Sudhakar
>> >>>> sometime
>> >>>>> back was to provide suggestions in the exception message on how to
>> >>>> resolve
>> >>>>> the issue.
>> >>>>>
>> >>>>>>> *Data persisted in registry*
>> >>>>>>> ExperimentRequest.getUsername() : I think we should clarify what
>> this
>> >>>>>>> username denotes. In current API, in experiment submission we
>> >> consider
>> >>>> two
>> >>>>>>> types of users. Submission user (the user who submits the
>> experiment
>> >>>> to the
>> >>>>>>> Airavata Server - this is inferred by the request itself) and the
>> >>>> execution
>> >>>>>>> user (the user who corelates to the application executions of the
>> >>>> gateway -
>> >>>>>>> thus this user can be a different user for different gateway, eg:
>> >>>> community
>> >>>>>>> user, gateway user).
>> >>>>>>> I think we should persist the date/time of the experiment request
>> as
>> >>>>>>> well.
>> >>>>>>>
>> >>>>>> +1
>> >>>>>>
>> >>>>>>>  Also when retrying of API functions in the case of a failure in
>> an
>> >>>>>>> previous attempt there should be a way to not to repeat already
>> >>>> performed
>> >>>>>>> steps or gracefully roleback and redo those required steps as
>> >>>> necessary.
>> >>>>>>> While such actions could be transparent to the user sometimes it
>> >> might
>> >>>> make
>> >>>>>>> sense to allow user to be notified of success/failure of a retry.
>> >>>> However
>> >>>>>>> this might mean keeping additional records at the registry level.
>> >>>>>>>
>> >>>>>> In addition we should also have a way of cleaning up unsubmitted
>> >>>>>> experiment ids. (But not sure whether you want to address this
>> right
>> >>>> now).
>> >>>>>> The way I see this is to have a periodic thread which goes through
>> the
>> >>>>>> table and clear up experiments which are not submitted for a
>> defined
>> >>>> time.
>> >>>>> +1. Something else we may have to think of later is the data
>> archiving
>> >>>>> capabilities. We keep running in to performance issues when the
>> >> database
>> >>>>> grows with experiment results. Unless we become experts of
>> distributed
>> >>>>> database management we should have a way better way to manage our db
>> >>>>> performance issues.
>> >>>>>
>> >>>>>
>> >>>>>> BTW, nice review notes, Saminda.
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Amila
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>
>> >
>>
>>
>
>
> --
> Thanks,
> Sachith Withana
>
>

Re: Orchestration Component implementation review

Posted by Sachith Withana <sw...@gmail.com>.
Okay. Will do. Will send you a draft asap.


On Mon, Jan 20, 2014 at 11:17 AM, Marlon Pierce <ma...@iu.edu> wrote:

> I think we want to make everything explicit.  The Airavata API is
> intended for external clients, not communications between Airavata
> components (the SPI).  Your figure at
>
> https://cwiki.apache.org/confluence/display/AIRAVATA/Simple+Gateway+Developer+Guide
> summarizes this nicely.  You'll need to explain both the API (what a
> gateway does) and the SPI (how Airavata components work together) to do
> this.
>
>
> Marlon
>
> On 1/20/14 11:09 AM, Sachith Withana wrote:
> > We are using internal SPIs which are not reflected in the Airavata API.
> > should it be explained or just make it a higher level diagram which won't
> > show the SPIs?
> >
> >
> > On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <ma...@iu.edu> wrote:
> >
> >> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
> >> little more?
> >>
> >>
> >> Marlon
> >>
> >> On 1/20/14 10:53 AM, Sachith Withana wrote:
> >>> Hi All,
> >>>
> >>> I will go ahead and create the Wiki on the Orchestrator. Will send you
> >> all
> >>> a draft as soon as I can.
> >>>
> >>> One question though, Do we have to explicitly show the SPIs and APIs
> >> both?
> >>>
> >>> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu>
> wrote:
> >>>
> >>>> +1 for real use cases first. We have at least 3.  But I'm sure we will
> >>>> want to make it as easy as possible for developers to pass back the
> >>>> correct, created experimentID when invoking launchExperiment.
> >>>>
> >>>>
> >>>> Marlon
> >>>>
> >>>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> >>>>> Marlon, I think until we put this to real use we wont get much
> feedback
> >>>> on
> >>>>> what aspects we should focus on more and in what features we should
> >>>> expand
> >>>>> or prioritize on. So how about having a test plan for the
> Orchestrator.
> >>>>> Expose it to real usecases and see how it will survive. WDYT?
> >>>>>
> >>>>> It might be a little confusing to return a "JobRequest" object from
> the
> >>>>> Orchestrator (since its a response). Or perhaps it should be renamed?
> >>>>>
> >>>>> Sachith, I think we should have a google hangout or a separate mail
> >>>> thread
> >>>>> (or both) to discuss muti-threaded support. Could you organize this
> >>>> please?
> >>>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> >>>>> <th...@gmail.com>wrote:
> >>>>>
> >>>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
> >> samindaw@gmail.com
> >>>>> wrote:
> >>>>>>> Following are few thoughts I had during my review of the component,
> >>>>>>>
> >>>>>>> *Multi-threaded vs single threaded*
> >>>>>>> If we are going to have multi-threaded job submission the
> >>>> implementation
> >>>>>>> should work on handling race conditions. Essentially JobSubmitter
> >>>> should be
> >>>>>>> able to "lock" an experiment request before continuing processing
> >> that
> >>>>>>> request so that other JobSubmitters accessing the experiment
> requests
> >>>> a the
> >>>>>>> same time would skip it.
> >>>>>>>
> >>>>>> +1. These are implementation details.
> >>>>>>
> >>>>>>
> >>>>>>> *Orchestrator service*
> >>>>>>> We might want to think of the possibility in future where we will
> be
> >>>>>>> having multiple deployments of an Airavata service. This could
> >>>> particularly
> >>>>>>> be true for SciGaP. We may have to think how some of the internal
> >> data
> >>>>>>> structures/SPIs should be updated to accomodate such requirements
> in
> >>>> future.
> >>>>>> +1.
> >>>>>>
> >>>>>>
> >>>>>>> *Orchestrator Component configurations*
> >>>>>>> I see alot of places where the orchestrator can have
> configurations.
> >> I
> >>>>>>> think its too early finalize them, but I think we can start
> >> refactoring
> >>>>>>> them out perhaps to the airavata-server.properties. I'm also seeing
> >> the
> >>>>>>> orchestrator is now hardcoded to use default/admin gateway and
> >>>> username. I
> >>>>>>> think it should come from the request itself.
> >>>>>>>
> >>>>>> +1. But in overall we may need to change the way we handle
> >>>> configurations
> >>>>>> within Airavata. Currently we have multiple configuration files and
> >>>>>> multiple places where we read configurations. IMO we should have a
> >>>> separate
> >>>>>> module to handle configurations. Only this module should be aware
> how
> >> to
> >>>>>> intepret configurations in the file and provide a component
> interface
> >> to
> >>>>>> access those configuration values.
> >>>>>>
> >>>>> +1 we tried this once with "ServerSettings" and
> "ApplicationSettings",
> >>>> but
> >>>>> apparently again more configuration files seems to have spawned. So
> far
> >>>>> however they seemed to be localized for their component now.
> >>>>>
> >>>>>>> *Visibility of API functions*
> >>>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
> >>>> should
> >>>>>>> not be part of the API because I don't see a scenario where the
> >> gateway
> >>>>>>> developer would be responsible for using them. They serve a more
> >>>> internal
> >>>>>>> purpose of managing the orchestrator component IMO. As Amila
> pointed
> >>>> out so
> >>>>>>> long ago (wink) functions that do not concern outside parties
> should
> >>>> not be
> >>>>>>> used as part of the API.
> >>>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>>
> >>>>>>> *Return values of Orchestrator API*
> >>>>>>> IMO unless it is specifically required to do so I think the
> functions
> >>>>>>> does not necessarily need to return anything other than throw
> >>>> exceptions
> >>>>>>> when needed. For example the launchExperiment can simply return
> void
> >>>> if all
> >>>>>>> is succesful and return an exception if something fails. Handling
> >>>> issues
> >>>>>>> with a try catch is not only simpler but also the explanations are
> >>>> readily
> >>>>>>> available for the user.
> >>>>>>>
> >>>>>> +1. Also try to have different exception for different scenarios.
> For
> >>>>>> example if persistence (hypothetical) fails,
> >>>> DatabasePersistenceException,
> >>>>>> if validation fails, ValidationFailedException etc ... Then the
> >>>> developer
> >>>>>> who uses the API can catch these different exceptions and act on
> them
> >>>>>> appropriately.
> >>>>>>
> >>>>> +1. What needs to be understood here is that the Exception should be
> a
> >>>>> Gateway friendly exception. i.e. it should not expose internal
> details
> >> of
> >>>>> Airavata at the top-level exception and exception message should be
> >> self
> >>>>> explanatory enough for the gateway developer not to remain scratching
> >>>>> his/her head after reading the exception. A feedback from Sudhakar
> >>>> sometime
> >>>>> back was to provide suggestions in the exception message on how to
> >>>> resolve
> >>>>> the issue.
> >>>>>
> >>>>>>> *Data persisted in registry*
> >>>>>>> ExperimentRequest.getUsername() : I think we should clarify what
> this
> >>>>>>> username denotes. In current API, in experiment submission we
> >> consider
> >>>> two
> >>>>>>> types of users. Submission user (the user who submits the
> experiment
> >>>> to the
> >>>>>>> Airavata Server - this is inferred by the request itself) and the
> >>>> execution
> >>>>>>> user (the user who corelates to the application executions of the
> >>>> gateway -
> >>>>>>> thus this user can be a different user for different gateway, eg:
> >>>> community
> >>>>>>> user, gateway user).
> >>>>>>> I think we should persist the date/time of the experiment request
> as
> >>>>>>> well.
> >>>>>>>
> >>>>>> +1
> >>>>>>
> >>>>>>>  Also when retrying of API functions in the case of a failure in an
> >>>>>>> previous attempt there should be a way to not to repeat already
> >>>> performed
> >>>>>>> steps or gracefully roleback and redo those required steps as
> >>>> necessary.
> >>>>>>> While such actions could be transparent to the user sometimes it
> >> might
> >>>> make
> >>>>>>> sense to allow user to be notified of success/failure of a retry.
> >>>> However
> >>>>>>> this might mean keeping additional records at the registry level.
> >>>>>>>
> >>>>>> In addition we should also have a way of cleaning up unsubmitted
> >>>>>> experiment ids. (But not sure whether you want to address this right
> >>>> now).
> >>>>>> The way I see this is to have a periodic thread which goes through
> the
> >>>>>> table and clear up experiments which are not submitted for a defined
> >>>> time.
> >>>>> +1. Something else we may have to think of later is the data
> archiving
> >>>>> capabilities. We keep running in to performance issues when the
> >> database
> >>>>> grows with experiment results. Unless we become experts of
> distributed
> >>>>> database management we should have a way better way to manage our db
> >>>>> performance issues.
> >>>>>
> >>>>>
> >>>>>> BTW, nice review notes, Saminda.
> >>>>>>
> >>>>>> Thanks
> >>>>>> Amila
> >>>>>>
> >>>>>>
> >>>>>>
> >>
> >
>
>


-- 
Thanks,
Sachith Withana

Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
I think we want to make everything explicit.  The Airavata API is
intended for external clients, not communications between Airavata
components (the SPI).  Your figure at
https://cwiki.apache.org/confluence/display/AIRAVATA/Simple+Gateway+Developer+Guide
summarizes this nicely.  You'll need to explain both the API (what a
gateway does) and the SPI (how Airavata components work together) to do
this.


Marlon

On 1/20/14 11:09 AM, Sachith Withana wrote:
> We are using internal SPIs which are not reflected in the Airavata API.
> should it be explained or just make it a higher level diagram which won't
> show the SPIs?
>
>
> On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <ma...@iu.edu> wrote:
>
>> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
>> little more?
>>
>>
>> Marlon
>>
>> On 1/20/14 10:53 AM, Sachith Withana wrote:
>>> Hi All,
>>>
>>> I will go ahead and create the Wiki on the Orchestrator. Will send you
>> all
>>> a draft as soon as I can.
>>>
>>> One question though, Do we have to explicitly show the SPIs and APIs
>> both?
>>>
>>> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu> wrote:
>>>
>>>> +1 for real use cases first. We have at least 3.  But I'm sure we will
>>>> want to make it as easy as possible for developers to pass back the
>>>> correct, created experimentID when invoking launchExperiment.
>>>>
>>>>
>>>> Marlon
>>>>
>>>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>>>>> Marlon, I think until we put this to real use we wont get much feedback
>>>> on
>>>>> what aspects we should focus on more and in what features we should
>>>> expand
>>>>> or prioritize on. So how about having a test plan for the Orchestrator.
>>>>> Expose it to real usecases and see how it will survive. WDYT?
>>>>>
>>>>> It might be a little confusing to return a "JobRequest" object from the
>>>>> Orchestrator (since its a response). Or perhaps it should be renamed?
>>>>>
>>>>> Sachith, I think we should have a google hangout or a separate mail
>>>> thread
>>>>> (or both) to discuss muti-threaded support. Could you organize this
>>>> please?
>>>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>>>>> <th...@gmail.com>wrote:
>>>>>
>>>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
>> samindaw@gmail.com
>>>>> wrote:
>>>>>>> Following are few thoughts I had during my review of the component,
>>>>>>>
>>>>>>> *Multi-threaded vs single threaded*
>>>>>>> If we are going to have multi-threaded job submission the
>>>> implementation
>>>>>>> should work on handling race conditions. Essentially JobSubmitter
>>>> should be
>>>>>>> able to "lock" an experiment request before continuing processing
>> that
>>>>>>> request so that other JobSubmitters accessing the experiment requests
>>>> a the
>>>>>>> same time would skip it.
>>>>>>>
>>>>>> +1. These are implementation details.
>>>>>>
>>>>>>
>>>>>>> *Orchestrator service*
>>>>>>> We might want to think of the possibility in future where we will be
>>>>>>> having multiple deployments of an Airavata service. This could
>>>> particularly
>>>>>>> be true for SciGaP. We may have to think how some of the internal
>> data
>>>>>>> structures/SPIs should be updated to accomodate such requirements in
>>>> future.
>>>>>> +1.
>>>>>>
>>>>>>
>>>>>>> *Orchestrator Component configurations*
>>>>>>> I see alot of places where the orchestrator can have configurations.
>> I
>>>>>>> think its too early finalize them, but I think we can start
>> refactoring
>>>>>>> them out perhaps to the airavata-server.properties. I'm also seeing
>> the
>>>>>>> orchestrator is now hardcoded to use default/admin gateway and
>>>> username. I
>>>>>>> think it should come from the request itself.
>>>>>>>
>>>>>> +1. But in overall we may need to change the way we handle
>>>> configurations
>>>>>> within Airavata. Currently we have multiple configuration files and
>>>>>> multiple places where we read configurations. IMO we should have a
>>>> separate
>>>>>> module to handle configurations. Only this module should be aware how
>> to
>>>>>> intepret configurations in the file and provide a component interface
>> to
>>>>>> access those configuration values.
>>>>>>
>>>>> +1 we tried this once with "ServerSettings" and "ApplicationSettings",
>>>> but
>>>>> apparently again more configuration files seems to have spawned. So far
>>>>> however they seemed to be localized for their component now.
>>>>>
>>>>>>> *Visibility of API functions*
>>>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>>> should
>>>>>>> not be part of the API because I don't see a scenario where the
>> gateway
>>>>>>> developer would be responsible for using them. They serve a more
>>>> internal
>>>>>>> purpose of managing the orchestrator component IMO. As Amila pointed
>>>> out so
>>>>>>> long ago (wink) functions that do not concern outside parties should
>>>> not be
>>>>>>> used as part of the API.
>>>>>>>
>>>>>> +1
>>>>>>
>>>>>>
>>>>>>> *Return values of Orchestrator API*
>>>>>>> IMO unless it is specifically required to do so I think the functions
>>>>>>> does not necessarily need to return anything other than throw
>>>> exceptions
>>>>>>> when needed. For example the launchExperiment can simply return void
>>>> if all
>>>>>>> is succesful and return an exception if something fails. Handling
>>>> issues
>>>>>>> with a try catch is not only simpler but also the explanations are
>>>> readily
>>>>>>> available for the user.
>>>>>>>
>>>>>> +1. Also try to have different exception for different scenarios. For
>>>>>> example if persistence (hypothetical) fails,
>>>> DatabasePersistenceException,
>>>>>> if validation fails, ValidationFailedException etc ... Then the
>>>> developer
>>>>>> who uses the API can catch these different exceptions and act on them
>>>>>> appropriately.
>>>>>>
>>>>> +1. What needs to be understood here is that the Exception should be a
>>>>> Gateway friendly exception. i.e. it should not expose internal details
>> of
>>>>> Airavata at the top-level exception and exception message should be
>> self
>>>>> explanatory enough for the gateway developer not to remain scratching
>>>>> his/her head after reading the exception. A feedback from Sudhakar
>>>> sometime
>>>>> back was to provide suggestions in the exception message on how to
>>>> resolve
>>>>> the issue.
>>>>>
>>>>>>> *Data persisted in registry*
>>>>>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>>>>>> username denotes. In current API, in experiment submission we
>> consider
>>>> two
>>>>>>> types of users. Submission user (the user who submits the experiment
>>>> to the
>>>>>>> Airavata Server - this is inferred by the request itself) and the
>>>> execution
>>>>>>> user (the user who corelates to the application executions of the
>>>> gateway -
>>>>>>> thus this user can be a different user for different gateway, eg:
>>>> community
>>>>>>> user, gateway user).
>>>>>>> I think we should persist the date/time of the experiment request as
>>>>>>> well.
>>>>>>>
>>>>>> +1
>>>>>>
>>>>>>>  Also when retrying of API functions in the case of a failure in an
>>>>>>> previous attempt there should be a way to not to repeat already
>>>> performed
>>>>>>> steps or gracefully roleback and redo those required steps as
>>>> necessary.
>>>>>>> While such actions could be transparent to the user sometimes it
>> might
>>>> make
>>>>>>> sense to allow user to be notified of success/failure of a retry.
>>>> However
>>>>>>> this might mean keeping additional records at the registry level.
>>>>>>>
>>>>>> In addition we should also have a way of cleaning up unsubmitted
>>>>>> experiment ids. (But not sure whether you want to address this right
>>>> now).
>>>>>> The way I see this is to have a periodic thread which goes through the
>>>>>> table and clear up experiments which are not submitted for a defined
>>>> time.
>>>>> +1. Something else we may have to think of later is the data archiving
>>>>> capabilities. We keep running in to performance issues when the
>> database
>>>>> grows with experiment results. Unless we become experts of distributed
>>>>> database management we should have a way better way to manage our db
>>>>> performance issues.
>>>>>
>>>>>
>>>>>> BTW, nice review notes, Saminda.
>>>>>>
>>>>>> Thanks
>>>>>> Amila
>>>>>>
>>>>>>
>>>>>>
>>
>


Re: Orchestration Component implementation review

Posted by Sachith Withana <sw...@gmail.com>.
We are using internal SPIs which are not reflected in the Airavata API.
Should they be explained, or should it just be a higher-level diagram which
won't show the SPIs?


On Mon, Jan 20, 2014 at 11:06 AM, Marlon Pierce <ma...@iu.edu> wrote:

> Thanks, Sachith.  Can you explain your question about APIs and SPIs a
> little more?
>
>
> Marlon
>
> On 1/20/14 10:53 AM, Sachith Withana wrote:
> > Hi All,
> >
> > I will go ahead and create the Wiki on the Orchestrator. Will send you
> all
> > a draft as soon as I can.
> >
> > One question though, Do we have to explicitly show the SPIs and APIs
> both?
> >
> >
> > On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu> wrote:
> >
> >> +1 for real use cases first. We have at least 3.  But I'm sure we will
> >> want to make it as easy as possible for developers to pass back the
> >> correct, created experimentID when invoking launchExperiment.
> >>
> >>
> >> Marlon
> >>
> >> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> >>> Marlon, I think until we put this to real use we wont get much feedback
> >> on
> >>> what aspects we should focus on more and in what features we should
> >> expand
> >>> or prioritize on. So how about having a test plan for the Orchestrator.
> >>> Expose it to real usecases and see how it will survive. WDYT?
> >>>
> >>> It might be a little confusing to return a "JobRequest" object from the
> >>> Orchestrator (since its a response). Or perhaps it should be renamed?
> >>>
> >>> Sachith, I think we should have a google hangout or a separate mail
> >> thread
> >>> (or both) to discuss muti-threaded support. Could you organize this
> >> please?
> >>>
> >>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> >>> <th...@gmail.com>wrote:
> >>>
> >>>>
> >>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <
> samindaw@gmail.com
> >>> wrote:
> >>>>> Following are few thoughts I had during my review of the component,
> >>>>>
> >>>>> *Multi-threaded vs single threaded*
> >>>>> If we are going to have multi-threaded job submission the
> >> implementation
> >>>>> should work on handling race conditions. Essentially JobSubmitter
> >> should be
> >>>>> able to "lock" an experiment request before continuing processing
> that
> >>>>> request so that other JobSubmitters accessing the experiment requests
> >> a the
> >>>>> same time would skip it.
> >>>>>
> >>>> +1. These are implementation details.
> >>>>
> >>>>
> >>>>> *Orchestrator service*
> >>>>> We might want to think of the possibility in future where we will be
> >>>>> having multiple deployments of an Airavata service. This could
> >> particularly
> >>>>> be true for SciGaP. We may have to think how some of the internal
> data
> >>>>> structures/SPIs should be updated to accomodate such requirements in
> >> future.
> >>>> +1.
> >>>>
> >>>>
> >>>>> *Orchestrator Component configurations*
> >>>>> I see alot of places where the orchestrator can have configurations.
> I
> >>>>> think its too early finalize them, but I think we can start
> refactoring
> >>>>> them out perhaps to the airavata-server.properties. I'm also seeing
> the
> >>>>> orchestrator is now hardcoded to use default/admin gateway and
> >> username. I
> >>>>> think it should come from the request itself.
> >>>>>
> >>>> +1. But in overall we may need to change the way we handle
> >> configurations
> >>>> within Airavata. Currently we have multiple configuration files and
> >>>> multiple places where we read configurations. IMO we should have a
> >> separate
> >>>> module to handle configurations. Only this module should be aware how
> to
> >>>> intepret configurations in the file and provide a component interface
> to
> >>>> access those configuration values.
> >>>>
> >>> +1 we tried this once with "ServerSettings" and "ApplicationSettings",
> >> but
> >>> apparently again more configuration files seems to have spawned. So far
> >>> however they seemed to be localized for their component now.
> >>>
> >>>>> *Visibility of API functions*
> >>>>> I think initialize(), shutdown() and startJobSubmitter() functions
> >> should
> >>>>> not be part of the API because I don't see a scenario where the
> gateway
> >>>>> developer would be responsible for using them. They serve a more
> >> internal
> >>>>> purpose of managing the orchestrator component IMO. As Amila pointed
> >> out so
> >>>>> long ago (wink) functions that do not concern outside parties should
> >> not be
> >>>>> used as part of the API.
> >>>>>
> >>>> +1
> >>>>
> >>>>
> >>>>> *Return values of Orchestrator API*
> >>>>> IMO unless it is specifically required to do so I think the functions
> >>>>> does not necessarily need to return anything other than throw
> >> exceptions
> >>>>> when needed. For example the launchExperiment can simply return void
> >> if all
> >>>>> is succesful and return an exception if something fails. Handling
> >> issues
> >>>>> with a try catch is not only simpler but also the explanations are
> >> readily
> >>>>> available for the user.
> >>>>>
> >>>> +1. Also try to have different exception for different scenarios. For
> >>>> example if persistence (hypothetical) fails,
> >> DatabasePersistenceException,
> >>>> if validation fails, ValidationFailedException etc ... Then the
> >> developer
> >>>> who uses the API can catch these different exceptions and act on them
> >>>> appropriately.
> >>>>
> >>> +1. What needs to be understood here is that the Exception should be a
> >>> Gateway friendly exception. i.e. it should not expose internal details
> of
> >>> Airavata at the top-level exception and exception message should be
> self
> >>> explanatory enough for the gateway developer not to remain scratching
> >>> his/her head after reading the exception. A feedback from Sudhakar
> >> sometime
> >>> back was to provide suggestions in the exception message on how to
> >> resolve
> >>> the issue.
> >>>
> >>>>> *Data persisted in registry*
> >>>>> ExperimentRequest.getUsername() : I think we should clarify what this
> >>>>> username denotes. In current API, in experiment submission we
> consider
> >> two
> >>>>> types of users. Submission user (the user who submits the experiment
> >> to the
> >>>>> Airavata Server - this is inferred by the request itself) and the
> >> execution
> >>>>> user (the user who corelates to the application executions of the
> >> gateway -
> >>>>> thus this user can be a different user for different gateway, eg:
> >> community
> >>>>> user, gateway user).
> >>>>> I think we should persist the date/time of the experiment request as
> >>>>> well.
> >>>>>
> >>>> +1
> >>>>
> >>>>>  Also when retrying of API functions in the case of a failure in an
> >>>>> previous attempt there should be a way to not to repeat already
> >> performed
> >>>>> steps or gracefully roleback and redo those required steps as
> >> necessary.
> >>>>> While such actions could be transparent to the user sometimes it
> might
> >> make
> >>>>> sense to allow user to be notified of success/failure of a retry.
> >> However
> >>>>> this might mean keeping additional records at the registry level.
> >>>>>
> >>>> In addition we should also have a way of cleaning up unsubmitted
> >>>> experiment ids. (But not sure whether you want to address this right
> >> now).
> >>>> The way I see this is to have a periodic thread which goes through the
> >>>> table and clear up experiments which are not submitted for a defined
> >> time.
> >>> +1. Something else we may have to think of later is the data archiving
> >>> capabilities. We keep running in to performance issues when the
> database
> >>> grows with experiment results. Unless we become experts of distributed
> >>> database management we should have a way better way to manage our db
> >>> performance issues.
> >>>
> >>>
> >>>> BTW, nice review notes, Saminda.
> >>>>
> >>>> Thanks
> >>>> Amila
> >>>>
> >>>>
> >>>>
> >>
> >
>
>


-- 
Thanks,
Sachith Withana

Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
Thanks, Sachith.  Can you explain your question about APIs and SPIs a
little more? 


Marlon

On 1/20/14 10:53 AM, Sachith Withana wrote:
> Hi All,
>
> I will go ahead and create the Wiki on the Orchestrator. Will send you all
> a draft as soon as I can.
>
> One question though, Do we have to explicitly show the SPIs and APIs both?
>
>
> On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu> wrote:
>
>> +1 for real use cases first. We have at least 3.  But I'm sure we will
>> want to make it as easy as possible for developers to pass back the
>> correct, created experimentID when invoking launchExperiment.
>>
>>
>> Marlon
>>
>> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
>>> Marlon, I think until we put this to real use we wont get much feedback
>> on
>>> what aspects we should focus on more and in what features we should
>> expand
>>> or prioritize on. So how about having a test plan for the Orchestrator.
>>> Expose it to real usecases and see how it will survive. WDYT?
>>>
>>> It might be a little confusing to return a "JobRequest" object from the
>>> Orchestrator (since its a response). Or perhaps it should be renamed?
>>>
>>> Sachith, I think we should have a google hangout or a separate mail
>> thread
>>> (or both) to discuss muti-threaded support. Could you organize this
>> please?
>>>
>>> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
>>> <th...@gmail.com>wrote:
>>>
>>>>
>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <samindaw@gmail.com
>>> wrote:
>>>>> Following are few thoughts I had during my review of the component,
>>>>>
>>>>> *Multi-threaded vs single threaded*
>>>>> If we are going to have multi-threaded job submission the
>> implementation
>>>>> should work on handling race conditions. Essentially JobSubmitter
>> should be
>>>>> able to "lock" an experiment request before continuing processing that
>>>>> request so that other JobSubmitters accessing the experiment requests
>> a the
>>>>> same time would skip it.
>>>>>
>>>> +1. These are implementation details.
>>>>
>>>>
>>>>> *Orchestrator service*
>>>>> We might want to think of the possibility in future where we will be
>>>>> having multiple deployments of an Airavata service. This could
>> particularly
>>>>> be true for SciGaP. We may have to think how some of the internal data
>>>>> structures/SPIs should be updated to accomodate such requirements in
>> future.
>>>> +1.
>>>>
>>>>
>>>>> *Orchestrator Component configurations*
>>>>> I see alot of places where the orchestrator can have configurations. I
>>>>> think its too early finalize them, but I think we can start refactoring
>>>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>>>> orchestrator is now hardcoded to use default/admin gateway and
>> username. I
>>>>> think it should come from the request itself.
>>>>>
>>>> +1. But in overall we may need to change the way we handle
>> configurations
>>>> within Airavata. Currently we have multiple configuration files and
>>>> multiple places where we read configurations. IMO we should have a
>> separate
>>>> module to handle configurations. Only this module should be aware how to
>>>> intepret configurations in the file and provide a component interface to
>>>> access those configuration values.
>>>>
>>> +1 we tried this once with "ServerSettings" and "ApplicationSettings",
>> but
>>> apparently again more configuration files seems to have spawned. So far
>>> however they seemed to be localized for their component now.
>>>
>>>>> *Visibility of API functions*
>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>> should
>>>>> not be part of the API because I don't see a scenario where the gateway
>>>>> developer would be responsible for using them. They serve a more
>> internal
>>>>> purpose of managing the orchestrator component IMO. As Amila pointed
>> out so
>>>>> long ago (wink) functions that do not concern outside parties should
>> not be
>>>>> used as part of the API.
>>>>>
>>>> +1
>>>>
>>>>
>>>>> *Return values of Orchestrator API*
>>>>> IMO unless it is specifically required to do so I think the functions
>>>>> does not necessarily need to return anything other than throw
>> exceptions
>>>>> when needed. For example the launchExperiment can simply return void
>> if all
>>>>> is succesful and return an exception if something fails. Handling
>> issues
>>>>> with a try catch is not only simpler but also the explanations are
>> readily
>>>>> available for the user.
>>>>>
>>>> +1. Also try to have different exception for different scenarios. For
>>>> example if persistence (hypothetical) fails,
>> DatabasePersistenceException,
>>>> if validation fails, ValidationFailedException etc ... Then the
>> developer
>>>> who uses the API can catch these different exceptions and act on them
>>>> appropriately.
>>>>
>>> +1. What needs to be understood here is that the Exception should be a
>>> Gateway friendly exception. i.e. it should not expose internal details of
>>> Airavata at the top-level exception and exception message should be self
>>> explanatory enough for the gateway developer not to remain scratching
>>> his/her head after reading the exception. A feedback from Sudhakar
>> sometime
>>> back was to provide suggestions in the exception message on how to
>> resolve
>>> the issue.
>>>
>>>>> *Data persisted in registry*
>>>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>>>> username denotes. In current API, in experiment submission we consider
>> two
>>>>> types of users. Submission user (the user who submits the experiment
>> to the
>>>>> Airavata Server - this is inferred by the request itself) and the
>> execution
>>>>> user (the user who corelates to the application executions of the
>> gateway -
>>>>> thus this user can be a different user for different gateway, eg:
>> community
>>>>> user, gateway user).
>>>>> I think we should persist the date/time of the experiment request as
>>>>> well.
>>>>>
>>>> +1
>>>>
>>>>>  Also when retrying of API functions in the case of a failure in an
>>>>> previous attempt there should be a way to not to repeat already
>> performed
>>>>> steps or gracefully roleback and redo those required steps as
>> necessary.
>>>>> While such actions could be transparent to the user sometimes it might
>> make
>>>>> sense to allow user to be notified of success/failure of a retry.
>> However
>>>>> this might mean keeping additional records at the registry level.
>>>>>
>>>> In addition we should also have a way of cleaning up unsubmitted
>>>> experiment ids. (But not sure whether you want to address this right
>> now).
>>>> The way I see this is to have a periodic thread which goes through the
>>>> table and clear up experiments which are not submitted for a defined
>> time.
>>> +1. Something else we may have to think of later is the data archiving
>>> capabilities. We keep running in to performance issues when the database
>>> grows with experiment results. Unless we become experts of distributed
>>> database management we should have a way better way to manage our db
>>> performance issues.
>>>
>>>
>>>> BTW, nice review notes, Saminda.
>>>>
>>>> Thanks
>>>> Amila
>>>>
>>>>
>>>>
>>
>


Re: Orchestration Component implementation review

Posted by Sachith Withana <sw...@gmail.com>.
Hi All,

I will go ahead and create the Wiki on the Orchestrator. Will send you all
a draft as soon as I can.

One question though: do we have to explicitly show both the SPIs and APIs?


On Mon, Jan 20, 2014 at 9:46 AM, Marlon Pierce <ma...@iu.edu> wrote:

> +1 for real use cases first. We have at least 3.  But I'm sure we will
> want to make it as easy as possible for developers to pass back the
> correct, created experimentID when invoking launchExperiment.
>
>
> Marlon
>
> On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> > Marlon, I think until we put this to real use we wont get much feedback
> on
> > what aspects we should focus on more and in what features we should
> expand
> > or prioritize on. So how about having a test plan for the Orchestrator.
> > Expose it to real usecases and see how it will survive. WDYT?
> >
> > It might be a little confusing to return a "JobRequest" object from the
> > Orchestrator (since its a response). Or perhaps it should be renamed?
> >
> > Sachith, I think we should have a google hangout or a separate mail
> thread
> > (or both) to discuss muti-threaded support. Could you organize this
> please?
> >
> >
> > On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> > <th...@gmail.com>wrote:
> >
> >>
> >>
> >> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <samindaw@gmail.com
> >wrote:
> >>
> >>> Following are few thoughts I had during my review of the component,
> >>>
> >>> *Multi-threaded vs single threaded*
> >>> If we are going to have multi-threaded job submission the
> implementation
> >>> should work on handling race conditions. Essentially JobSubmitter
> should be
> >>> able to "lock" an experiment request before continuing processing that
> >>> request so that other JobSubmitters accessing the experiment requests
> a the
> >>> same time would skip it.
> >>>
> >> +1. These are implementation details.
> >>
> >>
> >>> *Orchestrator service*
> >>> We might want to think of the possibility in future where we will be
> >>> having multiple deployments of an Airavata service. This could
> particularly
> >>> be true for SciGaP. We may have to think how some of the internal data
> >>> structures/SPIs should be updated to accomodate such requirements in
> future.
> >>>
> >> +1.
> >>
> >>
> >>> *Orchestrator Component configurations*
> >>> I see alot of places where the orchestrator can have configurations. I
> >>> think its too early finalize them, but I think we can start refactoring
> >>> them out perhaps to the airavata-server.properties. I'm also seeing the
> >>> orchestrator is now hardcoded to use default/admin gateway and
> username. I
> >>> think it should come from the request itself.
> >>>
> >> +1. But in overall we may need to change the way we handle
> configurations
> >> within Airavata. Currently we have multiple configuration files and
> >> multiple places where we read configurations. IMO we should have a
> separate
> >> module to handle configurations. Only this module should be aware how to
> >> intepret configurations in the file and provide a component interface to
> >> access those configuration values.
> >>
> > +1 we tried this once with "ServerSettings" and "ApplicationSettings",
> but
> > apparently again more configuration files seems to have spawned. So far
> > however they seemed to be localized for their component now.
> >
> >>
> >>> *Visibility of API functions*
> >>> I think initialize(), shutdown() and startJobSubmitter() functions
> should
> >>> not be part of the API because I don't see a scenario where the gateway
> >>> developer would be responsible for using them. They serve a more
> internal
> >>> purpose of managing the orchestrator component IMO. As Amila pointed
> out so
> >>> long ago (wink) functions that do not concern outside parties should
> not be
> >>> used as part of the API.
> >>>
> >> +1
> >>
> >>
> >>> *Return values of Orchestrator API*
> >>> IMO unless it is specifically required to do so I think the functions
> >>> does not necessarily need to return anything other than throw
> exceptions
> >>> when needed. For example the launchExperiment can simply return void
> if all
> >>> is succesful and return an exception if something fails. Handling
> issues
> >>> with a try catch is not only simpler but also the explanations are
> readily
> >>> available for the user.
> >>>
> >> +1. Also try to have different exception for different scenarios. For
> >> example if persistence (hypothetical) fails,
> DatabasePersistenceException,
> >> if validation fails, ValidationFailedException etc ... Then the
> developer
> >> who uses the API can catch these different exceptions and act on them
> >> appropriately.
> >>
> > +1. What needs to be understood here is that the Exception should be a
> > Gateway friendly exception. i.e. it should not expose internal details of
> > Airavata at the top-level exception and exception message should be self
> > explanatory enough for the gateway developer not to remain scratching
> > his/her head after reading the exception. A feedback from Sudhakar
> sometime
> > back was to provide suggestions in the exception message on how to
> resolve
> > the issue.
> >
> >>
> >>> *Data persisted in registry*
> >>> ExperimentRequest.getUsername() : I think we should clarify what this
> >>> username denotes. In current API, in experiment submission we consider
> two
> >>> types of users. Submission user (the user who submits the experiment
> to the
> >>> Airavata Server - this is inferred by the request itself) and the
> execution
> >>> user (the user who corelates to the application executions of the
> gateway -
> >>> thus this user can be a different user for different gateway, eg:
> community
> >>> user, gateway user).
> >>> I think we should persist the date/time of the experiment request as
> >>> well.
> >>>
> >> +1
> >>
> >>>  Also when retrying of API functions in the case of a failure in an
> >>> previous attempt there should be a way to not to repeat already
> performed
> >>> steps or gracefully roleback and redo those required steps as
> necessary.
> >>> While such actions could be transparent to the user sometimes it might
> make
> >>> sense to allow user to be notified of success/failure of a retry.
> However
> >>> this might mean keeping additional records at the registry level.
> >>>
> >> In addition we should also have a way of cleaning up unsubmitted
> >> experiment ids. (But not sure whether you want to address this right
> now).
> >> The way I see this is to have a periodic thread which goes through the
> >> table and clear up experiments which are not submitted for a defined
> time.
> >>
> > +1. Something else we may have to think of later is the data archiving
> > capabilities. We keep running in to performance issues when the database
> > grows with experiment results. Unless we become experts of distributed
> > database management we should have a way better way to manage our db
> > performance issues.
> >
> >
> >> BTW, nice review notes, Saminda.
> >>
> >> Thanks
> >> Amila
> >>
> >>
> >>
>
>


-- 
Thanks,
Sachith Withana

Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
+1 for real use cases first. We have at least 3.  But I'm sure we will
want to make it as easy as possible for developers to pass back the
correct, created experimentID when invoking launchExperiment.


Marlon

On 1/17/14 2:57 PM, Saminda Wijeratne wrote:
> Marlon, I think until we put this to real use we wont get much feedback on
> what aspects we should focus on more and in what features we should expand
> or prioritize on. So how about having a test plan for the Orchestrator.
> Expose it to real usecases and see how it will survive. WDYT?
>
> It might be a little confusing to return a "JobRequest" object from the
> Orchestrator (since its a response). Or perhaps it should be renamed?
>
> Sachith, I think we should have a google hangout or a separate mail thread
> (or both) to discuss muti-threaded support. Could you organize this please?
>
>
> On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
> <th...@gmail.com>wrote:
>
>>
>>
>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>>
>>> Following are few thoughts I had during my review of the component,
>>>
>>> *Multi-threaded vs single threaded*
>>> If we are going to have multi-threaded job submission the implementation
>>> should work on handling race conditions. Essentially JobSubmitter should be
>>> able to "lock" an experiment request before continuing processing that
>>> request so that other JobSubmitters accessing the experiment requests a the
>>> same time would skip it.
>>>
>> +1. These are implementation details.
>>
>>
>>> *Orchestrator service*
>>> We might want to think of the possibility in future where we will be
>>> having multiple deployments of an Airavata service. This could particularly
>>> be true for SciGaP. We may have to think how some of the internal data
>>> structures/SPIs should be updated to accomodate such requirements in future.
>>>
>> +1.
>>
>>
>>> *Orchestrator Component configurations*
>>> I see alot of places where the orchestrator can have configurations. I
>>> think its too early finalize them, but I think we can start refactoring
>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>> think it should come from the request itself.
>>>
>> +1. But in overall we may need to change the way we handle configurations
>> within Airavata. Currently we have multiple configuration files and
>> multiple places where we read configurations. IMO we should have a separate
>> module to handle configurations. Only this module should be aware how to
>> intepret configurations in the file and provide a component interface to
>> access those configuration values.
>>
> +1 we tried this once with "ServerSettings" and "ApplicationSettings", but
> apparently again more configuration files seems to have spawned. So far
> however they seemed to be localized for their component now.
>
>>
>>> *Visibility of API functions*
>>> I think initialize(), shutdown() and startJobSubmitter() functions should
>>> not be part of the API because I don't see a scenario where the gateway
>>> developer would be responsible for using them. They serve a more internal
>>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>>> long ago (wink) functions that do not concern outside parties should not be
>>> used as part of the API.
>>>
>> +1
>>
>>
>>> *Return values of Orchestrator API*
>>> IMO unless it is specifically required to do so I think the functions
>>> does not necessarily need to return anything other than throw exceptions
>>> when needed. For example the launchExperiment can simply return void if all
>>> is succesful and return an exception if something fails. Handling issues
>>> with a try catch is not only simpler but also the explanations are readily
>>> available for the user.
>>>
>> +1. Also try to have different exception for different scenarios. For
>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>> if validation fails, ValidationFailedException etc ... Then the developer
>> who uses the API can catch these different exceptions and act on them
>> appropriately.
>>
> +1. What needs to be understood here is that the Exception should be a
> Gateway friendly exception. i.e. it should not expose internal details of
> Airavata at the top-level exception and exception message should be self
> explanatory enough for the gateway developer not to remain scratching
> his/her head after reading the exception. A feedback from Sudhakar sometime
> back was to provide suggestions in the exception message on how to resolve
> the issue.
>
>>
>>> *Data persisted in registry*
>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>> username denotes. In current API, in experiment submission we consider two
>>> types of users. Submission user (the user who submits the experiment to the
>>> Airavata Server - this is inferred by the request itself) and the execution
>>> user (the user who corelates to the application executions of the gateway -
>>> thus this user can be a different user for different gateway, eg: community
>>> user, gateway user).
>>> I think we should persist the date/time of the experiment request as
>>> well.
>>>
>> +1
>>
>>>  Also when retrying of API functions in the case of a failure in an
>>> previous attempt there should be a way to not to repeat already performed
>>> steps or gracefully roleback and redo those required steps as necessary.
>>> While such actions could be transparent to the user sometimes it might make
>>> sense to allow user to be notified of success/failure of a retry. However
>>> this might mean keeping additional records at the registry level.
>>>
>> In addition we should also have a way of cleaning up unsubmitted
>> experiment ids. (But not sure whether you want to address this right now).
>> The way I see this is to have a periodic thread which goes through the
>> table and clear up experiments which are not submitted for a defined time.
>>
> +1. Something else we may have to think of later is the data archiving
> capabilities. We keep running in to performance issues when the database
> grows with experiment results. Unless we become experts of distributed
> database management we should have a way better way to manage our db
> performance issues.
>
>
>> BTW, nice review notes, Saminda.
>>
>> Thanks
>> Amila
>>
>>
>>


Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
Marlon, I think until we put this to real use we won't get much feedback on
which aspects we should focus on more and which features we should expand
or prioritize. So how about having a test plan for the Orchestrator?
Expose it to real use cases and see how it survives. WDYT?

It might be a little confusing to return a "JobRequest" object from the
Orchestrator (since it is really a response). Or perhaps it should be renamed?

Sachith, I think we should have a Google Hangout or a separate mail thread
(or both) to discuss multi-threaded support. Could you organize this, please?


On Fri, Jan 17, 2014 at 10:29 AM, Amila Jayasekara
<th...@gmail.com>wrote:

>
>
>
> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>
>> Following are few thoughts I had during my review of the component,
>>
>> *Multi-threaded vs single threaded*
>> If we are going to have multi-threaded job submission the implementation
>> should work on handling race conditions. Essentially JobSubmitter should be
>> able to "lock" an experiment request before continuing processing that
>> request so that other JobSubmitters accessing the experiment requests a the
>> same time would skip it.
>>
>
> +1. These are implementation details.
>
>
>>
>> *Orchestrator service*
>> We might want to think of the possibility in future where we will be
>> having multiple deployments of an Airavata service. This could particularly
>> be true for SciGaP. We may have to think how some of the internal data
>> structures/SPIs should be updated to accomodate such requirements in future.
>>
>
> +1.
>
>
>>
>> *Orchestrator Component configurations*
>> I see alot of places where the orchestrator can have configurations. I
>> think its too early finalize them, but I think we can start refactoring
>> them out perhaps to the airavata-server.properties. I'm also seeing the
>> orchestrator is now hardcoded to use default/admin gateway and username. I
>> think it should come from the request itself.
>>
>
> +1. But in overall we may need to change the way we handle configurations
> within Airavata. Currently we have multiple configuration files and
> multiple places where we read configurations. IMO we should have a separate
> module to handle configurations. Only this module should be aware how to
> intepret configurations in the file and provide a component interface to
> access those configuration values.
>
+1. We tried this once with "ServerSettings" and "ApplicationSettings", but
apparently more configuration files have spawned again. So far, however,
they seem to be localized to their own components.

>
>
>>
>> *Visibility of API functions*
>> I think initialize(), shutdown() and startJobSubmitter() functions should
>> not be part of the API because I don't see a scenario where the gateway
>> developer would be responsible for using them. They serve a more internal
>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>> long ago (wink) functions that do not concern outside parties should not be
>> used as part of the API.
>>
>
> +1
>
>
>>
>> *Return values of Orchestrator API*
>> IMO unless it is specifically required to do so I think the functions
>> does not necessarily need to return anything other than throw exceptions
>> when needed. For example the launchExperiment can simply return void if all
>> is succesful and return an exception if something fails. Handling issues
>> with a try catch is not only simpler but also the explanations are readily
>> available for the user.
>>
>
> +1. Also try to have different exception for different scenarios. For
> example if persistence (hypothetical) fails, DatabasePersistenceException,
> if validation fails, ValidationFailedException etc ... Then the developer
> who uses the API can catch these different exceptions and act on them
> appropriately.
>
+1. What needs to be understood here is that the exception should be a
gateway-friendly exception, i.e. the top-level exception should not expose
internal details of Airavata, and the exception message should be
self-explanatory enough that the gateway developer is not left scratching
his/her head after reading it. Feedback from Sudhakar some time back was to
provide suggestions in the exception message on how to resolve the issue.
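
As a rough sketch only, such a gateway-friendly exception could carry a
resolution hint next to a plain-language message (the class name and the
suggestion field below are just illustrative, not existing Airavata classes):

public class AiravataClientException extends Exception {

    private final String suggestedResolution;

    public AiravataClientException(String message, String suggestedResolution) {
        super(message);
        this.suggestedResolution = suggestedResolution;
    }

    // Internal stack traces stay in the server logs; the gateway developer
    // only sees a plain-language description plus a hint on how to recover.
    public String getSuggestedResolution() {
        return suggestedResolution;
    }
}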

>
>
>>
>> *Data persisted in registry*
>> ExperimentRequest.getUsername() : I think we should clarify what this
>> username denotes. In current API, in experiment submission we consider two
>> types of users. Submission user (the user who submits the experiment to the
>> Airavata Server - this is inferred by the request itself) and the execution
>> user (the user who corelates to the application executions of the gateway -
>> thus this user can be a different user for different gateway, eg: community
>> user, gateway user).
>> I think we should persist the date/time of the experiment request as
>> well.
>>
> +1
>
>>  Also when retrying of API functions in the case of a failure in an
>> previous attempt there should be a way to not to repeat already performed
>> steps or gracefully roleback and redo those required steps as necessary.
>> While such actions could be transparent to the user sometimes it might make
>> sense to allow user to be notified of success/failure of a retry. However
>> this might mean keeping additional records at the registry level.
>>
>
> In addition we should also have a way of cleaning up unsubmitted
> experiment ids. (But not sure whether you want to address this right now).
> The way I see this is to have a periodic thread which goes through the
> table and clear up experiments which are not submitted for a defined time.
>
+1. Something else we may have to think of later is data archiving
capabilities. We keep running into performance issues as the database grows
with experiment results. Unless we become experts in distributed database
management, we need a better way to manage our DB performance issues.


> BTW, nice review notes, Saminda.
>
> Thanks
> Amila
>
>
>

Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Sun, Jan 19, 2014 at 2:46 PM, Lahiru Gunathilake <gl...@gmail.com>wrote:

>
>
>
> On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara <thejaka.amila@gmail.com
> > wrote:
>
>>
>>
>>
>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>>
>>> Following are few thoughts I had during my review of the component,
>>>
>>> *Multi-threaded vs single threaded*
>>> If we are going to have multi-threaded job submission the implementation
>>> should work on handling race conditions. Essentially JobSubmitter should be
>>> able to "lock" an experiment request before continuing processing that
>>> request so that other JobSubmitters accessing the experiment requests a the
>>> same time would skip it.
>>>
>>
>> +1. These are implementation details.
>>
>>
>>>
>>> *Orchestrator service*
>>> We might want to think of the possibility in future where we will be
>>> having multiple deployments of an Airavata service. This could particularly
>>> be true for SciGaP. We may have to think how some of the internal data
>>> structures/SPIs should be updated to accomodate such requirements in future.
>>>
>>
>> +1.
>>
>>
>>>
>>> *Orchestrator Component configurations*
>>> I see alot of places where the orchestrator can have configurations. I
>>> think its too early finalize them, but I think we can start refactoring
>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>> think it should come from the request itself.
>>>
>>
>> +1. But in overall we may need to change the way we handle configurations
>> within Airavata. Currently we have multiple configuration files and
>> multiple places where we read configurations. IMO we should have a separate
>> module to handle configurations. Only this module should be aware how to
>> intepret configurations in the file and provide a component interface to
>> access those configuration values.
>>
> I like this approach, this will work with deploying components separate
> (if we are planning to do that). We can come to an approach like we have in
> Airavata API (Managers), like we have ServerSetting might composed of
> multiple settings (GFACsettings, OrchestratorSettings). So during isolated
> deployments some Settings objects could be null.
>
We could use the approach we use for RegistrySettings: when a
"registry.properties" file exists, the registry configuration properties are
picked up from that file; otherwise they are picked up from
airavata-server.properties/airavata-client.properties.
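
As a rough sketch of that fallback behaviour (the orchestrator-specific file
name and the loader class below are placeholders, not existing code):

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public final class OrchestratorSettingsLoader {

    public static Properties load() throws IOException {
        // Prefer a component-specific file when it is on the classpath ...
        Properties props = tryLoad("orchestrator.properties");
        if (props == null) {
            // ... otherwise fall back to the shared server-wide file.
            props = tryLoad("airavata-server.properties");
        }
        return props != null ? props : new Properties();
    }

    private static Properties tryLoad(String resourceName) throws IOException {
        try (InputStream in = OrchestratorSettingsLoader.class
                .getClassLoader().getResourceAsStream(resourceName)) {
            if (in == null) {
                return null;
            }
            Properties props = new Properties();
            props.load(in);
            return props;
        }
    }
}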

>
>>
>>>
>>> *Visibility of API functions*
>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>> should not be part of the API because I don't see a scenario where the
>>> gateway developer would be responsible for using them. They serve a more
>>> internal purpose of managing the orchestrator component IMO. As Amila
>>> pointed out so long ago (wink) functions that do not concern outside
>>> parties should not be used as part of the API.
>>>
>>
>> +1
>>
>>
>>>
>>> *Return values of Orchestrator API*
>>> IMO unless it is specifically required to do so I think the functions
>>> does not necessarily need to return anything other than throw exceptions
>>> when needed. For example the launchExperiment can simply return void if all
>>> is succesful and return an exception if something fails. Handling issues
>>> with a try catch is not only simpler but also the explanations are readily
>>> available for the user.
>>>
>>
>> +1. Also try to have different exception for different scenarios. For
>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>> if validation fails, ValidationFailedException etc ... Then the developer
>> who uses the API can catch these different exceptions and act on them
>> appropriately.
>>
>>
>>>
>>> *Data persisted in registry*
>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>> username denotes. In current API, in experiment submission we consider two
>>> types of users. Submission user (the user who submits the experiment to the
>>> Airavata Server - this is inferred by the request itself) and the execution
>>> user (the user who corelates to the application executions of the gateway -
>>> thus this user can be a different user for different gateway, eg: community
>>> user, gateway user).
>>> I think we should persist the date/time of the experiment request as
>>> well.
>>>
>> +1
>>
>>>  Also when retrying of API functions in the case of a failure in an
>>> previous attempt there should be a way to not to repeat already performed
>>> steps or gracefully roleback and redo those required steps as necessary.
>>> While such actions could be transparent to the user sometimes it might make
>>> sense to allow user to be notified of success/failure of a retry. However
>>> this might mean keeping additional records at the registry level.
>>>
>>
>> In addition we should also have a way of cleaning up unsubmitted
>> experiment ids. (But not sure whether you want to address this right now).
>> The way I see this is to have a periodic thread which goes through the
>> table and clear up experiments which are not submitted for a defined time.
>>
> +1
>
> Lahiru
>
>>
>> BTW, nice review notes, Saminda.
>>
>> Thanks
>> Amila
>>
>>
>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
+1 also. We have lots of discussions, photographs of whiteboards, etc., but
we need to capture this in the wiki. Volunteers?


Marlon

On 1/20/14 12:42 AM, Saminda Wijeratne wrote:
> On Sun, Jan 19, 2014 at 8:09 PM, Danushka Menikkumbura <
> danushka.menikkumbura@gmail.com> wrote:
>
>> Hi all,
>>
>> Sorry for jumping in.
>>
>> Can we please have the following just to make things little easy?. Also to
>> make sure everybody is on the same page.
>>
>> 1. High-level architecture diagram of the Orchestration component
>> including how it fits into the current architecture.
>>
>> 2. Threading model and the data model.
>>
> +1. Lahiru since that we now have considerable amount of implementation
> done on the Orchestrator do you think we could create a simple wiki article
> with a summarized facts about the component and with those images you've
> attached on JIRA?
>
>> Cheers,
>> Danushka
>>
>>
>> On Mon, Jan 20, 2014 at 4:16 AM, Lahiru Gunathilake <gl...@gmail.com>wrote:
>>
>>>
>>>
>>> On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara <
>>> thejaka.amila@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>>>>
>>>>> Following are few thoughts I had during my review of the component,
>>>>>
>>>>> *Multi-threaded vs single threaded*
>>>>> If we are going to have multi-threaded job submission the
>>>>> implementation should work on handling race conditions. Essentially
>>>>> JobSubmitter should be able to "lock" an experiment request before
>>>>> continuing processing that request so that other JobSubmitters accessing
>>>>> the experiment requests a the same time would skip it.
>>>>>
>>>> +1. These are implementation details.
>>>>
>>>>
>>>>> *Orchestrator service*
>>>>> We might want to think of the possibility in future where we will be
>>>>> having multiple deployments of an Airavata service. This could particularly
>>>>> be true for SciGaP. We may have to think how some of the internal data
>>>>> structures/SPIs should be updated to accomodate such requirements in future.
>>>>>
>>>> +1.
>>>>
>>>>
>>>>> *Orchestrator Component configurations*
>>>>> I see alot of places where the orchestrator can have configurations. I
>>>>> think its too early finalize them, but I think we can start refactoring
>>>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>>>> think it should come from the request itself.
>>>>>
>>>> +1. But in overall we may need to change the way we handle
>>>> configurations within Airavata. Currently we have multiple configuration
>>>> files and multiple places where we read configurations. IMO we should have
>>>> a separate module to handle configurations. Only this module should be
>>>> aware how to intepret configurations in the file and provide a component
>>>> interface to access those configuration values.
>>>>
>>> I like this approach, this will work with deploying components separate
>>> (if we are planning to do that). We can come to an approach like we have in
>>> Airavata API (Managers), like we have ServerSetting might composed of
>>> multiple settings (GFACsettings, OrchestratorSettings). So during isolated
>>> deployments some Settings objects could be null.
>>>
>>>>
>>>>> *Visibility of API functions*
>>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>>>> should not be part of the API because I don't see a scenario where the
>>>>> gateway developer would be responsible for using them. They serve a more
>>>>> internal purpose of managing the orchestrator component IMO. As Amila
>>>>> pointed out so long ago (wink) functions that do not concern outside
>>>>> parties should not be used as part of the API.
>>>>>
>>>> +1
>>>>
>>>>
>>>>> *Return values of Orchestrator API*
>>>>> IMO unless it is specifically required to do so I think the functions
>>>>> does not necessarily need to return anything other than throw exceptions
>>>>> when needed. For example the launchExperiment can simply return void if all
>>>>> is succesful and return an exception if something fails. Handling issues
>>>>> with a try catch is not only simpler but also the explanations are readily
>>>>> available for the user.
>>>>>
>>>> +1. Also try to have different exception for different scenarios. For
>>>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>>>> if validation fails, ValidationFailedException etc ... Then the developer
>>>> who uses the API can catch these different exceptions and act on them
>>>> appropriately.
>>>>
>>>>
>>>>> *Data persisted in registry*
>>>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>>>> username denotes. In current API, in experiment submission we consider two
>>>>> types of users. Submission user (the user who submits the experiment to the
>>>>> Airavata Server - this is inferred by the request itself) and the execution
>>>>> user (the user who corelates to the application executions of the gateway -
>>>>> thus this user can be a different user for different gateway, eg: community
>>>>> user, gateway user).
>>>>> I think we should persist the date/time of the experiment request as
>>>>> well.
>>>>>
>>>> +1
>>>>
>>>>>  Also when retrying of API functions in the case of a failure in an
>>>>> previous attempt there should be a way to not to repeat already performed
>>>>> steps or gracefully roleback and redo those required steps as necessary.
>>>>> While such actions could be transparent to the user sometimes it might make
>>>>> sense to allow user to be notified of success/failure of a retry. However
>>>>> this might mean keeping additional records at the registry level.
>>>>>
>>>> In addition we should also have a way of cleaning up unsubmitted
>>>> experiment ids. (But not sure whether you want to address this right now).
>>>> The way I see this is to have a periodic thread which goes through the
>>>> table and clear up experiments which are not submitted for a defined time.
>>>>
>>> +1
>>>
>>> Lahiru
>>>
>>>> BTW, nice review notes, Saminda.
>>>>
>>>> Thanks
>>>> Amila
>>>>
>>>>
>>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>


Re: Orchestration Component implementation review

Posted by Saminda Wijeratne <sa...@gmail.com>.
On Sun, Jan 19, 2014 at 8:09 PM, Danushka Menikkumbura <
danushka.menikkumbura@gmail.com> wrote:

> Hi all,
>
> Sorry for jumping in.
>
> Can we please have the following just to make things little easy?. Also to
> make sure everybody is on the same page.
>
> 1. High-level architecture diagram of the Orchestration component
> including how it fits into the current architecture.
>
> 2. Threading model and the data model.
>
+1. Lahiru, since we now have a considerable amount of implementation done
on the Orchestrator, do you think we could create a simple wiki article with
summarized facts about the component, along with the images you've attached
on JIRA?

>
> Cheers,
> Danushka
>
>
> On Mon, Jan 20, 2014 at 4:16 AM, Lahiru Gunathilake <gl...@gmail.com>wrote:
>
>>
>>
>>
>> On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara <
>> thejaka.amila@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>>>
>>>> Following are few thoughts I had during my review of the component,
>>>>
>>>> *Multi-threaded vs single threaded*
>>>> If we are going to have multi-threaded job submission the
>>>> implementation should work on handling race conditions. Essentially
>>>> JobSubmitter should be able to "lock" an experiment request before
>>>> continuing processing that request so that other JobSubmitters accessing
>>>> the experiment requests a the same time would skip it.
>>>>
>>>
>>> +1. These are implementation details.
>>>
>>>
>>>>
>>>> *Orchestrator service*
>>>> We might want to think of the possibility in future where we will be
>>>> having multiple deployments of an Airavata service. This could particularly
>>>> be true for SciGaP. We may have to think how some of the internal data
>>>> structures/SPIs should be updated to accomodate such requirements in future.
>>>>
>>>
>>> +1.
>>>
>>>
>>>>
>>>> *Orchestrator Component configurations*
>>>> I see alot of places where the orchestrator can have configurations. I
>>>> think its too early finalize them, but I think we can start refactoring
>>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>>> think it should come from the request itself.
>>>>
>>>
>>> +1. But in overall we may need to change the way we handle
>>> configurations within Airavata. Currently we have multiple configuration
>>> files and multiple places where we read configurations. IMO we should have
>>> a separate module to handle configurations. Only this module should be
>>> aware how to intepret configurations in the file and provide a component
>>> interface to access those configuration values.
>>>
>> I like this approach, this will work with deploying components separate
>> (if we are planning to do that). We can come to an approach like we have in
>> Airavata API (Managers), like we have ServerSetting might composed of
>> multiple settings (GFACsettings, OrchestratorSettings). So during isolated
>> deployments some Settings objects could be null.
>>
>>>
>>>
>>>>
>>>> *Visibility of API functions*
>>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>>> should not be part of the API because I don't see a scenario where the
>>>> gateway developer would be responsible for using them. They serve a more
>>>> internal purpose of managing the orchestrator component IMO. As Amila
>>>> pointed out so long ago (wink) functions that do not concern outside
>>>> parties should not be used as part of the API.
>>>>
>>>
>>> +1
>>>
>>>
>>>>
>>>> *Return values of Orchestrator API*
>>>> IMO unless it is specifically required to do so I think the functions
>>>> does not necessarily need to return anything other than throw exceptions
>>>> when needed. For example the launchExperiment can simply return void if all
>>>> is succesful and return an exception if something fails. Handling issues
>>>> with a try catch is not only simpler but also the explanations are readily
>>>> available for the user.
>>>>
>>>
>>> +1. Also try to have different exception for different scenarios. For
>>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>>> if validation fails, ValidationFailedException etc ... Then the developer
>>> who uses the API can catch these different exceptions and act on them
>>> appropriately.
>>>
>>>
>>>>
>>>> *Data persisted in registry*
>>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>>> username denotes. In current API, in experiment submission we consider two
>>>> types of users. Submission user (the user who submits the experiment to the
>>>> Airavata Server - this is inferred by the request itself) and the execution
>>>> user (the user who corelates to the application executions of the gateway -
>>>> thus this user can be a different user for different gateway, eg: community
>>>> user, gateway user).
>>>> I think we should persist the date/time of the experiment request as
>>>> well.
>>>>
>>> +1
>>>
>>>>  Also when retrying of API functions in the case of a failure in an
>>>> previous attempt there should be a way to not to repeat already performed
>>>> steps or gracefully roleback and redo those required steps as necessary.
>>>> While such actions could be transparent to the user sometimes it might make
>>>> sense to allow user to be notified of success/failure of a retry. However
>>>> this might mean keeping additional records at the registry level.
>>>>
>>>
>>> In addition we should also have a way of cleaning up unsubmitted
>>> experiment ids. (But not sure whether you want to address this right now).
>>> The way I see this is to have a periodic thread which goes through the
>>> table and clear up experiments which are not submitted for a defined time.
>>>
>> +1
>>
>> Lahiru
>>
>>>
>>> BTW, nice review notes, Saminda.
>>>
>>> Thanks
>>> Amila
>>>
>>>
>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>

Re: Orchestration Component implementation review

Posted by Danushka Menikkumbura <da...@gmail.com>.
Hi all,

Sorry for jumping in.

Can we please have the following, just to make things a little easier and
to make sure everybody is on the same page?

1. A high-level architecture diagram of the Orchestration component,
including how it fits into the current architecture.

2. The threading model and the data model.

Cheers,
Danushka


On Mon, Jan 20, 2014 at 4:16 AM, Lahiru Gunathilake <gl...@gmail.com>wrote:

>
>
>
> On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara <thejaka.amila@gmail.com
> > wrote:
>
>>
>>
>>
>> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>>
>>> Following are few thoughts I had during my review of the component,
>>>
>>> *Multi-threaded vs single threaded*
>>> If we are going to have multi-threaded job submission the implementation
>>> should work on handling race conditions. Essentially JobSubmitter should be
>>> able to "lock" an experiment request before continuing processing that
>>> request so that other JobSubmitters accessing the experiment requests a the
>>> same time would skip it.
>>>
>>
>> +1. These are implementation details.
>>
>>
>>>
>>> *Orchestrator service*
>>> We might want to think of the possibility in future where we will be
>>> having multiple deployments of an Airavata service. This could particularly
>>> be true for SciGaP. We may have to think how some of the internal data
>>> structures/SPIs should be updated to accomodate such requirements in future.
>>>
>>
>> +1.
>>
>>
>>>
>>> *Orchestrator Component configurations*
>>> I see alot of places where the orchestrator can have configurations. I
>>> think its too early finalize them, but I think we can start refactoring
>>> them out perhaps to the airavata-server.properties. I'm also seeing the
>>> orchestrator is now hardcoded to use default/admin gateway and username. I
>>> think it should come from the request itself.
>>>
>>
>> +1. But in overall we may need to change the way we handle configurations
>> within Airavata. Currently we have multiple configuration files and
>> multiple places where we read configurations. IMO we should have a separate
>> module to handle configurations. Only this module should be aware how to
>> intepret configurations in the file and provide a component interface to
>> access those configuration values.
>>
> I like this approach, this will work with deploying components separate
> (if we are planning to do that). We can come to an approach like we have in
> Airavata API (Managers), like we have ServerSetting might composed of
> multiple settings (GFACsettings, OrchestratorSettings). So during isolated
> deployments some Settings objects could be null.
>
>>
>>
>>>
>>> *Visibility of API functions*
>>> I think initialize(), shutdown() and startJobSubmitter() functions
>>> should not be part of the API because I don't see a scenario where the
>>> gateway developer would be responsible for using them. They serve a more
>>> internal purpose of managing the orchestrator component IMO. As Amila
>>> pointed out so long ago (wink) functions that do not concern outside
>>> parties should not be used as part of the API.
>>>
>>
>> +1
>>
>>
>>>
>>> *Return values of Orchestrator API*
>>> IMO unless it is specifically required to do so I think the functions
>>> does not necessarily need to return anything other than throw exceptions
>>> when needed. For example the launchExperiment can simply return void if all
>>> is succesful and return an exception if something fails. Handling issues
>>> with a try catch is not only simpler but also the explanations are readily
>>> available for the user.
>>>
>>
>> +1. Also try to have different exception for different scenarios. For
>> example if persistence (hypothetical) fails, DatabasePersistenceException,
>> if validation fails, ValidationFailedException etc ... Then the developer
>> who uses the API can catch these different exceptions and act on them
>> appropriately.
>>
>>
>>>
>>> *Data persisted in registry*
>>> ExperimentRequest.getUsername() : I think we should clarify what this
>>> username denotes. In current API, in experiment submission we consider two
>>> types of users. Submission user (the user who submits the experiment to the
>>> Airavata Server - this is inferred by the request itself) and the execution
>>> user (the user who corelates to the application executions of the gateway -
>>> thus this user can be a different user for different gateway, eg: community
>>> user, gateway user).
>>> I think we should persist the date/time of the experiment request as
>>> well.
>>>
>> +1
>>
>>>  Also when retrying of API functions in the case of a failure in an
>>> previous attempt there should be a way to not to repeat already performed
>>> steps or gracefully roleback and redo those required steps as necessary.
>>> While such actions could be transparent to the user sometimes it might make
>>> sense to allow user to be notified of success/failure of a retry. However
>>> this might mean keeping additional records at the registry level.
>>>
>>
>> In addition we should also have a way of cleaning up unsubmitted
>> experiment ids. (But not sure whether you want to address this right now).
>> The way I see this is to have a periodic thread which goes through the
>> table and clear up experiments which are not submitted for a defined time.
>>
> +1
>
> Lahiru
>
>>
>> BTW, nice review notes, Saminda.
>>
>> Thanks
>> Amila
>>
>>
>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>

Re: Orchestration Component implementation review

Posted by Lahiru Gunathilake <gl...@gmail.com>.
On Fri, Jan 17, 2014 at 1:29 PM, Amila Jayasekara
<th...@gmail.com>wrote:

>
>
>
> On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:
>
>> Following are few thoughts I had during my review of the component,
>>
>> *Multi-threaded vs single threaded*
>> If we are going to have multi-threaded job submission the implementation
>> should work on handling race conditions. Essentially JobSubmitter should be
>> able to "lock" an experiment request before continuing processing that
>> request so that other JobSubmitters accessing the experiment requests a the
>> same time would skip it.
>>
>
> +1. These are implementation details.
>
>
>>
>> *Orchestrator service*
>> We might want to think of the possibility in future where we will be
>> having multiple deployments of an Airavata service. This could particularly
>> be true for SciGaP. We may have to think how some of the internal data
>> structures/SPIs should be updated to accomodate such requirements in future.
>>
>
> +1.
>
>
>>
>> *Orchestrator Component configurations*
>> I see alot of places where the orchestrator can have configurations. I
>> think its too early finalize them, but I think we can start refactoring
>> them out perhaps to the airavata-server.properties. I'm also seeing the
>> orchestrator is now hardcoded to use default/admin gateway and username. I
>> think it should come from the request itself.
>>
>
> +1. But in overall we may need to change the way we handle configurations
> within Airavata. Currently we have multiple configuration files and
> multiple places where we read configurations. IMO we should have a separate
> module to handle configurations. Only this module should be aware how to
> intepret configurations in the file and provide a component interface to
> access those configuration values.
>
I like this approach; it will also work if we deploy components separately
(if we are planning to do that). We can follow an approach like the one we
have in the Airavata API (Managers): ServerSettings might be composed of
multiple settings objects (GFacSettings, OrchestratorSettings), so during
isolated deployments some Settings objects could be null.
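
Roughly something like the following sketch, where the settings class names
are only illustrative and not the existing ServerSettings implementation:

public class ServerSettings {

    // Stand-ins for the per-component settings objects; real classes may differ.
    public static class GFacSettings { }
    public static class OrchestratorSettings { }

    private final GFacSettings gfacSettings;                 // may be null
    private final OrchestratorSettings orchestratorSettings; // may be null

    public ServerSettings(GFacSettings gfacSettings,
                          OrchestratorSettings orchestratorSettings) {
        this.gfacSettings = gfacSettings;
        this.orchestratorSettings = orchestratorSettings;
    }

    // In an isolated deployment a component's settings block can be absent,
    // so callers check availability instead of assuming it is always there.
    public boolean isOrchestratorDeployed() {
        return orchestratorSettings != null;
    }

    public OrchestratorSettings getOrchestratorSettings() {
        if (orchestratorSettings == null) {
            throw new IllegalStateException(
                    "The orchestrator is not deployed in this server instance");
        }
        return orchestratorSettings;
    }
}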

>
>
>>
>> *Visibility of API functions*
>> I think initialize(), shutdown() and startJobSubmitter() functions should
>> not be part of the API because I don't see a scenario where the gateway
>> developer would be responsible for using them. They serve a more internal
>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>> long ago (wink) functions that do not concern outside parties should not be
>> used as part of the API.
>>
>
> +1
>
>
>>
>> *Return values of Orchestrator API*
>> IMO unless it is specifically required to do so I think the functions
>> does not necessarily need to return anything other than throw exceptions
>> when needed. For example the launchExperiment can simply return void if all
>> is succesful and return an exception if something fails. Handling issues
>> with a try catch is not only simpler but also the explanations are readily
>> available for the user.
>>
>
> +1. Also try to have different exception for different scenarios. For
> example if persistence (hypothetical) fails, DatabasePersistenceException,
> if validation fails, ValidationFailedException etc ... Then the developer
> who uses the API can catch these different exceptions and act on them
> appropriately.
>
>
>>
>> *Data persisted in registry*
>> ExperimentRequest.getUsername() : I think we should clarify what this
>> username denotes. In current API, in experiment submission we consider two
>> types of users. Submission user (the user who submits the experiment to the
>> Airavata Server - this is inferred by the request itself) and the execution
>> user (the user who corelates to the application executions of the gateway -
>> thus this user can be a different user for different gateway, eg: community
>> user, gateway user).
>> I think we should persist the date/time of the experiment request as
>> well.
>>
> +1
>
>> Also when retrying of API functions in the case of a failure in an
>> previous attempt there should be a way to not to repeat already performed
>> steps or gracefully roleback and redo those required steps as necessary.
>> While such actions could be transparent to the user sometimes it might make
>> sense to allow user to be notified of success/failure of a retry. However
>> this might mean keeping additional records at the registry level.
>>
>
> In addition we should also have a way of cleaning up unsubmitted
> experiment ids. (But not sure whether you want to address this right now).
> The way I see this is to have a periodic thread which goes through the
> table and clear up experiments which are not submitted for a defined time.
>
+1

Lahiru

>
> BTW, nice review notes, Saminda.
>
> Thanks
> Amila
>
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Orchestration Component implementation review

Posted by Amila Jayasekara <th...@gmail.com>.
On Fri, Jan 17, 2014 at 10:32 AM, Saminda Wijeratne <sa...@gmail.com>wrote:

> Following are few thoughts I had during my review of the component,
>
> *Multi-threaded vs single threaded*
> If we are going to have multi-threaded job submission the implementation
> should work on handling race conditions. Essentially JobSubmitter should be
> able to "lock" an experiment request before continuing processing that
> request so that other JobSubmitters accessing the experiment requests a the
> same time would skip it.
>

+1. These are implementation details.


>
> *Orchestrator service*
> We might want to think of the possibility in future where we will be
> having multiple deployments of an Airavata service. This could particularly
> be true for SciGaP. We may have to think how some of the internal data
> structures/SPIs should be updated to accomodate such requirements in future.
>

+1.


>
> *Orchestrator Component configurations*
> I see alot of places where the orchestrator can have configurations. I
> think its too early finalize them, but I think we can start refactoring
> them out perhaps to the airavata-server.properties. I'm also seeing the
> orchestrator is now hardcoded to use default/admin gateway and username. I
> think it should come from the request itself.
>

+1. But overall we may need to change the way we handle configurations
within Airavata. Currently we have multiple configuration files and multiple
places where we read configurations. IMO we should have a separate module to
handle configurations. Only this module should be aware of how to interpret
the configuration files, and it should provide a component interface to
access those configuration values.
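
To make the idea concrete, a minimal sketch of such a module could look like
this (the class name and property keys are placeholders, not anything in the
current code base):

import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public final class OrchestratorConfiguration {

    private static final String PROPERTY_FILE = "airavata-server.properties";

    private final Properties properties = new Properties();

    public OrchestratorConfiguration() throws IOException {
        // Load once; callers never touch the file or the parsing logic directly.
        try (InputStream in = getClass().getClassLoader()
                .getResourceAsStream(PROPERTY_FILE)) {
            if (in != null) {
                properties.load(in);
            }
        }
    }

    // Component-facing accessors hide where and how the values are stored.
    public String getGatewayId() {
        return properties.getProperty("orchestrator.gateway.id", "default");
    }

    public int getSubmitterThreadCount() {
        return Integer.parseInt(
                properties.getProperty("orchestrator.submitter.threads", "1"));
    }
}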


>
> *Visibility of API functions*
> I think initialize(), shutdown() and startJobSubmitter() functions should
> not be part of the API because I don't see a scenario where the gateway
> developer would be responsible for using them. They serve a more internal
> purpose of managing the orchestrator component IMO. As Amila pointed out so
> long ago (wink) functions that do not concern outside parties should not be
> used as part of the API.
>

+1


>
> *Return values of Orchestrator API*
> IMO unless it is specifically required to do so I think the functions does
> not necessarily need to return anything other than throw exceptions when
> needed. For example the launchExperiment can simply return void if all is
> succesful and return an exception if something fails. Handling issues with
> a try catch is not only simpler but also the explanations are readily
> available for the user.
>

+1. Also try to have a different exception for each scenario. For example,
if persistence (hypothetically) fails, throw DatabasePersistenceException;
if validation fails, throw ValidationFailedException; and so on. Then the
developer who uses the API can catch these different exceptions and act on
them appropriately.
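
A rough sketch of what I mean, using the hypothetical exception names above
and an assumed launchExperiment signature rather than the final API:

class ValidationFailedException extends Exception {
    ValidationFailedException(String message) {
        super(message);
    }
}

class DatabasePersistenceException extends Exception {
    DatabasePersistenceException(String message, Throwable cause) {
        super(message, cause);
    }
}

interface Orchestrator {
    // Returns void; failures are reported only through typed exceptions.
    void launchExperiment(String experimentId)
            throws ValidationFailedException, DatabasePersistenceException;
}

class GatewayClient {
    void submit(Orchestrator orchestrator, String experimentId) {
        try {
            orchestrator.launchExperiment(experimentId);
        } catch (ValidationFailedException e) {
            // Ask the user to fix the experiment inputs and resubmit.
            System.err.println("Invalid experiment: " + e.getMessage());
        } catch (DatabasePersistenceException e) {
            // Retry later or alert the gateway administrator.
            System.err.println("Registry problem: " + e.getMessage());
        }
    }
}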


>
> *Data persisted in registry*
> ExperimentRequest.getUsername() : I think we should clarify what this
> username denotes. In current API, in experiment submission we consider two
> types of users. Submission user (the user who submits the experiment to the
> Airavata Server - this is inferred by the request itself) and the execution
> user (the user who corelates to the application executions of the gateway -
> thus this user can be a different user for different gateway, eg: community
> user, gateway user).
> I think we should persist the date/time of the experiment request as well.
>
+1

> Also when retrying of API functions in the case of a failure in an
> previous attempt there should be a way to not to repeat already performed
> steps or gracefully roleback and redo those required steps as necessary.
> While such actions could be transparent to the user sometimes it might make
> sense to allow user to be notified of success/failure of a retry. However
> this might mean keeping additional records at the registry level.
>

In addition, we should also have a way of cleaning up unsubmitted
experiment IDs (though I am not sure whether you want to address this right
now). The way I see it, a periodic thread would go through the table and
clear out experiments that have not been submitted for a defined time.
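
Roughly what I have in mind, assuming a hypothetical registry call for the
delete (the real registry API will likely differ):

import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class UnsubmittedExperimentCleaner {

    // Assumed facade; the real registry interface may look different.
    public interface Registry {
        void removeExperimentsNotSubmittedSince(Instant cutoff);
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Registry registry;
    private final Duration maxAge;

    public UnsubmittedExperimentCleaner(Registry registry, Duration maxAge) {
        this.registry = registry;
        this.maxAge = maxAge;
    }

    public void start() {
        // Sweep the table periodically and drop experiments that were created
        // but never submitted within the configured window.
        scheduler.scheduleAtFixedRate(
                () -> registry.removeExperimentsNotSubmittedSince(
                        Instant.now().minus(maxAge)),
                1, 60, TimeUnit.MINUTES);
    }

    public void stop() {
        scheduler.shutdown();
    }
}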

BTW, nice review notes, Saminda.

Thanks
Amila

Re: Orchestration Component implementation review

Posted by Raminder Singh <ra...@gmail.com>.
+1 for returning a JobRequest object with a pre-populated experimentID and
other details. I can extend this object for header data as well. That way we
can make sure the user is setting the right information.

We have discussed two approaches to a multi-threaded orchestrator:
pull-based (the request is first saved to the DB and a thread polls to run
the jobs) and on-demand (the request is served right away and we update the
status table to help with job management, e.g. recovery). The multi-threaded
implementation needs to evolve, and I agree with Saminda about the
improvements.
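
For the pull-based option, a rough sketch of how a JobSubmitter could claim
a request so that other submitters skip it (the table, column and class
names below are only assumptions):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PullBasedJobSubmitter {

    private final Connection connection;

    public PullBasedJobSubmitter(Connection connection) {
        this.connection = connection;
    }

    // Tries to claim one queued experiment request for this submitter. The
    // conditional UPDATE acts as the "lock": if another JobSubmitter already
    // claimed the row, zero rows are updated and this submitter skips it.
    public boolean tryClaim(String experimentId, String submitterId) throws SQLException {
        String sql = "UPDATE EXPERIMENT_REQUEST "
                + "SET STATUS = 'SUBMITTING', CLAIMED_BY = ? "
                + "WHERE EXPERIMENT_ID = ? AND STATUS = 'QUEUED'";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, submitterId);
            statement.setString(2, experimentId);
            return statement.executeUpdate() == 1;
        }
    }
}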

Thanks
Raminder 

On Jan 17, 2014, at 11:42 AM, Marlon Pierce <ma...@iu.edu> wrote:

> I have a little comment on the API.  The two step process that we came
> up with requires the user to first call createExperiment to get an
> experiment ID and then call launchExperiment(JobRequest jobRequest). 
> The jobRequest object should include the experimentID returned by
> createExperiment() but we have no way of enforcing this.
> 
> I think this will be confusing to a developer and may introduce other
> problems.  How about having createExperiment() return a JobRequest
> object with default values, including the correct experimentID?  The
> client code can update these as needed to override defaults and then
> send back to the orchestrator through launchExperiment().
> 
> 
> Marlon
> 
> On 1/17/14 10:32 AM, Saminda Wijeratne wrote:
>> Following are few thoughts I had during my review of the component,
>> 
>> *Multi-threaded vs single threaded*
>> If we are going to have multi-threaded job submission the implementation
>> should work on handling race conditions. Essentially JobSubmitter should be
>> able to "lock" an experiment request before continuing processing that
>> request so that other JobSubmitters accessing the experiment requests a the
>> same time would skip it.
>> 
>> *Orchestrator service*
>> We might want to think of the possibility in future where we will be having
>> multiple deployments of an Airavata service. This could particularly be
>> true for SciGaP. We may have to think how some of the internal data
>> structures/SPIs should be updated to accomodate such requirements in future.
>> 
>> *Orchestrator Component configurations*
>> I see alot of places where the orchestrator can have configurations. I
>> think its too early finalize them, but I think we can start refactoring
>> them out perhaps to the airavata-server.properties. I'm also seeing the
>> orchestrator is now hardcoded to use default/admin gateway and username. I
>> think it should come from the request itself.
>> 
>> *Visibility of API functions*
>> I think initialize(), shutdown() and startJobSubmitter() functions should
>> not be part of the API because I don't see a scenario where the gateway
>> developer would be responsible for using them. They serve a more internal
>> purpose of managing the orchestrator component IMO. As Amila pointed out so
>> long ago (wink) functions that do not concern outside parties should not be
>> used as part of the API.
>> 
>> *Return values of Orchestrator API*
>> IMO unless it is specifically required to do so I think the functions does
>> not necessarily need to return anything other than throw exceptions when
>> needed. For example the launchExperiment can simply return void if all is
>> succesful and return an exception if something fails. Handling issues with
>> a try catch is not only simpler but also the explanations are readily
>> available for the user.
>> 
>> *Data persisted in registry*
>> ExperimentRequest.getUsername() : I think we should clarify what this
>> username denotes. In current API, in experiment submission we consider two
>> types of users. Submission user (the user who submits the experiment to the
>> Airavata Server - this is inferred by the request itself) and the execution
>> user (the user who corelates to the application executions of the gateway -
>> thus this user can be a different user for different gateway, eg: community
>> user, gateway user).
>> I think we should persist the date/time of the experiment request as well.
>> Also when retrying of API functions in the case of a failure in an previous
>> attempt there should be a way to not to repeat already performed steps or
>> gracefully roleback and redo those required steps as necessary. While such
>> actions could be transparent to the user sometimes it might make sense to
>> allow user to be notified of success/failure of a retry. However this might
>> mean keeping additional records at the registry level.
>> 
> 


Re: Orchestration Component implementation review

Posted by Marlon Pierce <ma...@iu.edu>.
I have a little comment on the API. The two-step process that we came up
with requires the user to first call createExperiment to get an experiment
ID and then call launchExperiment(JobRequest jobRequest). The jobRequest
object should include the experimentID returned by createExperiment(), but
we have no way of enforcing this.

I think this will be confusing to a developer and may introduce other
problems. How about having createExperiment() return a JobRequest object
with default values, including the correct experimentID? The client code can
update these as needed to override the defaults and then send the object
back to the orchestrator through launchExperiment().
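
A sketch of the shape I am suggesting; the field and method names are
assumptions rather than a committed design:

class JobRequest {
    private final String experimentId;   // filled in by the orchestrator
    private String applicationName;      // defaulted, overridable by the gateway

    JobRequest(String experimentId) {
        this.experimentId = experimentId;
    }

    String getExperimentId() {
        return experimentId;
    }

    String getApplicationName() {
        return applicationName;
    }

    void setApplicationName(String applicationName) {
        this.applicationName = applicationName;
    }
}

class OrchestratorException extends Exception {
    OrchestratorException(String message) {
        super(message);
    }
}

interface Orchestrator {
    // Step 1: create the record and hand back a JobRequest that already
    // carries the generated experiment ID plus sensible defaults.
    JobRequest createExperiment(String executionUser, String gatewayId);

    // Step 2: the gateway overrides the defaults it cares about and sends the
    // same object back, so the experiment ID cannot be forgotten or mistyped.
    void launchExperiment(JobRequest request) throws OrchestratorException;
}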


Marlon

On 1/17/14 10:32 AM, Saminda Wijeratne wrote:
> Following are few thoughts I had during my review of the component,
>
> *Multi-threaded vs single threaded*
> If we are going to have multi-threaded job submission the implementation
> should work on handling race conditions. Essentially JobSubmitter should be
> able to "lock" an experiment request before continuing processing that
> request so that other JobSubmitters accessing the experiment requests a the
> same time would skip it.
>
> *Orchestrator service*
> We might want to think of the possibility in future where we will be having
> multiple deployments of an Airavata service. This could particularly be
> true for SciGaP. We may have to think how some of the internal data
> structures/SPIs should be updated to accomodate such requirements in future.
>
> *Orchestrator Component configurations*
> I see alot of places where the orchestrator can have configurations. I
> think its too early finalize them, but I think we can start refactoring
> them out perhaps to the airavata-server.properties. I'm also seeing the
> orchestrator is now hardcoded to use default/admin gateway and username. I
> think it should come from the request itself.
>
> *Visibility of API functions*
> I think initialize(), shutdown() and startJobSubmitter() functions should
> not be part of the API because I don't see a scenario where the gateway
> developer would be responsible for using them. They serve a more internal
> purpose of managing the orchestrator component IMO. As Amila pointed out so
> long ago (wink) functions that do not concern outside parties should not be
> used as part of the API.
>
> *Return values of Orchestrator API*
> IMO unless it is specifically required to do so I think the functions does
> not necessarily need to return anything other than throw exceptions when
> needed. For example the launchExperiment can simply return void if all is
> succesful and return an exception if something fails. Handling issues with
> a try catch is not only simpler but also the explanations are readily
> available for the user.
>
> *Data persisted in registry*
> ExperimentRequest.getUsername() : I think we should clarify what this
> username denotes. In current API, in experiment submission we consider two
> types of users. Submission user (the user who submits the experiment to the
> Airavata Server - this is inferred by the request itself) and the execution
> user (the user who corelates to the application executions of the gateway -
> thus this user can be a different user for different gateway, eg: community
> user, gateway user).
> I think we should persist the date/time of the experiment request as well.
> Also when retrying of API functions in the case of a failure in an previous
> attempt there should be a way to not to repeat already performed steps or
> gracefully roleback and redo those required steps as necessary. While such
> actions could be transparent to the user sometimes it might make sense to
> allow user to be notified of success/failure of a retry. However this might
> mean keeping additional records at the registry level.
>