Posted to dev@airavata.apache.org by Suresh Marru <sm...@apache.org> on 2012/05/13 15:49:36 UTC

[DISCUSS] Enhance Airavata support for CLI's

Hi All,

I am trying to revisit Airavata's support for the command line options we pass to applications. Airavata's goal is to shield end users from application execution details, but application service providers need the flexibility to configure all possible application options.

Terms like arguments, parameters, and attributes get ambiguous. They differ by definition, but in practice they are often used interchangeably. For Airavata, we should avoid confusion between what is exposed in the WSDLs and what is passed to the application. This matches the semantics as well: for instance, an argument is an instance of a parameter. This discussion is about what Airavata passes to command line applications. I am not suggesting any changes to the WSDLs and schemas, which use XML definitions. For applications, I suggest we use the terminology of the POSIX standard definitions [1]. I also propose that we try to follow the utility syntax guidelines [2]. If an application does not follow these guidelines, we suggest wrapping it in a shell script so we can pass arguments and flags conforming to standard practices.

"Application" refers to the commands Airavata executes on computational resources.

Working directory: Airavata should insist on executing each invocation in a unique working directory. Some applications change to a static directory, and if output and log file names are not unique, one execution can overwrite another's files and produce unintended outputs. Also, avoid writing to home directories and source directories. This can have side effects, and a runaway log file might fill the disk and block further use of that account.
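
To make the working-directory point concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the function name make_working_dir, the scratch root, and the naming scheme are hypothetical and not part of Airavata.

```python
import tempfile
from pathlib import Path


def make_working_dir(scratch_root: Path, app_name: str) -> Path:
    """Create and return a fresh directory for a single invocation.

    tempfile.mkdtemp picks a name that does not exist yet, so two
    invocations of the same application never share output or log files.
    """
    # Keep job data under a scratch area, not under home or source directories.
    scratch_root.mkdir(parents=True, exist_ok=True)
    created = tempfile.mkdtemp(prefix=f"{app_name}-", dir=str(scratch_root))
    return Path(created)
```

Calling make_working_dir twice with the same arguments returns two different directories, which is the property we want for concurrent or repeated runs.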

Arguments:
* Airavata should support application arguments and provide a way to mark each one as required or optional.
For optional parameters, the corresponding entries in the resulting WSDL should have minOccurs=0, and Airavata should skip passing that value to the application if it is not specified.

* Airavata *should not* support arguments in which the operands are followed by further commands. These additional commands get forked without Airavata having control over the process ID, so monitoring them and determining the exit status of the series of commands gets tricky. Moreover, the underlying grid job managers do not like treating a chain of commands as one executable. Instead, encourage explicitly specifying the execution chain and the associated I/O.

* Airavata should also support standalone flags (they serve a different purpose than option flags). These flags are normally prefixed with '--' and control the execution of the application, like --verbose, --fast, --use-fft, etc.

* Arguments can be passed to the application on standard input (with the redirection operator), as name-value pairs, or with option flags. Option flags should always use the POSIX standard prefix of '-'.

* If the arguments are preceded by an option flag, they do not need to be ordered. But if the arguments are passed just as values, applications are sensitive to the order in which they are passed. In this case, optional arguments have to be handled carefully, as omitting an argument in the middle shifts the values that follow it and the application will misinterpret them.

* If an argument is a file type and the file is referenced through one of the supported remote protocols (http, ftp, gsiftp, s3), then the file has to be staged first and only the local path passed to the application. The application should be able to consume the full local path; if it needs only the basename, it should handle that internally.

* If an application requires a remote FTP URL as an argument, then it should be specified as a string, in which case Airavata will skip staging that URL and pass it as is to the application.

* Implicit parameters: As much as possible, Airavata should insist on a one-to-one match between the inputs specified in the service description and what is passed to the application. But there will be exceptions, like Fortran applications that use the NAMELIST standard to specify all inputs in a config file and pass only this file to the application. In these cases, some data files still need to be staged to the remote compute server, but their file names are implicitly specified in the application. The application typically looks for these files relative to the working directory or to the input namelist file.
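
The argument rules above could be sketched roughly as follows. This is a hedged illustration in Python, not a description of how Airavata works today: the Arg class, its fields, build_argv, and the stage callback are all hypothetical names, and a flag is assumed to be switched on by the string value "true".

```python
from dataclasses import dataclass
from typing import Callable, List, Optional
from urllib.parse import urlparse

# Remote protocols listed in the proposal as requiring staging.
STAGEABLE_SCHEMES = {"http", "ftp", "gsiftp", "s3"}


@dataclass
class Arg:  # hypothetical descriptor for one application argument
    name: str
    value: Optional[str] = None   # None means the user did not supply it
    option: Optional[str] = None  # e.g. "-i" or "--verbose"; None = bare value
    is_flag: bool = False         # standalone flag, carries no value
    is_file: bool = False         # file-typed inputs are staged, strings are not
    required: bool = True


def build_argv(executable: str, args: List[Arg],
               stage: Callable[[str], str]) -> List[str]:
    """Turn argument descriptors into an argv list, in declaration order."""
    argv = [executable]
    skipped_positional = None
    for arg in args:
        if arg.is_flag:
            if arg.value == "true":  # assumed convention for enabling a flag
                argv.append(arg.option)
            continue
        if arg.value is None:
            if arg.required:
                raise ValueError(f"missing required argument: {arg.name}")
            if arg.option is None:
                skipped_positional = arg.name  # remember the gap
            continue  # optional and unset: pass nothing to the application
        if arg.option is None and skipped_positional is not None:
            # A later bare value would shift into the skipped slot.
            raise ValueError(
                f"{arg.name} follows omitted positional {skipped_positional}")
        value = arg.value
        if arg.is_file and urlparse(value).scheme in STAGEABLE_SCHEMES:
            value = stage(value)  # stage callback returns the local path
        if arg.option is not None:
            argv.append(arg.option)
        argv.append(value)
    return argv
```

Returning a list instead of a single shell string is a deliberate choice in this sketch: a list can be handed to a process launcher without going through a shell, so a value containing ';' or '&&' is passed as plain text and cannot start a chained command.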

Outputs:
* Airavata should support standard output and standard error, and optionally provide a way to specify the names of the stdout and stderr files.
* All outputs that need to be staged out of the compute machine or scratch working directory should be explicitly specified.
* If the output file names are predetermined or specified in a config file, then the names should be specified in the application description. Where output file names are not deterministic, a regular expression or a containing directory should be specified.
* If the application requires the output file name to be passed on the command line, like -out output.txt, then Airavata should provide support for these output flags.
* Airavata should support outputs that are optionally produced. If an optional output is not generated but the application exits with exit code 0, then the execution should be marked as a success. (A separate discussion on application execution success criteria is needed.)
* A default output data directory should be created on the remote compute resource. The application description should be able to specify an overriding name for this directory.
* Airavata should support applications and shell script wrappers that print name-value pairs of output content or file paths to standard out.
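
Two of the output rules above lend themselves to a small sketch in Python. Both functions are hypothetical, and the "name=value" line format is an assumption on my part, since the proposal does not fix how the pairs are written to standard out.

```python
import re
from pathlib import Path
from typing import Dict, List


def parse_stdout_outputs(stdout_text: str) -> Dict[str, str]:
    """Collect name=value pairs printed by an application or wrapper.

    Lines without '=' or with an empty name are treated as ordinary
    log output and ignored.
    """
    outputs: Dict[str, str] = {}
    for line in stdout_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            outputs[name.strip()] = value.strip()
    return outputs


def find_outputs(output_dir: Path, pattern: str) -> List[Path]:
    """Return files in output_dir whose whole name matches the regex.

    This covers the case where output file names are not known in
    advance and the description gives a regular expression instead.
    """
    regex = re.compile(pattern)
    return sorted(
        p for p in output_dir.iterdir()
        if p.is_file() and regex.fullmatch(p.name)
    )
```

For example, a wrapper that prints "result_file=/some/path/out.txt" would yield an output named result_file, and a description with the pattern frame_\d+\.dat would pick up every numbered frame file in the output directory.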

Once we discuss this topic, we should raise JIRAs for any missing features and also add these to the website/wiki.

Cheers,
Suresh

[1] - http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
[2] - http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02



Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Marlon Pierce <ma...@iu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Joe--

It doesn't exist yet.  I just realized we need to create one. We'll try using the Airavata wiki, and we should be able to give write access to anyone who wants it.  I'll set the tables up later this afternoon if circumstances cooperate.


Marlon


On 5/15/12 9:07 AM, Joseph Hargitai wrote:
> Marlon,
> 
> where is the use case depository? Is it browsable? We'll have a few of our own to add.
> 
> best,
> joe
>  
> ________________________________________
> From: Marlon Pierce [marpierc@iu.edu]
> Sent: Tuesday, May 15, 2012 8:53 AM
> To: airavata-dev@incubator.apache.org
> Subject: Re: [DISCUSS] Enhance Airavata support for CLI's
> 
> We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.
> 
> 
> Marlon
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPslYEAAoJEEfVXEODPFIDAoAH/3nYHuA6cNOhKxgzPyx5ezeK
LeUo094BDWN08zimvOkkjoFc2NBs+bZn3xxZ2jBr9Gc285Cq9HYJfdBrSKyLGarZ
9rg9+Z0jzwRPDffWY00vB57/UtjPdZZ2o5BVUxbLecbWn5WbwLRYXXFAcegu+rKE
1YFDUfZvmH2En8Hjb20wHeAnjBaYFlouW2uJws4Wn0Wk8/N3kKoAjdewik4iCi8N
gwKEULBxm/RdVc0kKFmdYMCMCvjDnmFGyJThn9tnPANaMZe40bwMe6tZo8RKqT+p
J1/gr66NokXsfo7uGoDABWi0LU3ywjIhelTpH5NtJm87ZSbscT29aZEW6S2oxyo=
=mewy
-----END PGP SIGNATURE-----

RE: [DISCUSS] Enhance Airavata support for CLI's

Posted by Joseph Hargitai <jo...@einstein.yu.edu>.
Marlon,

where is the use case depository? Is it browsable? We'll have a few of our own to add.

best,
joe
 
________________________________________
From: Marlon Pierce [marpierc@iu.edu]
Sent: Tuesday, May 15, 2012 8:53 AM
To: airavata-dev@incubator.apache.org
Subject: Re: [DISCUSS] Enhance Airavata support for CLI's

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.


Marlon


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPslG7AAoJEEfVXEODPFID6mEH/3rHydPBk/TV4xHthrAFG9DV
mc2FrbWPbdz0ofArPHAkpQm+3cQo/Q8FuyWONY9Rn5HetIG4huUnbGGC5Hc6lQpg
Bc+jyaPgmFVLpO2dGNrZYm5TZF0CL/dSlyUKAa4G3FCrMTZUzUP+Cn0N3n7cnyfM
COFpBNiT6Auh3q121Mve02cqZuzEyUbc6r+T2dz7Y5GeYsQIeGMDzmfQEZ4fS9Ps
2I3kEoPz2cJgPKaFDBDemZaG5oyrsvBpCkTI0s93i5HCJ/ltQGz805H4S4AfQNLk
UpGR5xOCBXiiCttl56UNuTW/l78j0ETIP6DGtbn1wa23L4fw/qOSIv95MUwIXos=
=VZUy
-----END PGP SIGNATURE-----



Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Marlon Pierce <ma...@iu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

+1 for using the wiki instead.

On 5/15/12 9:06 AM, Suresh Marru wrote:
> On May 15, 2012, at 8:55 AM, Marlon Pierce wrote:
> 
> I'm thinking of a google spreadsheet....
> 
>> +1, this is a good idea for mapping requirements. Maybe we can use the tables in the wiki.
> 
>> Suresh
> 
> On 5/15/12 8:53 AM, Marlon Pierce wrote:
>>>> We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.
>>>>
>>>>
>>>> Marlon
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPslUEAAoJEEfVXEODPFIDRsgH/1OGGpN3DcT6Ellfw8HHDHwW
ZbCTMl48qOgcODyx6bOLFiQuj5tRsPSQtXgSeUTnirbHuuNjLhB9ptRuQ3Lxbf10
xoMdisfS7g9j7hqcTvyJCZGeTNMtkbjVayuLqF+u+EKajVGCTEoUiKbkFoZD8iYK
YHTS59oOudcbBmkCXVqgkFrha++VDEyJ+u9j779Mauvuhd13vo/RtSNQHqWfvEtc
THgFAnwN/6dBXnfZSF9aDGqPky0mSEUFCty3tZEnmT/q//yHLH6pYIocBAgXmXQc
hkqedWtZ+4atfan0YaYR7wfJ44FfIYYnoY0rzoDRLMPt5hj0zRBLG1gQImoBH10=
=uLEl
-----END PGP SIGNATURE-----

Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Suresh Marru <sm...@apache.org>.
On May 15, 2012, at 8:55 AM, Marlon Pierce wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> I'm thinking of a google spreadsheet....

+1, this is a good idea for mapping requirements. Maybe we can use the tables in the wiki.

Suresh

> On 5/15/12 8:53 AM, Marlon Pierce wrote:
>> We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.
>> 
>> 
>> Marlon
>> 
>> 
>> On 5/13/12 9:49 AM, Suresh Marru wrote:
>>> Hi All,
>> 
>>> I am trying to revisit the Airavata support for all command line options we pass to applications. Airavata's goal is to make end users oblivious to any application execution details, but application service providers need flexibility to configure all possible application options. 
>> 
>>> Some terminology like arguments vs parameters vs attributes get ambiguous. They differ by definition but in practice they are often used interchangeably. For Airavata, we should avoid a confusion between whats exposed in wsdl's vs whats passed to application. This matches the semantics as well, for instance, an argument is an instance of parameter. This discussion is about what Airavata passes to the command line applications. I am not suggesting any changes to wsdl's and schemas which use xml definitions. For applications I am suggesting to use the terminology per POSIX standard definitions [1]. I also propose that we should try and follow the utility syntax guidelines [2]. If an application does not follow these guidelines, we suggest it be wrapped by a shell script so we can pass arguments and flags confirming to standard practices.
>> 
>>> Application refers to the commands airavata executes on computational resources.
>> 
>>> Working directory. Airavata should insist on executing each invocation in a unique working directory. Some applications try and change to a static directory, but if proper uniqueness is not followed for output and log files, we risk overwriting executions producing unintended outputs. Also, avoid writing to home directories and source directories. This might have side effects and a overrun log file might fill the disk space and freeze further usage of that account.  
>> 
>>> Arguments: 
>>> * Airavata should support application arguments and provide a way to specify both required and optional ones.
>>> In the case of optional parameters, the resulting WSDL attributes should have minOccurs=0, and Airavata should skip passing that value to the application if it is not specified.
>> 
>>> * Airavata *should not* support arguments with operands followed by commands. These additional commands get forked without control over the process ID, and monitoring the exit status of such a series of commands gets tricky. Moreover, the underlying grid job managers do not like treating a chain of commands as one executable. Rather, encourage explicitly specifying the execution chain and associated I/O.
>> 
>>> * Airavata should also support flag-only arguments (they serve a different purpose than option flags). Flags are normally prefixed with '--'. These flags control the execution of the application, like --verbose, --fast, --use-fft, etc.
>> 
>>> * Arguments can be passed to the application as standard input (with the redirection operator), as name-value pairs, or with option flags. Option flags should always be prefixed with the POSIX-standard '-'.
>> 
>>> * If the arguments are preceded by an option flag, they do not need to be ordered. But if the arguments are passed just as values, applications are sensitive to the order in which they are passed. In this case, optional arguments have to be handled carefully, as a missing argument in between will mislead the application.
>> 
>>> * If an argument is a file type and the file is referenced through one of the supported remote protocols (http, ftp, gsiftp, s3), then the file has to be staged first and only the local path passed to the application. The application should be able to consume the full local path, and if only the basename is required, it should handle that internally.
>> 
>>> * If an application requires a remote ftp URL as an argument, then it should be specified as a string, in which case Airavata will skip staging that URL and will pass it as is to the application.
>> 
>>> * Implicit parameters: As much as possible, Airavata should insist on a one-to-one match between the inputs specified in the service description and what is passed to the application. But there will be exceptions, like Fortran applications that use the NAMELIST standard to specify all inputs in a config file and pass only this file to the application. In these cases, the application still needs some data files staged to the remote compute server, but these file names are implicitly specified in the application. The application typically looks for these files relative to the working directory or to the input namelist file.
>> 
>>> Outputs:
>>> * Airavata should support standard output and standard error, and optionally provide a way to specify the names of the stdout and stderr files.
>>> * All outputs that need to be staged out of the compute machine or scratch working directory should be explicitly specified.
>>> * If the output file name(s) are predetermined or specified in a config file, then the name should be specified in the application description. In cases where output file names are not deterministic, a regular expression or a containing directory should be specified.
>>> * If the application requires the output file name to be passed on the command line, like -out output.txt, then Airavata should provide support for these output flags.
>>> * Airavata should support outputs that are optionally produced. If an optional output is not generated but the application exits with exit code 0, then the execution should be marked as a success. (A separate discussion on application execution success criteria is needed.)
>>> * A default output data directory should be created on the remote compute resource. The application description should be able to specify an overriding name for this directory.
>>> * Airavata should support applications/shell script wrappers that print name-value pairs of output content or file paths to standard output.
>> 
>>> Once we discuss this topic, we should raise JIRAs for any missing features and also add these to the website/wiki.
>> 
>>> Cheers,
>>> Suresh
>> 
>>> [1] - http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
>>> [2] - http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02
>> 
>> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iQEcBAEBAgAGBQJPslI0AAoJEEfVXEODPFIDOrAIAKI6yUXoWTVx6vrX2xCZlTta
> vRxQS/Kpc7OVtO6IFJKtpODfrQ10GCgynweewt8rF7c8JztFbLWqNmSCFiYnRdrc
> B+ZAg5EZRDwW+bs9OO0FhFhp/DkcJKE97o0Kx0YRDPsAQj+SS9OCpzneFR/6mbQ8
> 3AI2x/byBIE4jwaBUZjH31hmXzS1M7ibYR5J10gBqO2ONgeTShipWgbR/QyjebFs
> /g3dtfaVwiaB99qRa6bVf3dyAB2wIWMtwRvtoAzqQTdYHMnkiE+azF2/02tfRXiu
> LIizzd/ErW3XVHVpUbALdu4Grue3YeaOUmG69yjq8Ipzjk9i+BVA22dvaWebKb0=
> =4Bss
> -----END PGP SIGNATURE-----


Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Lahiru Gunathilake <gl...@gmail.com>.
Hi Suresh,

Please see my inline comments.
On Tue, May 15, 2012 at 8:55 AM, Marlon Pierce <ma...@iu.edu> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> I'm thinking of a google spreadsheet....
>
> On 5/15/12 8:53 AM, Marlon Pierce wrote:
> > We have a large collection of use cases, so it would be a good exercise
> to apply the email below to specific applications.
> >
> >
> > Marlon
> >
> >
> > On 5/13/12 9:49 AM, Suresh Marru wrote:
> >> Hi All,
> >
> >> I am trying to revisit the Airavata support for all command line
> options we pass to applications. Airavata's goal is to make end users
> oblivious to any application execution details, but application service
> providers need flexibility to configure all possible application options.
> >
> >> Some terminology, like arguments vs parameters vs attributes, gets
> ambiguous. They differ by definition, but in practice they are often used
> interchangeably. For Airavata, we should avoid confusion between what's
> exposed in the WSDLs and what's passed to the application. This matches the
> semantics as well; for instance, an argument is an instance of a parameter.
> This discussion is about what Airavata passes to the command line
> applications. I am not suggesting any changes to the WSDLs and schemas,
> which use XML definitions. For applications, I am suggesting we use the
> terminology per the POSIX standard definitions [1]. I also propose that we
> should try to follow the utility syntax guidelines [2]. If an application
> does not follow these guidelines, we suggest it be wrapped by a shell
> script so we can pass arguments and flags conforming to standard practices.
> >
> >> Application refers to the commands airavata executes on computational
> resources.
> >
> >> Working directory: Airavata should insist on executing each invocation
> in a unique working directory. Some applications try to change to a static
> directory, but if output and log files are not kept unique, we risk one
> execution overwriting another and producing unintended outputs. Also,
> avoid writing to home directories and source directories. This might have
> side effects, and an overrun log file might fill the disk space and freeze
> further usage of that account.
> >
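
To make the working-directory rule concrete, here is a minimal Python sketch. The function name make_working_dir and the scratch_root argument are assumptions for illustration, not existing Airavata code; it only shows one way to hand every invocation a directory that no other invocation shares.

```python
import tempfile
from pathlib import Path


def make_working_dir(scratch_root: str, app_name: str) -> Path:
    """Create and return a fresh directory for a single invocation."""
    root = Path(scratch_root)
    root.mkdir(parents=True, exist_ok=True)
    # mkdtemp picks a name that does not exist yet and creates the directory,
    # so two invocations of the same application never share outputs or logs.
    return Path(tempfile.mkdtemp(prefix=f"{app_name}_", dir=str(root)))
```

Pointing scratch_root at a scratch file system rather than a home or source directory would also cover the second half of the recommendation.
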
> >> Arguments:
> >> * Airavata should support application arguments and provide a way to
> specify both required and optional ones.
> >> In the case of optional parameters, the resulting WSDL attributes
> should have minOccurs=0, and Airavata should skip passing that value to the
> application if it is not specified.
> >
> >> * Airavata *should not* support arguments with operands followed by
> commands. These additional commands get forked without control over the
> process ID, and monitoring the exit status of such a series of commands
> gets tricky. Moreover, the underlying grid job managers do not like
> treating a chain of commands as one executable. Rather, encourage
> explicitly specifying the execution chain and associated I/O.
> >
> >> * Airavata should also support flag-only arguments (they serve a
> different purpose than option flags). Flags are normally prefixed with
> '--'. These flags control the execution of the application, like
> --verbose, --fast, --use-fft, etc.
> >
> >> * Arguments can be passed to the application as standard input (with
> the redirection operator), as name-value pairs, or with option flags.
> Option flags should always be prefixed with the POSIX-standard '-'.
> >
> >> * If the arguments are preceded by an option flag, they do not need to
> be ordered. But if the arguments are passed just as values, applications
> are sensitive to the order in which they are passed. In this case,
> optional arguments have to be handled carefully, as a missing argument in
> between will mislead the application.
>
+1 for these features.
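
The argument rules quoted above could be sketched roughly as follows in Python. ArgSpec and build_command are hypothetical names, not part of Airavata; the sketch assumes each argument is described by a name, an optional option flag, and a required marker. Unspecified optional arguments are skipped, option-flag arguments are emitted before positional values, and an omitted optional positional followed by a supplied one is rejected, since the application would misread the order.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ArgSpec:
    name: str                   # key used to look up the supplied value
    flag: Optional[str] = None  # option flag such as "-n"; None means positional
    required: bool = True


def build_command(executable: str,
                  specs: Sequence[ArgSpec],
                  values: dict,
                  switches: Sequence[str] = ()) -> list:
    """Return an argv list: executable, flag-only switches, options, operands."""
    options, operands = [], []
    skipped = None  # name of an optional positional that was left out
    for spec in specs:
        value = values.get(spec.name)
        if value is None:
            if spec.required:
                raise ValueError(f"missing required argument: {spec.name}")
            if spec.flag is None:
                skipped = spec.name
            continue  # unspecified optional arguments are simply not passed
        if spec.flag is not None:
            options += [spec.flag, str(value)]  # order does not matter here
        elif skipped is not None:
            raise ValueError(
                f"positional '{spec.name}' follows omitted positional '{skipped}'")
        else:
            operands.append(str(value))
    return [executable, *switches, *options, *operands]
```

Returning a list rather than a single shell string is a deliberate choice in this sketch: if the list is executed without a shell, characters such as ';' or '&&' inside a value are not interpreted as command chaining.
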

> >
> >> * If an argument is a file type and the file is referenced through one
> of the supported remote protocols (http, ftp, gsiftp, s3), then the file
> has to be staged first and only the local path passed to the application.
> The application should be able to consume the full local path, and if only
> the basename is required, it should handle that internally.
>
I think we already support this, except that Airavata does not support S3
file transfer.

> >
> >> * If an application requires a remote ftp URL as an argument, then it
> should be specified as a string, in which case Airavata will skip staging
> that URL and will pass it as is to the application.
>
+1
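
As a rough illustration of the two rules above (stage file-typed inputs, pass string-typed URLs through untouched), here is a small Python sketch. The names resolve_input, STAGED_SCHEMES, and the "file"/"string" type labels are assumptions made for this example; the scheme list is simply the one from the proposal, and whether each scheme is actually supported is a separate question.

```python
import posixpath
from urllib.parse import urlparse

# Schemes named in the proposal; treated here as an assumption, not a feature list.
STAGED_SCHEMES = {"http", "ftp", "gsiftp", "s3"}


def resolve_input(value: str, declared_type: str, workdir: str):
    """Return (argument_to_pass, url_to_stage_or_None) for one input."""
    if declared_type == "file":
        parsed = urlparse(value)
        if parsed.scheme in STAGED_SCHEMES:
            # The file is staged into the working directory and the
            # application only ever sees the resulting local path.
            local_path = posixpath.join(workdir, posixpath.basename(parsed.path))
            return local_path, value
    # String-typed inputs (including URLs) and local paths are passed as is.
    return value, None
```

The actual transfer is left out on purpose; the sketch only decides what the application receives and what, if anything, needs staging.
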

> >
> >> * Implicit parameters: As much as possible, Airavata should insist on
> a one-to-one match between the inputs specified in the service description
> and what is passed to the application. But there will be exceptions, like
> Fortran applications that use the NAMELIST standard to specify all inputs
> in a config file and pass only this file to the application. In these
> cases, the application still needs some data files staged to the remote
> compute server, but these file names are implicitly specified in the
> application. The application typically looks for these files relative to
> the working directory or to the input namelist file.
> >
> >> Outputs:
> >> * Airavata should support standard output and standard error, and
> optionally provide a way to specify the names of the stdout and stderr
> files.
> >> * All outputs that need to be staged out of the compute machine or
> scratch working directory should be explicitly specified.
> >> * If the output file name(s) are predetermined or specified in a
> config file, then the name should be specified in the application
> description. In cases where output file names are not deterministic, a
> regular expression or a containing directory should be specified.
> >> * If the application requires the output file name to be passed on the
> command line, like -out output.txt, then Airavata should provide support
> for these output flags.
>
I don't think this is a valid requirement. Have you seen any applications
that provide outputs like this?

> >> * Airavata should support outputs that are optionally produced. If an
> optional output is not generated but the application exits with exit code
> 0, then the execution should be marked as a success. (A separate
> discussion on application execution success criteria is needed.)
>
If we are going to support this, how are we going to detect an erroneous
situation with optional outputs? Right now we throw an error if any of the
outputs is empty.
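
One possible answer, sketched in Python under stated assumptions: each declared output carries a required marker and a file-name pattern, and a run counts as failed only when the exit code is non-zero or a required output is missing. OutputSpec and check_outputs are hypothetical names, and this is not a decided success policy, just an illustration of how optional outputs and regular-expression file names could fit together.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass
class OutputSpec:
    pattern: str           # regular expression matched against produced file names
    required: bool = True  # optional outputs may be absent without failing the run


def check_outputs(exit_code: int,
                  produced_files: Iterable[str],
                  specs: Sequence[OutputSpec]):
    """Return (success, patterns of required outputs that were not produced)."""
    produced = list(produced_files)
    missing = []
    for spec in specs:
        found = any(re.fullmatch(spec.pattern, name) for name in produced)
        if not found and spec.required:
            missing.append(spec.pattern)
    # A missing optional output is ignored; a missing required one is an error.
    return exit_code == 0 and not missing, missing
```

With this split, the current behaviour (error on any empty output) would correspond to marking every output as required.
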

> >> * A default output data directory should be created on the remote
> compute resource. The application description should be able to specify an
> overriding name for this directory.
>
We already support this.

> >> * Airavata should support applications/shell script wrappers that
> print name-value pairs of output content or file paths to standard output.
>
Again, I am not clear about this use case or how we are going to support
it. I think this is an unnecessary feature if nobody is going to give output
as name-value pairs.
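
For reference, the use case could look something like the Python sketch below. It assumes a wrapper script prints one "name=value" line per output; that line format and the function name parse_name_value_output are assumptions made for illustration, since the proposal does not fix a syntax.

```python
def parse_name_value_output(stdout_text: str) -> dict:
    """Collect name=value lines from captured standard output."""
    outputs = {}
    for line in stdout_text.splitlines():
        name, sep, value = line.partition("=")
        # Lines without '=' or without a name are ordinary log output; skip them.
        if sep and name.strip():
            outputs[name.strip()] = value.strip()
    return outputs
```

A value can be either content (a number, a status string) or a file path to be staged out; the caller would decide which based on the declared output type.
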

Lahiru

> >
> >> Once we discuss this topic, we should raise JIRAs for any missing
> features and also add these to the website/wiki.
> >
> >> Cheers,
> >> Suresh
> >
> >> [1] -
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
> >> [2] -
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02
> >
> >
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>
> iQEcBAEBAgAGBQJPslI0AAoJEEfVXEODPFIDOrAIAKI6yUXoWTVx6vrX2xCZlTta
> vRxQS/Kpc7OVtO6IFJKtpODfrQ10GCgynweewt8rF7c8JztFbLWqNmSCFiYnRdrc
> B+ZAg5EZRDwW+bs9OO0FhFhp/DkcJKE97o0Kx0YRDPsAQj+SS9OCpzneFR/6mbQ8
> 3AI2x/byBIE4jwaBUZjH31hmXzS1M7ibYR5J10gBqO2ONgeTShipWgbR/QyjebFs
> /g3dtfaVwiaB99qRa6bVf3dyAB2wIWMtwRvtoAzqQTdYHMnkiE+azF2/02tfRXiu
> LIizzd/ErW3XVHVpUbALdu4Grue3YeaOUmG69yjq8Ipzjk9i+BVA22dvaWebKb0=
> =4Bss
> -----END PGP SIGNATURE-----
>



-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Marlon Pierce <ma...@iu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm thinking of a google spreadsheet....

On 5/15/12 8:53 AM, Marlon Pierce wrote:
> We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.
> 
> 
> Marlon
> 
> 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPslI0AAoJEEfVXEODPFIDOrAIAKI6yUXoWTVx6vrX2xCZlTta
vRxQS/Kpc7OVtO6IFJKtpODfrQ10GCgynweewt8rF7c8JztFbLWqNmSCFiYnRdrc
B+ZAg5EZRDwW+bs9OO0FhFhp/DkcJKE97o0Kx0YRDPsAQj+SS9OCpzneFR/6mbQ8
3AI2x/byBIE4jwaBUZjH31hmXzS1M7ibYR5J10gBqO2ONgeTShipWgbR/QyjebFs
/g3dtfaVwiaB99qRa6bVf3dyAB2wIWMtwRvtoAzqQTdYHMnkiE+azF2/02tfRXiu
LIizzd/ErW3XVHVpUbALdu4Grue3YeaOUmG69yjq8Ipzjk9i+BVA22dvaWebKb0=
=4Bss
-----END PGP SIGNATURE-----

Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Marlon Pierce <ma...@iu.edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

We have a large collection of use cases, so it would be a good exercise to apply the email below to specific applications.


Marlon


-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.16 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJPslG7AAoJEEfVXEODPFID6mEH/3rHydPBk/TV4xHthrAFG9DV
mc2FrbWPbdz0ofArPHAkpQm+3cQo/Q8FuyWONY9Rn5HetIG4huUnbGGC5Hc6lQpg
Bc+jyaPgmFVLpO2dGNrZYm5TZF0CL/dSlyUKAa4G3FCrMTZUzUP+Cn0N3n7cnyfM
COFpBNiT6Auh3q121Mve02cqZuzEyUbc6r+T2dz7Y5GeYsQIeGMDzmfQEZ4fS9Ps
2I3kEoPz2cJgPKaFDBDemZaG5oyrsvBpCkTI0s93i5HCJ/ltQGz805H4S4AfQNLk
UpGR5xOCBXiiCttl56UNuTW/l78j0ETIP6DGtbn1wa23L4fw/qOSIv95MUwIXos=
=VZUy
-----END PGP SIGNATURE-----

Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Suresh Marru <sm...@apache.org>.
Hi All,

As we re-organize the data models in preparation for a 1.0 release, we need to make sure Airavata has good CLI support for applications. Please review this thread and add any requirements you may have: http://markmail.org/thread/hd7azhp7w7o7eqyq

Suresh

> On Oct 30, 2013, at 2:00 PM, Suresh Marru <sm...@apache.org> wrote:
> 
> Hi All,
> 
> We let this topic live in the archives for 17 months; it has probably baked enough. How about we revisit this thread, start up application use cases, and turn them into a list of features Airavata should support for command line applications?
> 
> You can browse through the thread at http://markmail.org/thread/hd7azhp7w7o7eqyq
> 
> As we brainstorm, I will start up a wiki document and capture the outcomes, and finally we can review the document and vote on it.
> 
> Suresh
> 
> 


Re: [DISCUSS] Enhance Airavata support for CLI's

Posted by Suresh Marru <sm...@apache.org>.
Hi All,

We let this topic live in the archives for 17 months; it has probably
baked enough. How about we revisit this thread, start up application use
cases, and make it into a list of features Airavata should support for
command line applications?

You can browse through the thread at -
http://markmail.org/thread/hd7azhp7w7o7eqyq

As we brainstorm, I will start up a wiki document and capture the outcomes,
and finally we can review the document and vote on it.

Suresh


On Sun, May 13, 2012 at 9:49 AM, Suresh Marru <sm...@apache.org> wrote:

> Hi All,
>
> I am trying to revisit the Airavata support for all command line options
> we pass to applications. Airavata's goal is to make end users oblivious to
> any application execution details, but application service providers need
> flexibility to configure all possible application options.
>
> Some terminology like arguments vs parameters vs attributes gets ambiguous.
> They differ by definition but in practice they are often used
> interchangeably. For Airavata, we should avoid confusion between what is
> exposed in wsdl's vs what is passed to the application. This matches the
> semantics as well, for instance, an argument is an instance of a parameter.
> This discussion is about what Airavata passes to the command line
> applications. I am not suggesting any changes to wsdl's and schemas which
> use xml definitions. For applications I am suggesting to use the
> terminology per POSIX standard definitions [1]. I also propose that we
> should try and follow the utility syntax guidelines [2]. If an application
> does not follow these guidelines, we suggest it be wrapped by a shell
> script so we can pass arguments and flags conforming to standard practices.
>
> Application refers to the commands airavata executes on computational
> resources.
>
> Working directory. Airavata should insist on executing each invocation in
> a unique working directory. Some applications try and change to a static
> directory, but if proper uniqueness is not followed for output and log
> files, we risk overwriting executions producing unintended outputs. Also,
> avoid writing to home directories and source directories. This might have
> side effects and an overrun log file might fill the disk space and freeze
> further usage of that account.
>
> Arguments:
> * Airavata should support application arguments and provide a way to
> specify both required and optional ones.
> In the case of optional parameters, the resulting wsdl's attributes should
> have minOccurs=0 and airavata should skip passing that value to the
> application (if not specified).
>
> * Airavata *should not* support arguments with operands followed by
> commands. These additional commands get forked without control over the
> process id, and the monitoring and exit status of such a series of commands
> get tricky. Moreover, the underlying grid job managers do not like
> treating a chain of commands as one executable. Rather encourage explicitly
> specifying the execution chain and associated I/O.
>
> * Airavata should also support flags only (they serve a different purpose
> than option flags). Flags are normally prefixed with '--'. These flags
> control the execution of the application, like --verbose, --fast,
> --use-fft, etc.
>
> * Arguments can be passed to the application as standard input (with the
> redirection operator) or as name-value pairs or with option flags. The
> option flags should always be prefixed with the POSIX standard of '-'.
>
> * If the arguments are preceded by an option flag they do not need to be
> ordered. But if the arguments are passed just as values, applications are
> sensitive to the order the arguments are passed. In this case, optional
> arguments have to be carefully handled, as missing an argument in between
> will mislead the application.
>
> * If an argument is a file type, and if the file uses one of the supported
> remote protocols (http, ftp, gsiftp, s3), then the file has to be staged
> first and only the local path passed to the application. The application
> should be able to consume the full local path and, if only the basename is
> required, it should be able to handle it internally.
>
> * If an application requires a remote ftp url as an argument, then it
> should be specified as a string, in which case Airavata will skip staging
> that url and will pass the url as is to the application.
>
> * Implicit Parameters: As much as possible, Airavata should insist on a
> one-to-one match between the inputs specified in the service description
> and what is passed to the application. But there will be exceptions, like
> Fortran applications which use the NAMELIST standard to specify all inputs
> in a config file and pass only this file to the application. In these
> cases, some data files still need to be staged to the remote compute
> server, but these file names are implicitly specified in the application.
> The application typically looks for these files relative to the working
> directory or to the input namelist file.
>
> Outputs:
> * Airavata should support standard outputs and errors and optionally
> provide a way to specify the names of stdout and stderr.
> * All outputs required to be staged out of the compute machine or scratch
> working directory should be explicitly specified.
> * If the output file name(s) are predetermined or specified in a config
> file, then the name should be specified in the application description. In
> cases where output file names are not deterministic, a regular expression
> or a containing directory should be specified.
> * If the application requires the output file name to be passed on the
> command line, like -out output.txt, then Airavata should provide support
> for these output flags.
> * Airavata should support outputs which can be optionally produced. If an
> optional output is not generated but the application exits with exit code
> 0, then the application should be marked as a success. (A different
> discussion on application execution success criteria is needed.)
> * A default output data directory should be created on the remote compute
> resource. The application description should be able to specify an
> overriding name for this directory.
> * Airavata should support applications/shell script wrappers which print
> name-value pairs of output content or file paths to standard out.
>
> Once we discuss this topic, we should raise JIRAs for any missing features
> and also add these on website/wiki.
>
> Cheers,
> Suresh
>
> [1] -
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html
> [2] -
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02
>
>
>
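
As a starting point for the use cases, the argument rules quoted above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: ArgSpec, build_argv and local_path_for are hypothetical names, not existing Airavata code, the actual file transfer is left out, and rejecting a positional value that follows an omitted optional positional is just one possible way of "carefully handling" that case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from urllib.parse import urlparse

# Remote protocols named in the proposal; files using them are staged first.
STAGED_SCHEMES = {"http", "ftp", "gsiftp", "s3"}


@dataclass
class ArgSpec:
    """Hypothetical argument description (not an Airavata class)."""
    name: str
    kind: str = "string"          # "string", "file" or "flag"
    option: Optional[str] = None  # option flag such as "-n"; None means positional
    required: bool = True


def local_path_for(url: str, workdir: str) -> str:
    """Local path a staged file would get; the transfer itself is out of scope."""
    basename = urlparse(url).path.rsplit("/", 1)[-1]
    return f"{workdir}/{basename}"


def build_argv(executable: str, specs: List[ArgSpec],
               values: Dict[str, object], workdir: str) -> List[str]:
    """Build the argument vector in the order the specs are declared."""
    argv = [executable]
    skipped_positional = None
    for spec in specs:
        value = values.get(spec.name)
        if spec.kind == "flag":
            if value:  # flags carry no operand, only presence
                argv.append(f"--{spec.name}")
            continue
        if value is None:
            if spec.required:
                raise ValueError(f"missing required argument {spec.name!r}")
            if spec.option is None:
                skipped_positional = spec.name
            continue  # optional and unspecified: do not pass anything
        if spec.option is None and skipped_positional is not None:
            # A gap between positional values would shift their meaning.
            raise ValueError(
                f"positional {spec.name!r} follows omitted {skipped_positional!r}")
        value = str(value)
        if spec.kind == "file" and urlparse(value).scheme in STAGED_SCHEMES:
            value = local_path_for(value, workdir)  # pass only the local path
        if spec.option is not None:
            argv.extend([spec.option, value])
        else:
            argv.append(value)
    return argv
```

For example, a "file" argument given as gsiftp://host.example.org/data/run1.nc would be passed as the local path under the working directory, while the same kind of url declared as a "string" is passed through untouched.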