
Posted to dev@airavata.apache.org by Lahiru Gunathilake <gl...@gmail.com> on 2012/08/08 18:48:12 UTC

Scratch working directory and static working directory

Hi All,

I have seen these two fields in the Application Description, and I
remember they have been there from the very beginning. I am not quite sure
why we need these two values or what makes them different.

Does anyone know why we have these two parameters?

Lahiru

-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Scratch working directory and static working directory

Posted by Suresh Marru <sm...@apache.org>.

On Aug 8, 2012, at 10:07 AM, Subhra Sarkar <su...@g.clemson.edu> wrote:

> Hi Lahiru,
> 
> Below are my two cents on this -
> 
> *Scratch working directory*: The directory you submit your computational jobs
> from, interactive or otherwise. Because of performance overhead and other issues,
> you're not supposed (or allowed) to submit jobs from the head node's native file
> system (e.g. ext4) in a clustered environment. So you transfer your job
> scripts and input data to the scratch space (running on a different file
> system for back-and-forth data transfer between the compute nodes and
> the distributed file system, e.g. Lustre) before
> submitting jobs.
> 
> *Static working directory*: As the name suggests, it should be static and
> solely under the control of the user. This is the directory where the output/error
> files are stored (parameters you specify in your PBS script).
> 
> So, in most cases, you can set the scratch working directory
> and the static working directory to the same value (e.g. /lustre/scratch/<user
> name>/<working-directory>).
> 
> Hope this helps!

Bravo Subhra! It is great to see a core developer ask a question and a person from the community answer it. Keep it going :)

First, I take the blame for this confusion. I recently re-authored the schema but added no accompanying documentation; I will fix that (reminder to self: AIRAVATA-532). Subhra's reasoning on the need for scratch directories on HPC machines is right. In this context, the scratch and static working directories serve the same purpose: each gives the location for inputs and outputs, and it is the working directory specified in the batch script. The one difference between them is:
The scratch working directory specifies a base directory. For each invocation a unique subdirectory is created, and $scratch+unique-subdir is used as the working directory in the batch script.
The static working directory forces gfac not to create any unique directory and to use one static location for all executions. Some applications need to cd to a specific location before executing the job. But this location carries a risk of overwriting files, since all executions run from a single location. So the static directory should be used sparingly, and only if the application needs it.
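
As an illustration only, the rule described above might be sketched like this in Python. This is not Airavata or gfac code: the function names, the precedence given to the static directory when both values are set, the UUID-based subdirectory name, and the stdout.txt/stderr.txt file names are all assumptions made for the example. Paths are written into the script unquoted, so it assumes they contain no spaces.

```python
import posixpath
import uuid
from typing import Optional


def resolve_working_dir(scratch_dir: Optional[str], static_dir: Optional[str]) -> str:
    """Pick the directory a job runs in (hypothetical helper, not gfac's API)."""
    if static_dir:
        # Static: every execution shares this one location,
        # so later runs can overwrite files from earlier runs.
        return static_dir
    if not scratch_dir:
        raise ValueError("a scratch or a static working directory is required")
    # Scratch: base directory plus a unique subdirectory per invocation.
    return posixpath.join(scratch_dir, uuid.uuid4().hex)


def render_batch_script(working_dir: str, command: str) -> str:
    """Render a minimal PBS-style script that runs `command` from working_dir."""
    lines = [
        "#!/bin/sh",
        # -o and -e are the standard PBS directives for output and error files.
        f"#PBS -o {working_dir}/stdout.txt",
        f"#PBS -e {working_dir}/stderr.txt",
        f"cd {working_dir}",
        command,
    ]
    return "\n".join(lines) + "\n"
```

Calling resolve_working_dir("/lustre/scratch/base", None) twice gives two different subdirectories under the base, while passing a static directory returns that same path every time, which is where the overwrite risk comes from.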

Suresh


Re: Scratch working directory and static working directory

Posted by Subhra Sarkar <su...@g.clemson.edu>.

Hi Lahiru,

Below are my two cents on this -

*Scratch working directory*: The directory you submit your computational jobs
from, interactive or otherwise. Because of performance overhead and other issues,
you're not supposed (or allowed) to submit jobs from the head node's native file
system (e.g. ext4) in a clustered environment. So you transfer your job
scripts and input data to the scratch space (running on a different file
system for back-and-forth data transfer between the compute nodes and
the distributed file system, e.g. Lustre) before
submitting jobs.

*Static working directory*: As the name suggests, it should be static and
solely under the control of the user. This is the directory where the output/error
files are stored (parameters you specify in your PBS script).

So, in most cases, you can set the scratch working directory
and the static working directory to the same value (e.g. /lustre/scratch/<user
name>/<working-directory>).

Hope this helps!

Best regards,
*Subhra Sankha Sarkar*
*Graduate Student (Class of 2012)*
*School of Computing*
*Clemson University, SC 29634 - 0974*
*USA*
*M: (302) 740-7294*
