Posted to dev@airavata.apache.org by "Pamidighantam, Sudhakar" <pa...@iu.edu> on 2020/03/26 14:21:41 UTC

MFT and data access for running jobs

Dimuthu:

When MFT becomes available, would there be a way to define the remote working directory as a device that provides access to the data there?
As you know, this has been a long-standing need, particularly for long-running jobs.

Thanks,
Sudhakar.


Re: MFT and data access for running jobs

Posted by "Pamidighantam, Sudhakar" <pa...@iu.edu>.
Thanks, Dimuthu:

For 3, to start with, whatever is defined as output fields in the application interface could be moved periodically.
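To make that concrete, here is a rough sketch of what the periodic-sync metadata from your point 4 might look like. Every type and field name below is made up for illustration; nothing like this exists in the Airavata API models today.

import java.util.List;

// Hypothetical model for the periodic-sync metadata: which experiment
// outputs to move, where, and how often. Illustrative names only, not
// existing Airavata API models.
public class PeriodicSyncRule {

    public final String experimentId;         // experiment whose outputs are synced
    public final List<String> outputFieldIds; // output fields from the application interface
    public final String destinationAgentId;   // MFT Agent on gateway storage or the user's desktop
    public final long intervalSeconds;        // how often to instruct MFT to copy

    public PeriodicSyncRule(String experimentId, List<String> outputFieldIds,
                            String destinationAgentId, long intervalSeconds) {
        this.experimentId = experimentId;
        this.outputFieldIds = outputFieldIds;
        this.destinationAgentId = destinationAgentId;
        this.intervalSeconds = intervalSeconds;
    }
}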

Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Thursday, March 26, 2020 at 1:18 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: MFT and data access for running jobs

Copying data periodically to gateway storage or the user's desktop can be done from MFT if we have the following:

1. If it's copying to gateway storage, the gateway storage side should have an MFT Agent installed.
2. If it's for the user's desktop, the user should have an MFT Agent installed and provide write access to a particular directory.
3. However, in both cases we need another service instructing MFT when, what, and where to copy at each iteration (a rough sketch of such a service follows below).
4. In addition to that, we need some changes to the Airavata API models to store and configure metadata for periodic synchronizations.
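As a very rough sketch of point 3, the instructing service could be little more than a scheduled loop that reuses the PeriodicSyncRule metadata sketched earlier in this thread. MftClient and submitTransfer below are placeholders, not MFT's real submission API:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Rough sketch of the "instructing service" from point 3. MftClient and
// its submitTransfer method are hypothetical placeholders; the real MFT
// submission API may look quite different.
public class PeriodicSyncService {

    public interface MftClient {
        void submitTransfer(String sourceAgentId, String sourcePath,
                            String destinationAgentId, String destinationPath);
    }

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void schedule(MftClient mft, PeriodicSyncRule rule,
                         String sourceAgentId, String workingDir) {
        scheduler.scheduleAtFixedRate(() -> {
            // Each iteration tells MFT what to copy (the declared output
            // files) and where (the destination agent from the rule).
            for (String output : rule.outputFieldIds) {
                mft.submitTransfer(sourceAgentId, workingDir + "/" + output,
                        rule.destinationAgentId, output);
            }
        }, 0, rule.intervalSeconds, TimeUnit.SECONDS);
    }
}

A production version would of course also need persistence, failure handling, and a way to stop the loop when the experiment ends.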

I believe this is a good GSoC project if someone is willing to take it on, and I would like to act as a mentor.

Dimuthu

On Thu, Mar 26, 2020 at 12:47 PM Pamidighantam, Sudhakar <pa...@iu.edu> wrote:
I am not suggesting we mount any disk, but rather that we transfer the data on that remote HPC disk to storage in the gateway and provide access from gateway storage, or directly to the user's desktop, periodically or on request during the run.

Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Thursday, March 26, 2020 at 12:42 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: MFT and data access for running jobs

Sudhakar,

What you are asking for is not a direct MFT use case. It's more like an NFS mount of a remote file system onto a local file system. MFT is mainly focused on handling the data transfer path, not syncing data between two endpoints in real time.

Thanks
Dimuthu

On Thu, Mar 26, 2020 at 12:29 PM Pamidighantam, Sudhakar <pa...@iu.edu> wrote:
Dimuthu:

Yes, the working directory on the remote HPC cluster.

The workflow may look like this:

1. The user launches a job.
2. The remote working directory, dynamically defined by Airavata during the launch of the experiment, is registered as an accessible remote disk.
3. The contents are made available read-only for users to read/download.
4. This access is removed when the experiment ends.
5. The rest of the Helix tasks continue.
…
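In code terms, steps 2-4 might hook into the experiment lifecycle roughly like the sketch below. All of these method names are hypothetical; none of them exist in Airavata, and the sketch only shows where each step would slot in.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of steps 2-4 above. registerRemoteDir, exposeReadOnly,
// and revokeAccess are placeholders illustrating the lifecycle only.
public class RemoteDirAccessLifecycle {

    private final Map<String, String> handles = new HashMap<>();

    // Step 2: called once Airavata has created the working directory.
    public void onExperimentLaunched(String experimentId, String remoteWorkingDir) {
        String handle = registerRemoteDir(experimentId, remoteWorkingDir);
        handles.put(experimentId, handle);
        exposeReadOnly(handle); // step 3: users may read/download, not write
    }

    // Step 4: called when the experiment ends, before the remaining
    // Helix tasks (step 5) continue as usual.
    public void onExperimentCompleted(String experimentId) {
        revokeAccess(handles.remove(experimentId));
    }

    private String registerRemoteDir(String experimentId, String dir) {
        return experimentId + ":" + dir; // placeholder handle
    }

    private void exposeReadOnly(String handle) { /* placeholder */ }

    private void revokeAccess(String handle) { /* placeholder */ }
}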


Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Thursday, March 26, 2020 at 12:23 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: MFT and data access for running jobs

Sudhakar,

I’m not sure whether I grasped your point about this remote working directory correctly. Are you talking about the working directory of the cluster? Can you please explain the workflow in more detail?

Thanks
Dimuthu

On Thu, Mar 26, 2020 at 10:21 AM Pamidighantam, Sudhakar <pa...@iu.edu> wrote:
Dimuthu:

When MFT becomes available, would there be a way to define the remote working directory as a device that provides access to the data there?
As you know, this has been a long-standing need, particularly for long-running jobs.

Thanks,
Sudhakar.

