Posted to dev@airavata.apache.org by Aravind Ramalingam <po...@gmail.com> on 2020/04/05 00:09:37 UTC

Apache Airavata MFT - AWS/GCS support

Hello,

We set up the MFT project on a local system and tested out SCP transfers
between JetStream VMs. We were wondering how support can be extended
to AWS/GCS.

As per our understanding, the current implementation supports two
protocols, i.e. local-transport and scp-transport. Would we have to
modify or add to the code base to extend support to AWS/GCS clients?

Could you please provide suggestions for this use case?

Thank you
Aravind Ramalingam

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Rajvanshi, Akshay" <ak...@iu.edu>.
Thank you so much for the prompt response. We will start a new email thread and update the spreadsheet for Dropbox.

Kind Regards
Akshay

________________________________
From: DImuthu Upeksha <di...@gmail.com>
Sent: Wednesday, April 22, 2020 10:51:28 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

And obviously, mark it in the spreadsheet :)

On Wed, Apr 22, 2020 at 10:50 PM DImuthu Upeksha <di...@gmail.com>> wrote:
I think a good starting point is summarizing the available authentication / authorization methods in Dropbox and finding a good Java client. Then you can follow the same approach you followed for GCS to implement this. Please start a new mail thread for future Dropbox transport discussions.

Dimuthu

On Wed, Apr 22, 2020 at 10:42 PM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

Thank you for accepting our pull request. As part of another contribution, we were thinking of implementing the Dropbox transport protocol. Do you have any suggestions for us regarding that?

Kind Regards
Akshay Rajvanshi

From: DImuthu Upeksha <di...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Wednesday, April 22, 2020 at 22:27
To: Airavata Dev <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind, Sharanya,

I merged your PR [13] for GCS transport. It looks really good now. Thanks for the contribution, and please update the status in the spreadsheet [14]. Looking forward to more contributions from you; feel free to take up any available transport in the sheet if you have time in the future.

[13] https://github.com/apache/airavata-mft/pull/6
[14] https://docs.google.com/spreadsheets/d/1M7-Reda-pCi1l-TSSstI6Yi1pSbtINUqlBFcy5UrOW0/edit?usp=sharing

Thanks
Dimuthu

On Wed, Apr 22, 2020 at 9:19 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Can you debug and fix this? It could be a good chance to learn about the MFT core.

Dimuthu

On Wed, Apr 22, 2020 at 4:53 PM Ravichandran, Sharanya <sh...@iu.edu>> wrote:

Hello,



We tried transferring files between various protocols and noticed that when we made a transfer to/from Azure, the logging of the transfer percentage was duplicated multiple times after the transfer completed. We wanted to know whether this was expected.



Please find below the snapshots of the transfers for your reference:



GCS to Azure

[screenshot omitted]

Azure to Azure:

[screenshot omitted]



Thanks,

Sharanya R.



________________________________
From: DImuthu Upeksha <di...@gmail.com>>
Sent: 22 April 2020 02:38
To: Airavata Dev
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

If you look at the error line, it's calling super.getS3Secret(request, responseObserver), which should not be done in gRPC services. This will not produce client-side errors, as the client already got all the data it needed in the earlier lines. I will remove this invocation and commit. Thanks for reporting this.
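For context, here is a minimal sketch of the anti-pattern, assuming grpc-java's generated base class; the handler and request/response type names are illustrative and may differ from the actual MFT stubs:

import io.grpc.stub.StreamObserver;

public class SecretServiceHandler extends SecretServiceGrpc.SecretServiceImplBase {

    @Override
    public void getS3Secret(S3SecretGetRequest request,
                            StreamObserver<S3Secret> responseObserver) {
        S3Secret secret = lookupSecret(request); // hypothetical backend lookup
        responseObserver.onNext(secret);
        responseObserver.onCompleted(); // the stream is closed at this point

        // Anti-pattern: the generated base method replies UNIMPLEMENTED on the
        // same, already-completed observer, causing a server-side error the
        // client never sees.
        super.getS3Secret(request, responseObserver);
    }
}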

Dimuthu

On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

In addition to the previous thread from Aravind regarding the error, we tested the implementation from the Apache repository directly, without making any of our own changes, and did testing with other protocols; we faced a similar problem.

Kind Regards
Akshay Rajvanshi

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Tuesday, April 21, 2020 at 20:58
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Hello,

While testing, we noticed an error in the SecretServiceApplication; it seems to be a problem with the gRPC calls to the service.

I have attached the screenshot for your reference.

Could you please help us with this?

Thank you
Aravind Ramalingam



On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

Thank you for the review. We will look into the changes asap.

Thank you
Aravind Ramalingam

On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

I reviewed the PR and submitted my reviews. Please have a look at them. I didn't thoroughly go through optimizations in the code as there are some templating fixes and cleaning up required. Once you fix them, I will do a thorough review. Make sure to do a rebase of the PR next time as there are conflicts from other commits. Thanks for your contributions.

Dimuthu

On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We have raised a Pull Request [12].

We look forward to your feedback.

[12] https://github.com/apache/airavata-mft/pull/6

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Sounds good. Please send a PR once it is done.

Dimuthu

On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

Thank you Sudhakar and Dimuthu. We figured it out.

As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded Md5Hash; we had to convert it to hex, after which it matched the S3 hash.
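For reference, a small sketch of that normalization using only the JDK (the method name is ours, not from the MFT code base):

import java.util.Base64;

// Convert the Base64 Md5Hash returned by the GCS client into the lowercase
// hex form that S3 exposes via its ETag, so the two can be compared.
public static String base64Md5ToHex(String base64Md5) {
    byte[] raw = Base64.getDecoder().decode(base64Md5);
    StringBuilder hex = new StringBuilder(raw.length * 2);
    for (byte b : raw) {
        hex.append(String.format("%02x", b));
    }
    return hex.toString();
}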

We have successfully tested transfers from S3 to GCS and back. We are yet to test with other protocols.

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>> wrote:
https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?

Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Sunday, April 19, 2020 at 4:46 PM
To: Airavata Dev <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind,

Can you send a PR for what you have done so far so that I can provide feedback? One thing you have to make sure of is that the GCS MetadataCollector returns the correct md5 for the file. You can download the file and run "md5sum <file name>" locally to get the actual md5 value for that file and compare it with what you see in the GCS implementation.

In S3, the ETag is the right property for fetching the md5 of the target resource. I'm not sure what the right method is for GCS. You have to try it locally and verify.

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

We are working on GCS and have certain parts working, but after a transfer is complete we are facing errors with the metadata checks.

[screenshot omitted]

We are currently testing S3 to GCS. We noticed in the S3 implementation that the ETag was set as the Md5sum. In our case, we tried using both the ETag and the Md5Hash, but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());



We are confused at this point, could you please guide us?



Thank you

Aravind Ramalingam

On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

You don't need the file to be present in the GCS example I sent. It needs an InputStream to read the content. You can use the same approach I have used in the S3 [9] transport to do that. It's straightforward: replace the file input stream with context.getStreamBuffer().getInputStream().

Akshay,

You can't assume that the file is on the machine. It should be provided by the secret service. I found this example in [10]:

Storage storage = StorageOptions.newBuilder()
    .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
    .build()
    .getService();

It accepts an InputStream of the JSON. You can programmatically load the content of that JSON into a Java String through the secret service and convert that string to an InputStream, as shown in [11].

[9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream
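Putting those two pieces together, a hedged sketch of what the GCS transport could do; the gcsSecret object and its getCredentialsJson() accessor are assumptions for illustration, not the actual MFT API:

import com.google.auth.oauth2.ServiceAccountCredentials;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// The credential JSON arrives as a String from the secret service instead of
// a file on disk; getCredentialsJson() is a hypothetical accessor.
String credentialJson = gcsSecret.getCredentialsJson();
InputStream credentialStream =
        new ByteArrayInputStream(credentialJson.getBytes(StandardCharsets.UTF_8));

Storage storage = StorageOptions.newBuilder()
        .setCredentials(ServiceAccountCredentials.fromStream(credentialStream))
        .build()
        .getService();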

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

We were researching how to use the Google APIs to send files, and the first step required is authentication. For that, the GCP API requires a credentials.json file to be present on the system.

Is it fine if we currently design the GCS transport feature such that the file is already present on the system?

Kind Regards
Akshay

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Friday, April 17, 2020 at 00:30
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support


Hello,

Wouldn't the whole file have to be present in this example, converted into a single stream and uploaded at once?
We had understood that MFT expects a chunk-by-chunk upload, without having to have the entire file present.

Thank you
Aravind Ramalingam

On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Streaming is supported in the GCS Java client. Have a look here [8].

[8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
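In outline, chunked streaming with the GCS Java client could look like the sketch below (assuming the google-cloud-storage WriteChannel API; inside an MFT Sender the InputStream would come from context.getStreamBuffer().getInputStream() rather than a file):

import com.google.cloud.WriteChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Push bytes through a resumable WriteChannel as they arrive, so the whole
// file never has to exist on local disk.
void streamUpload(Storage storage, InputStream in, String bucket, String object) throws Exception {
    BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(bucket, object)).build();
    byte[] buffer = new byte[8 * 1024];
    try (WriteChannel writer = storage.writer(blobInfo)) {
        int read;
        while ((read = in.read(buffer)) >= 0) {
            writer.write(ByteBuffer.wrap(buffer, 0, read));
        }
    }
}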

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

As a follow-up, we explored GCS in detail and are faced with a small dilemma. We found that though GCS has Java support, the functionality does not seem to extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command-line tool [7], hence we are unsure whether we would be able to proceed with the GCS integration.

Could you please give us some suggestions? We were also wondering if we could take up Box integration or some other provider if GCS proves not to be possible currently.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam

On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

I'm not sure whether you have made any progress on the Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other transport so that I can plan ahead. Next I will be focusing on the Box transport.

[6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi  Dimuthu,

Thank you for the update. We will look into it and get an idea of how the system works.
We were hoping to try an implementation for GCS; we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Here [2] is the complete commit for the S3 transport implementation, but don't be confused by the amount of changes, as it includes both the transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, a Sender, and a MetadataCollector like this [3]. Then you need to add that resource support to the Resource service and Secret service [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer request is shown below. Hope that helps.


// Resource and secret identifiers registered beforehand in the Resource
// service [4] and Secret service [5]
String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com>> wrote:
There is a working S3 transport in my local copy. I will commit it once I test it out properly. You can follow the same pattern for any cloud provider that has clients with streaming IO. Streaming among different transfer protocols inside an Agent is discussed in the last part of this [1] document. Try to get the conceptual idea from that and reverse-engineer the SCP transport.

[1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We were looking at the existing code in the project and could find implementations only for local copy and SCP.
We were confused about how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
And obviously, mark it in the spreadsheet :)

On Wed, Apr 22, 2020 at 10:50 PM DImuthu Upeksha <di...@gmail.com>
wrote:

> I think a good start point is summarizing what are the available
> authentication / authorization methods in dropbox and finding a good java
> client. Then you can follow the same approach you have followed in GCS to
> implement this. Please start a new mail thread for future dropbox transport
> discussions.
>
> Dimuthu
>
> On Wed, Apr 22, 2020 at 10:42 PM Rajvanshi, Akshay <ak...@iu.edu>
> wrote:
>
>> Hello,
>>
>>
>>
>> Thank you for accepting our pull request. As a part of another
>> contribution, we were thinking of implementing the Dropbox transport
>> protocol. Do you have any suggestions for us regarding that ?
>>
>>
>>
>> Kind Regards
>>
>> Akshay Rajvanshi
>>
>>
>>
>> *From: *DImuthu Upeksha <di...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Wednesday, April 22, 2020 at 22:27
>> *To: *Airavata Dev <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Aravind, Sharanya,
>>
>>
>>
>> I merged your PR [13] for GCS transport. It looks really good now. Thanks
>> for the contribution and please update the status in spreadsheet [14].
>> Looking forward for more contributions from you and feel free to take up
>> any available transport in the sheet if you have time in the future.
>>
>>
>>
>> [13] https://github.com/apache/airavata-mft/pull/6
>>
>> [14]
>> https://docs.google.com/spreadsheets/d/1M7-Reda-pCi1l-TSSstI6Yi1pSbtINUqlBFcy5UrOW0/edit?usp=sharing
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Wed, Apr 22, 2020 at 9:19 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Can you debug and fix this? Could be a good chance to learn about MFT
>> core
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Wed, Apr 22, 2020 at 4:53 PM Ravichandran, Sharanya <sh...@iu.edu>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We tried transferring files between various protocols and we noticed that
>> when we made a transfer to/from azure ,after the completion of the transfer
>> ,the logging for transfer percentage was duplicated multiple times. We
>> wanted to know if this was expected ?
>>
>>
>>
>> Please find below the snapshots of the transfers for you reference :
>>
>>
>>
>> GCS to Azure
>>
>> Azure to Azure :
>>
>>
>>
>> Thanks ,
>>
>> Sharanya R.
>>
>>
>> ------------------------------
>>
>> *From:* DImuthu Upeksha <di...@gmail.com>
>> *Sent:* 22 April 2020 02:38
>> *To:* Airavata Dev
>> *Subject:* Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> If you look at the error line, it's calling *super*.getS3Secret(request,
>> responseObserver); which should not be done in GRPC services. This will
>> not give client side errors as the client has got all the data it needed in
>> earlier lines. I will remove this invocation and commit. Thanks for
>> reporting this
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> In addition to the previous thread from Aravind regarding the error, we
>> tested the implementation from apache repository directly without making
>> any of our own changes and did testing with other protocols and faced the
>> similar problem.
>>
>>
>>
>> Kind Regards
>>
>> Akshay Rajvanshi
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Tuesday, April 21, 2020 at 20:58
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Hello,
>>
>>
>>
>> While testing we noticed an error in the SecretServiceApplication, it
>> seems to be a problem with the gRPC calls to the service.
>>
>>
>>
>> I have attached the screenshot for your reference.
>>
>>
>>
>> Could you please help us with this?
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi Dimuthu,
>>
>>
>>
>> Thank you for the review. We will look into the changes asap.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Hi Aravind,
>>
>>
>>
>> I reviewed the PR and submitted my reviews. Please have a look at them. I
>> didn't thoroughly go through optimizations in the code as there are some
>> templating fixes and cleaning up required. Once you fix them, I will do a
>> thorough review. Make sure to do a rebase of the PR next time as there are
>> conflicts from other commits. Thanks for your contributions.
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We have raised a Pull Request [12].
>>
>>
>>
>> We look forward to your feedback.
>>
>>
>>
>> [12] https://github.com/apache/airavata-mft/pull/6
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Sounds good. Please send a PR once it is done.
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> Thank you Sudhakar and Dimuthu. We figured it out.
>>
>>
>>
>> Like Sudhakar had pointed out with the issue link, GCS had returned a
>> BASE64 Md5Hash, we had to convert it to HEX and it matched with the S3 hash.
>>
>>
>>
>> Currently we successfully tested from S3 to GCS and back. We are yet to
>> test with other protocols.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
>> wrote:
>>
>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>> help?
>>
>>
>>
>> Thanks,
>>
>> Sudhakar.
>>
>>
>>
>> *From: *DImuthu Upeksha <di...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>> *To: *Airavata Dev <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Aravind,
>>
>>
>>
>> Can you send a PR for what you have done so far so that I can provide a
>> feedback. One thing you have to make sure is that the GCS Metadata
>> collector returns the correct md5 for that file. You can download the file
>> and run "md5sum <file name>" locally to get actual md5 value for that file
>> and compare with what you can see in GCS implementation.
>>
>>
>>
>> In S3, etag is the right property to fetch md5 for target resource. I'm
>> not sure what is the right method for GCS. You have to locally try and
>> verify.
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi Dimuthu,
>>
>>
>>
>> We are working on GCS and we got certain parts working, but after a
>> transfer is compete we are facing errors with the metadata checks.
>>
>>
>>
>> <image001.png>
>>
>>
>>
>> We are currently testing S3 to GCS. We noticed in the S3 implementation
>> that Etag was set as the Md5sum. In our case we tried using both Etag and
>> Md5Hash, but both threw the above error.
>>
>>
>>
>> //S3 implementation
>>
>> metadata.setMd5sum(s3Metadata.getETag());
>>
>> //GCS implementation
>>
>> metadata.setMd5sum(gcsMetadata.getEtag());
>>
>> or
>>
>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>
>>
>>
>> We are confused at this point, could you please guide us?
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Hi Aravind,
>>
>>
>>
>> You don't need the file to be present in the gcs example I sent. It needs
>> an Input Stream to read the content. You can use the same approach I have
>> done in S3 [9] transport to do that. It's straightforward. Replace file
>> input stream with context.getStreamBuffer().getInputStream().
>>
>>
>>
>> Akshay,
>>
>>
>>
>> You can't assume that file is on the machine. It should be provided from
>> the secret service. I found this example in [10]
>>
>> Storage storage = StorageOptions.newBuilder()
>>
>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>
>>     .build()
>>
>>     .getService();
>>
>>
>>
>> It accepts a InputStream of json. You can programmatically load the
>> content of that json into a java String through secret service and convert
>> that string to a Input Stream as shown in [11]
>>
>>
>>
>> [9]
>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>
>> [10] https://github.com/googleapis/google-cloud-java
>>
>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We were searching about how to use google API’s to send files, but it’s
>> required the first steps to be authentication. In that, the GCP API
>> requires a credentials.json file to be present in the system.
>>
>>
>>
>> Is it fine if we currently design the GCS transport feature such that the
>> file is already present in the system ?
>>
>>
>>
>> Kind Regards
>>
>> Akshay
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Friday, April 17, 2020 at 00:30
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> This message was sent from a non-IU address. Please exercise caution when
>> clicking links or opening attachments from external sources.
>>
>>
>> Hello,
>>
>>
>>
>> Wouldn't it be that in this example the whole file has to be present and
>> converted into a single stream and uploaded at once?
>>
>> We had understood that MFT expects it to be chunk by chunk upload without
>> having to have the entire file present.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> Streaming is supported in GCS java client. Have a look at here [8]
>>
>>
>>
>> [8]
>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> As a followup, we explored GCS in detail. We are faced with a small
>> dilemma. We found that though GCS has a Java support, but the functionality
>> does not seem to extend to a stream based upload and download.
>>
>> The documentation says it is currently done with a gsutil command line
>> library [7], hence we are confused if we would be able to proceed the GCS
>> integration.
>>
>>
>>
>> Could you please give us any suggestions? Also we were wondering if we
>> could maybe take up Box integration or some other provider if GCS proves
>> not possible currently.
>>
>>
>>
>> [7] https://cloud.google.com/storage/docs/streaming
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> We had just started looking into Azure and GCS. Since Azure is done we
>> will take up and explore GCS.
>>
>>
>>
>> Thank you for the update.
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> I'm not sure whether you have made any progress on Azure transport yet. I
>> got a chance to look into that [6]. Let me know if you are working on GCS
>> or any other so that I can plan ahead. Next I will be focusing on Box
>> transport.
>>
>>
>>
>> [6]
>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi  Dimuthu,
>>
>>
>>
>> Thank you for the update. We look into it and get an idea about how the
>> system works.
>>
>> We were hoping to try an implementation for GCS, we will also look into
>> Azure.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Aravind,
>>
>>
>>
>> Here [2] is the complete commit for S3 transport implementation but don't
>> get confused by the amount of changes as this includes both transport
>> implementation and the service backend implementations. If you need to
>> implement a new transport, you need to implement a Receiver, Sender and a
>> MetadataCollector like this [3]. Then you need to add that resource support
>> to Resource service and Secret service [4] [5]. You can similarly do that
>> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
>> helps.
>>
>>
>>
>> String sourceId = *"remote-ssh-resource"*;
>> String sourceToken = *"local-ssh-cred"*;
>> String sourceType = *"SCP"*;
>> String destId = *"s3-file"*;
>> String destToken = *"s3-cred"*;
>> String destType = *"S3"*;
>>
>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>         .setSourceId(sourceId)
>>         .setSourceToken(sourceToken)
>>         .setSourceType(sourceType)
>>         .setDestinationId(destId)
>>         .setDestinationToken(destToken)
>>         .setDestinationType(destType)
>>         .setAffinityTransfer(*false*).build();
>>
>>
>>
>> [2]
>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>
>> [3]
>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>
>> [4]
>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>
>> [5]
>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>>
>>
>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> There is a working on S3 transport in my local copy. Will commit it once
>> I test it out properly. You can follow the same pattern for any cloud
>> provider which has clients with streaming IO. Streaming among different
>> transfer protocols inside an Agent has been discussed in the last part of
>> this [1] document. Try to get the conceptual idea from that and reverse
>> engineer SCP transport.
>>
>>
>>
>> [1]
>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> We were looking at the existing code in the project. We could find
>> implementations only for local copy and SCP.
>> We were confused on how to go about with an external provider like S3 or
>> Azure? Since it would require integrating with their respective clients.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>> >
>> > Hi Aravind,
>> >
>> > I have to catch up with the code, but you may want to look at the S3
>> implementation and extend it to Azure, GCP or other cloud services like
>> Box, Dropbox and so on.
>> >
>> > There could be many use cases, here is an idea:
>> >
>> > * Compute a job on a supercomputer with SCP access and push the outputs
>> to a Cloud storage.
>> >
>> > Suresh
>> >
>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>> wrote:
>> >>
>> >> Hello,
>> >>
>> >> We set up the MFT project on local system and tested out SCP transfer
>> between JetStream VMs, we were wondering how the support can be extended
>> for AWS/GCS.
>> >>
>> >> As per our understanding, the current implementation has support for
>> two protocols i.e. local-transport and scp-transport. Would we have to
>> modify/add to the code base to extend support for AWS/GCS clients?
>> >>
>> >> Could you please provide suggestions for this use case.
>> >>
>> >> Thank you
>> >> Aravind Ramalingam
>> >
>>
>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
I think a good start point is summarizing what are the available
authentication / authorization methods in dropbox and finding a good java
client. Then you can follow the same approach you have followed in GCS to
implement this. Please start a new mail thread for future dropbox transport
discussions.

Dimuthu

On Wed, Apr 22, 2020 at 10:42 PM Rajvanshi, Akshay <ak...@iu.edu> wrote:

> Hello,
>
>
>
> Thank you for accepting our pull request. As a part of another
> contribution, we were thinking of implementing the Dropbox transport
> protocol. Do you have any suggestions for us regarding that ?
>
>
>
> Kind Regards
>
> Akshay Rajvanshi
>
>
>
> *From: *DImuthu Upeksha <di...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Wednesday, April 22, 2020 at 22:27
> *To: *Airavata Dev <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Aravind, Sharanya,
>
>
>
> I merged your PR [13] for GCS transport. It looks really good now. Thanks
> for the contribution and please update the status in spreadsheet [14].
> Looking forward for more contributions from you and feel free to take up
> any available transport in the sheet if you have time in the future.
>
>
>
> [13] https://github.com/apache/airavata-mft/pull/6
>
> [14]
> https://docs.google.com/spreadsheets/d/1M7-Reda-pCi1l-TSSstI6Yi1pSbtINUqlBFcy5UrOW0/edit?usp=sharing
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Wed, Apr 22, 2020 at 9:19 PM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Can you debug and fix this? Could be a good chance to learn about MFT
> core
>
>
>
> Dimuthu
>
>
>
> On Wed, Apr 22, 2020 at 4:53 PM Ravichandran, Sharanya <sh...@iu.edu>
> wrote:
>
> Hello,
>
>
>
> We tried transferring files between various protocols and we noticed that
> when we made a transfer to/from azure ,after the completion of the transfer
> ,the logging for transfer percentage was duplicated multiple times. We
> wanted to know if this was expected ?
>
>
>
> Please find below the snapshots of the transfers for you reference :
>
>
>
> GCS to Azure
>
> Azure to Azure :
>
>
>
> Thanks ,
>
> Sharanya R.
>
>
> ------------------------------
>
> *From:* DImuthu Upeksha <di...@gmail.com>
> *Sent:* 22 April 2020 02:38
> *To:* Airavata Dev
> *Subject:* Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> If you look at the error line, it's calling *super*.getS3Secret(request,
> responseObserver); which should not be done in GRPC services. This will
> not give client side errors as the client has got all the data it needed in
> earlier lines. I will remove this invocation and commit. Thanks for
> reporting this
>
>
>
> Dimuthu
>
>
>
> On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
> Hello,
>
>
>
> In addition to the previous thread from Aravind regarding the error, we
> tested the implementation from apache repository directly without making
> any of our own changes and did testing with other protocols and faced the
> similar problem.
>
>
>
> Kind Regards
>
> Akshay Rajvanshi
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Tuesday, April 21, 2020 at 20:58
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Hello,
>
>
>
> While testing we noticed an error in the SecretServiceApplication, it
> seems to be a problem with the gRPC calls to the service.
>
>
>
> I have attached the screenshot for your reference.
>
>
>
> Could you please help us with this?
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
>
>
>
>
> On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi Dimuthu,
>
>
>
> Thank you for the review. We will look into the changes asap.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Hi Aravind,
>
>
>
> I reviewed the PR and submitted my reviews. Please have a look at them. I
> didn't thoroughly go through optimizations in the code as there are some
> templating fixes and cleaning up required. Once you fix them, I will do a
> thorough review. Make sure to do a rebase of the PR next time as there are
> conflicts from other commits. Thanks for your contributions.
>
>
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
>
>
> We have raised a Pull Request [12].
>
>
>
> We look forward to your feedback.
>
>
>
> [12] https://github.com/apache/airavata-mft/pull/6
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Sounds good. Please send a PR once it is done.
>
>
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
>
>
> Thank you Sudhakar and Dimuthu. We figured it out.
>
>
>
> Like Sudhakar had pointed out with the issue link, GCS had returned a
> BASE64 Md5Hash, we had to convert it to HEX and it matched with the S3 hash.
>
>
>
> Currently we successfully tested from S3 to GCS and back. We are yet to
> test with other protocols.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
> wrote:
>
> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
> help?
>
>
>
> Thanks,
>
> Sudhakar.
>
>
>
> *From: *DImuthu Upeksha <di...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Sunday, April 19, 2020 at 4:46 PM
> *To: *Airavata Dev <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Aravind,
>
>
>
> Can you send a PR for what you have done so far so that I can provide a
> feedback. One thing you have to make sure is that the GCS Metadata
> collector returns the correct md5 for that file. You can download the file
> and run "md5sum <file name>" locally to get actual md5 value for that file
> and compare with what you can see in GCS implementation.
>
>
>
> In S3, etag is the right property to fetch md5 for target resource. I'm
> not sure what is the right method for GCS. You have to locally try and
> verify.
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi Dimuthu,
>
>
>
> We are working on GCS and we got certain parts working, but after a
> transfer is compete we are facing errors with the metadata checks.
>
>
>
> <image001.png>
>
>
>
> We are currently testing S3 to GCS. We noticed in the S3 implementation
> that Etag was set as the Md5sum. In our case we tried using both Etag and
> Md5Hash, but both threw the above error.
>
>
>
> //S3 implementation
>
> metadata.setMd5sum(s3Metadata.getETag());
>
> //GCS implementation
>
> metadata.setMd5sum(gcsMetadata.getEtag());
>
> or
>
> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>
>
>
> We are confused at this point, could you please guide us?
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Hi Aravind,
>
>
>
> You don't need the file to be present in the gcs example I sent. It needs
> an Input Stream to read the content. You can use the same approach I have
> done in S3 [9] transport to do that. It's straightforward. Replace file
> input stream with context.getStreamBuffer().getInputStream().
>
>
>
> Akshay,
>
>
>
> You can't assume that file is on the machine. It should be provided from
> the secret service. I found this example in [10]
>
> Storage storage = StorageOptions.newBuilder()
>
>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>
>     .build()
>
>     .getService();
>
>
>
> It accepts a InputStream of json. You can programmatically load the
> content of that json into a java String through secret service and convert
> that string to a Input Stream as shown in [11]
>
>
>
> [9]
> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>
> [10] https://github.com/googleapis/google-cloud-java
>
> [11] https://www.baeldung.com/convert-string-to-input-stream
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
> Hello,
>
>
>
> We were searching about how to use google API’s to send files, but it’s
> required the first steps to be authentication. In that, the GCP API
> requires a credentials.json file to be present in the system.
>
>
>
> Is it fine if we currently design the GCS transport feature such that the
> file is already present in the system ?
>
>
>
> Kind Regards
>
> Akshay
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Friday, April 17, 2020 at 00:30
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> This message was sent from a non-IU address. Please exercise caution when
> clicking links or opening attachments from external sources.
>
>
> Hello,
>
>
>
> Wouldn't it be that in this example the whole file has to be present and
> converted into a single stream and uploaded at once?
>
> We had understood that MFT expects it to be chunk by chunk upload without
> having to have the entire file present.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Streaming is supported in GCS java client. Have a look at here [8]
>
>
>
> [8]
> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> As a followup, we explored GCS in detail. We are faced with a small
> dilemma. We found that though GCS has a Java support, but the functionality
> does not seem to extend to a stream based upload and download.
>
> The documentation says it is currently done with a gsutil command line
> library [7], hence we are confused if we would be able to proceed the GCS
> integration.
>
>
>
> Could you please give us any suggestions? Also we were wondering if we
> could maybe take up Box integration or some other provider if GCS proves
> not possible currently.
>
>
>
> [7] https://cloud.google.com/storage/docs/streaming
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> We had just started looking into Azure and GCS. Since Azure is done we
> will take up and explore GCS.
>
>
>
> Thank you for the update.
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> I'm not sure whether you have made any progress on Azure transport yet. I
> got a chance to look into that [6]. Let me know if you are working on GCS
> or any other so that I can plan ahead. Next I will be focusing on Box
> transport.
>
>
>
> [6]
> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi  Dimuthu,
>
>
>
> Thank you for the update. We look into it and get an idea about how the
> system works.
>
> We were hoping to try an implementation for GCS, we will also look into
> Azure.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Here [2] is the complete commit for S3 transport implementation but don't
> get confused by the amount of changes as this includes both transport
> implementation and the service backend implementations. If you need to
> implement a new transport, you need to implement a Receiver, Sender and a
> MetadataCollector like this [3]. Then you need to add that resource support
> to Resource service and Secret service [4] [5]. You can similarly do that
> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
> helps.
>
>
>
> String sourceId = *"remote-ssh-resource"*;
> String sourceToken = *"local-ssh-cred"*;
> String sourceType = *"SCP"*;
> String destId = *"s3-file"*;
> String destToken = *"s3-cred"*;
> String destType = *"S3"*;
>
> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>         .setSourceId(sourceId)
>         .setSourceToken(sourceToken)
>         .setSourceType(sourceType)
>         .setDestinationId(destId)
>         .setDestinationToken(destToken)
>         .setDestinationType(destType)
>         .setAffinityTransfer(*false*).build();
>
>
>
> [2]
> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>
> [3]
> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>
> [4]
> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>
> [5]
> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>
>
>
> Thanks
>
> Dimuthu
>
>
>
>
>
> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> There is a working on S3 transport in my local copy. Will commit it once I
> test it out properly. You can follow the same pattern for any cloud
> provider which has clients with streaming IO. Streaming among different
> transfer protocols inside an Agent has been discussed in the last part of
> this [1] document. Try to get the conceptual idea from that and reverse
> engineer SCP transport.
>
>
>
> [1]
> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>
>
>
> Dimuthu
>
>
>
> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
> We were looking at the existing code in the project. We could find
> implementations only for local copy and SCP.
> We were confused on how to go about with an external provider like S3 or
> Azure? Since it would require integrating with their respective clients.
>
> Thank you
> Aravind Ramalingam
>
> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
> >
> > Hi Aravind,
> >
> > I have to catch up with the code, but you may want to look at the S3
> implementation and extend it to Azure, GCP or other cloud services like
> Box, Dropbox and so on.
> >
> > There could be many use cases, here is an idea:
> >
> > * Compute a job on a supercomputer with SCP access and push the outputs
> to a Cloud storage.
> >
> > Suresh
> >
> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> We set up the MFT project on local system and tested out SCP transfer
> between JetStream VMs, we were wondering how the support can be extended
> for AWS/GCS.
> >>
> >> As per our understanding, the current implementation has support for
> two protocols i.e. local-transport and scp-transport. Would we have to
> modify/add to the code base to extend support for AWS/GCS clients?
> >>
> >> Could you please provide suggestions for this use case.
> >>
> >> Thank you
> >> Aravind Ramalingam
> >
>
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Rajvanshi, Akshay" <ak...@iu.edu>.
Hello,

Thank you for accepting our pull request. As a part of another contribution, we were thinking of implementing the Dropbox transport protocol. Do you have any suggestions for us regarding that ?

Kind Regards
Akshay Rajvanshi

From: DImuthu Upeksha <di...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Wednesday, April 22, 2020 at 22:27
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind, Sharanya,

I merged your PR [13] for GCS transport. It looks really good now. Thanks for the contribution and please update the status in spreadsheet [14]. Looking forward for more contributions from you and feel free to take up any available transport in the sheet if you have time in the future.

[13] https://github.com/apache/airavata-mft/pull/6
[14] https://docs.google.com/spreadsheets/d/1M7-Reda-pCi1l-TSSstI6Yi1pSbtINUqlBFcy5UrOW0/edit?usp=sharing

Thanks
Dimuthu

On Wed, Apr 22, 2020 at 9:19 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Can you debug and fix this? Could be a good chance to learn about MFT core

Dimuthu

On Wed, Apr 22, 2020 at 4:53 PM Ravichandran, Sharanya <sh...@iu.edu>> wrote:

Hello,



We tried transferring files between various protocols and we noticed that when we made a transfer to/from azure ,after the completion of the transfer ,the logging for transfer percentage was duplicated multiple times. We wanted to know if this was expected ?



Please find below the snapshots of the transfers for you reference :



GCS to Azure

[cid:image001.png@01D618F7.46DFA920]

Azure to Azure :

[cid:image002.png@01D618F7.46DFA920]



Thanks ,

Sharanya R.



________________________________
From: DImuthu Upeksha <di...@gmail.com>>
Sent: 22 April 2020 02:38
To: Airavata Dev
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

If you look at the error line, it's calling super.getS3Secret(request, responseObserver); which should not be done in GRPC services. This will not give client side errors as the client has got all the data it needed in earlier lines. I will remove this invocation and commit. Thanks for reporting this

Dimuthu

On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

In addition to the previous thread from Aravind regarding the error, we tested the implementation from apache repository directly without making any of our own changes and did testing with other protocols and faced the similar problem.

Kind Regards
Akshay Rajvanshi

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Tuesday, April 21, 2020 at 20:58
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Hello,

While testing we noticed an error in the SecretServiceApplication, it seems to be a problem with the gRPC calls to the service.

I have attached the screenshot for your reference.

Could you please help us with this?

Thank you
Aravind Ramalingam



On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

Thank you for the review. We will look into the changes asap.

Thank you
Aravind Ramalingam

On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

I reviewed the PR and submitted my reviews. Please have a look at them. I didn't thoroughly go through optimizations in the code as there are some templating fixes and cleaning up required. Once you fix them, I will do a thorough review. Make sure to do a rebase of the PR next time as there are conflicts from other commits. Thanks for your contributions.

Dimuthu

On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We have raised a Pull Request [12].

We look forward to your feedback.

[12] https://github.com/apache/airavata-mft/pull/6

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Sounds good. Please send a PR once it is done.

Dimuthu

On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

Thank you Sudhakar and Dimuthu. We figured it out.

Like Sudhakar had pointed out with the issue link, GCS had returned a BASE64 Md5Hash, we had to convert it to HEX and it matched with the S3 hash.

Currently we successfully tested from S3 to GCS and back. We are yet to test with other protocols.

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>> wrote:
https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?

Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Sunday, April 19, 2020 at 4:46 PM
To: Airavata Dev <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind,

Can you send a PR for what you have done so far so that I can provide a feedback. One thing you have to make sure is that the GCS Metadata collector returns the correct md5 for that file. You can download the file and run "md5sum <file name>" locally to get actual md5 value for that file and compare with what you can see in GCS implementation.

In S3, etag is the right property to fetch md5 for target resource. I'm not sure what is the right method for GCS. You have to locally try and verify.

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

We are working on GCS and we got certain parts working, but after a transfer is compete we are facing errors with the metadata checks.

<image001.png>

We are currently testing S3 to GCS. We noticed in the S3 implementation that Etag was set as the Md5sum. In our case we tried using both Etag and Md5Hash, but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());



We are confused at this point, could you please guide us?



Thank you

Aravind Ramalingam

On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

You don't need the file to be present in the gcs example I sent. It needs an Input Stream to read the content. You can use the same approach I have done in S3 [9] transport to do that. It's straightforward. Replace file input stream with context.getStreamBuffer().getInputStream().

Akshay,

You can't assume that file is on the machine. It should be provided from the secret service. I found this example in [10]

Storage storage = StorageOptions.newBuilder()

    .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))

    .build()

    .getService();

It accepts a InputStream of json. You can programmatically load the content of that json into a java String through secret service and convert that string to a Input Stream as shown in [11]

[9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

We were searching about how to use google API’s to send files, but it’s required the first steps to be authentication. In that, the GCP API requires a credentials.json file to be present in the system.

Is it fine if we currently design the GCS transport feature such that the file is already present in the system ?

Kind Regards
Akshay

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Friday, April 17, 2020 at 00:30
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support

This message was sent from a non-IU address. Please exercise caution when clicking links or opening attachments from external sources.

Hello,

Wouldn't it be that in this example the whole file has to be present and converted into a single stream and uploaded at once?
We had understood that MFT expects it to be chunk by chunk upload without having to have the entire file present.

Thank you
Aravind Ramalingam

On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Streaming is supported in GCS java client. Have a look at here [8]

[8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

As a follow-up, we explored GCS in detail and are faced with a small dilemma. We found that though GCS has Java support, the functionality does not seem to extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command line tool [7], hence we are unsure whether we would be able to proceed with the GCS integration.

Could you please give us any suggestions? We were also wondering if we could take up Box integration or some other provider if GCS proves infeasible for now.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam

On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

I'm not sure whether you have made any progress on the Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other transport so that I can plan ahead. Next I will be focusing on the Box transport.

[6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

Thank you for the update. We will look into it and get an idea of how the system works.
We were hoping to try an implementation for GCS; we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Here [2] is the complete commit for the S3 transport implementation, but don't get confused by the amount of changes, as it includes both the transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, a Sender and a MetadataCollector like this [3]. Then you need to add that resource support to the Resource service and the Secret service [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer request is shown below. Hope that helps.


String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com>> wrote:
There is a working S3 transport in my local copy. I will commit it once I have tested it properly. You can follow the same pattern for any cloud provider that has clients with streaming IO. Streaming among different transfer protocols inside an Agent is discussed in the last part of this [1] document. Try to get the conceptual idea from that and reverse engineer the SCP transport.

[1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
We were confused about how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Aravind, Sharanya,

I merged your PR [13] for GCS transport. It looks really good now. Thanks
for the contribution and please update the status in the spreadsheet [14].
Looking forward to more contributions from you, and feel free to take up
any available transport in the sheet if you have time in the future.

[13] https://github.com/apache/airavata-mft/pull/6
[14]
https://docs.google.com/spreadsheets/d/1M7-Reda-pCi1l-TSSstI6Yi1pSbtINUqlBFcy5UrOW0/edit?usp=sharing

Thanks
Dimuthu


Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Can you debug and fix this? Could be a good chance to learn about MFT core

Dimuthu


Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Ravichandran, Sharanya" <sh...@iu.edu>.
Hello,


We tried transferring files between various protocols and noticed that when we made a transfer to/from Azure, after the completion of the transfer, the logging for the transfer percentage was duplicated multiple times. We wanted to know if this was expected.


Please find below the snapshots of the transfers for your reference:


GCS to Azure

[cid:78062357-682a-46f8-b630-369eb581c488]

Azure to Azure :

[cid:eb04704f-ca4e-4f75-a13e-d512bcf8daab]


Thanks,

Sharanya R.



Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello,

Thank you for the clarification. We noticed that similar super.getSecret
calls are present in the Azure and Box implementations.
I believe those have to be removed as well?

Thank you
Aravind Ramalingam
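
For reference, a sketch of the corrected handler shape; the service, base-class and message names below are assumed from SecretService.proto, not copied from the repo:

import io.grpc.stub.StreamObserver;

public class SecretServiceHandler extends SecretServiceGrpc.SecretServiceImplBase {
    @Override
    public void getS3Secret(S3SecretGetRequest request,
                            StreamObserver<S3Secret> responseObserver) {
        S3Secret secret = loadFromBackend(request); // assumed backend lookup
        responseObserver.onNext(secret);
        responseObserver.onCompleted();
        // Do NOT also call super.getS3Secret(request, responseObserver);
        // the generated base method answers the already-completed call with
        // UNIMPLEMENTED, which is the server-side error discussed here.
    }

    private S3Secret loadFromBackend(S3SecretGetRequest request) {
        throw new UnsupportedOperationException("backend lookup omitted in sketch");
    }
}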

On Wed, Apr 22, 2020 at 2:38 AM DImuthu Upeksha <di...@gmail.com>
wrote:

> If you look at the error line, it's calling super.getS3Secret(request,
> responseObserver); which should not be done in GRPC services. This will
> not give client side errors as the client has got all the data it needed in
> earlier lines. I will remove this invocation and commit. Thanks for
> reporting this
>
> Dimuthu
>
> On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
>> Hello,
>>
>>
>>
>> In addition to the previous thread from Aravind regarding the error, we
>> tested the implementation from apache repository directly without making
>> any of our own changes and did testing with other protocols and faced the
>> similar problem.
>>
>>
>>
>> Kind Regards
>>
>> Akshay Rajvanshi
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Tuesday, April 21, 2020 at 20:58
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Hello,
>>
>>
>>
>> While testing we noticed an error in the SecretServiceApplication, it
>> seems to be a problem with the gRPC calls to the service.
>>
>>
>>
>> I have attached the screenshot for your reference.
>>
>>
>>
>> Could you please help us with this?
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>>
>>
>>
>>
>> On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi Dimuthu,
>>
>>
>>
>> Thank you for the review. We will look into the changes asap.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Hi Aravind,
>>
>>
>>
>> I reviewed the PR and submitted my reviews. Please have a look at them. I
>> didn't thoroughly go through optimizations in the code as there are some
>> templating fixes and cleaning up required. Once you fix them, I will do a
>> thorough review. Make sure to do a rebase of the PR next time as there are
>> conflicts from other commits. Thanks for your contributions.
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We have raised a Pull Request [12].
>>
>>
>>
>> We look forward to your feedback.
>>
>>
>>
>> [12] https://github.com/apache/airavata-mft/pull/6
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Sounds good. Please send a PR once it is done.
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> Thank you Sudhakar and Dimuthu. We figured it out.
>>
>>
>>
>> Like Sudhakar had pointed out with the issue link, GCS had returned a
>> BASE64 Md5Hash, we had to convert it to HEX and it matched with the S3 hash.
>>
>>
>>
>> Currently we successfully tested from S3 to GCS and back. We are yet to
>> test with other protocols.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
>> wrote:
>>
>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>> help?
>>
>>
>>
>> Thanks,
>>
>> Sudhakar.
>>
>>
>>
>> *From: *DImuthu Upeksha <di...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>> *To: *Airavata Dev <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Aravind,
>>
>>
>>
>> Can you send a PR for what you have done so far so that I can provide a
>> feedback. One thing you have to make sure is that the GCS Metadata
>> collector returns the correct md5 for that file. You can download the file
>> and run "md5sum <file name>" locally to get actual md5 value for that file
>> and compare with what you can see in GCS implementation.
>>
>>
>>
>> In S3, etag is the right property to fetch md5 for target resource. I'm
>> not sure what is the right method for GCS. You have to locally try and
>> verify.
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi Dimuthu,
>>
>>
>>
>> We are working on GCS and we got certain parts working, but after a
>> transfer is compete we are facing errors with the metadata checks.
>>
>>
>>
>> <image001.png>
>>
>>
>>
>> We are currently testing S3 to GCS. We noticed in the S3 implementation
>> that Etag was set as the Md5sum. In our case we tried using both Etag and
>> Md5Hash, but both threw the above error.
>>
>>
>>
>> //S3 implementation
>>
>> metadata.setMd5sum(s3Metadata.getETag());
>>
>> //GCS implementation
>>
>> metadata.setMd5sum(gcsMetadata.getEtag());
>>
>> or
>>
>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>
>>
>>
>> We are confused at this point, could you please guide us?
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Hi Aravind,
>>
>>
>>
>> You don't need the file to be present in the gcs example I sent. It needs
>> an Input Stream to read the content. You can use the same approach I have
>> done in S3 [9] transport to do that. It's straightforward. Replace file
>> input stream with context.getStreamBuffer().getInputStream().
>>
>>
>>
>> Akshay,
>>
>>
>>
>> You can't assume that file is on the machine. It should be provided from
>> the secret service. I found this example in [10]
>>
>> Storage storage = StorageOptions.newBuilder()
>>
>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>
>>     .build()
>>
>>     .getService();
>>
>>
>>
>> It accepts a InputStream of json. You can programmatically load the
>> content of that json into a java String through secret service and convert
>> that string to a Input Stream as shown in [11]
>>
>>
>>
>> [9]
>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>
>> [10] https://github.com/googleapis/google-cloud-java
>>
>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We were searching about how to use google API’s to send files, but it’s
>> required the first steps to be authentication. In that, the GCP API
>> requires a credentials.json file to be present in the system.
>>
>>
>>
>> Is it fine if we currently design the GCS transport feature such that the
>> file is already present in the system ?
>>
>>
>>
>> Kind Regards
>>
>> Akshay
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Friday, April 17, 2020 at 00:30
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> This message was sent from a non-IU address. Please exercise caution when
>> clicking links or opening attachments from external sources.
>>
>>
>> Hello,
>>
>>
>>
>> Wouldn't it be that in this example the whole file has to be present and
>> converted into a single stream and uploaded at once?
>>
>> We had understood that MFT expects it to be chunk by chunk upload without
>> having to have the entire file present.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> Streaming is supported in GCS java client. Have a look at here [8]
>>
>>
>>
>> [8]
>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> As a follow-up, we explored GCS in detail. We are faced with a small
>> dilemma. We found that although GCS has Java support, the functionality
>> does not seem to extend to stream-based upload and download.
>>
>> The documentation says streaming is currently done with the gsutil command
>> line tool [7], hence we are unsure whether we would be able to proceed
>> with the GCS integration.
>>
>>
>>
>> Could you please give us any suggestions? We were also wondering if we
>> could take up Box integration or some other provider if GCS proves
>> infeasible for now.
>>
>>
>>
>> [7] https://cloud.google.com/storage/docs/streaming
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> We had just started looking into Azure and GCS. Since Azure is done, we
>> will take up and explore GCS.
>>
>>
>>
>> Thank you for the update.
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> I'm not sure whether you have made any progress on the Azure transport
>> yet. I got a chance to look into that [6]. Let me know if you are working
>> on GCS or any other transport so that I can plan ahead. Next I will be
>> focusing on the Box transport.
>>
>>
>>
>> [6]
>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi  Dimuthu,
>>
>>
>>
>> Thank you for the update. We will look into it and get an idea of how the
>> system works.
>>
>> We were hoping to try an implementation for GCS; we will also look into
>> Azure.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Aravind,
>>
>>
>>
>> Here [2] is the complete commit for the S3 transport implementation, but
>> don't get confused by the amount of changes, as it includes both the
>> transport implementation and the service backend implementations. If you
>> need to implement a new transport, you need to implement a Receiver, a
>> Sender and a MetadataCollector like this [3]. Then you need to add that
>> resource support to the Resource service and the Secret service [4] [5].
>> You can similarly do that for Azure. A sample SCP -> S3 transfer request
>> looks like the one below. Hope that helps.
>>
>>
>>
>> String sourceId = *"remote-ssh-resource"*;
>> String sourceToken = *"local-ssh-cred"*;
>> String sourceType = *"SCP"*;
>> String destId = *"s3-file"*;
>> String destToken = *"s3-cred"*;
>> String destType = *"S3"*;
>>
>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>         .setSourceId(sourceId)
>>         .setSourceToken(sourceToken)
>>         .setSourceType(sourceType)
>>         .setDestinationId(destId)
>>         .setDestinationToken(destToken)
>>         .setDestinationType(destType)
>>         .setAffinityTransfer(*false*).build();
>>
>>
>>
>> [2]
>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>
>> [3]
>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>
>> [4]
>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>
>> [5]
>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>>
>>
>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> There is a working S3 transport in my local copy. I will commit it once
>> I test it out properly. You can follow the same pattern for any cloud
>> provider whose clients support streaming IO. Streaming among different
>> transfer protocols inside an Agent is discussed in the last part of this
>> document [1]. Try to get the conceptual idea from that and reverse
>> engineer the SCP transport.
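>>
>> As a conceptual illustration only (these are not MFT's actual classes),
>> the idea can be pictured with java.io pipes: the receiver writes into one
>> end while the sender reads from the other, so the full file never has to
>> land on disk. scpReceiver and s3Sender below are hypothetical:
>>
>> PipedOutputStream receiverOut = new PipedOutputStream();
>> PipedInputStream senderIn = new PipedInputStream(receiverOut);
>>
>> // One thread pulls from the source protocol and writes into the pipe...
>> new Thread(() -> scpReceiver.downloadTo(receiverOut)).start();
>> // ...while another reads from the pipe and pushes to the destination.
>> new Thread(() -> s3Sender.uploadFrom(senderIn)).start();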
>>
>>
>>
>> [1]
>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> We were looking at the existing code in the project. We could find
>> implementations only for local copy and SCP.
>> We were unsure how to go about supporting an external provider like S3 or
>> Azure, since it would require integrating with their respective clients.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>> >
>> > Hi Aravind,
>> >
>> > I have to catch up with the code, but you may want to look at the S3
>> implementation and extend it to Azure, GCP or other cloud services like
>> Box, Dropbox and so on.
>> >
>> > There could be many use cases, here is an idea:
>> >
>> > * Compute a job on a supercomputer with SCP access and push the outputs
>> to a Cloud storage.
>> >
>> > Suresh
>> >
>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>> wrote:
>> >>
>> >> Hello,
>> >>
>> >> We set up the MFT project on local system and tested out SCP transfer
>> between JetStream VMs, we were wondering how the support can be extended
>> for AWS/GCS.
>> >>
>> >> As per our understanding, the current implementation has support for
>> two protocols i.e. local-transport and scp-transport. Would we have to
>> modify/add to the code base to extend support for AWS/GCS clients?
>> >>
>> >> Could you please provide suggestions for this use case.
>> >>
>> >> Thank you
>> >> Aravind Ramalingam
>> >
>>
>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
If you look at the error line, it's calling super.getS3Secret(request,
responseObserver), which should not be done in gRPC services. It does not
cause client-side errors because the client has already received all the
data it needed by that point. I will remove this invocation and commit.
Thanks for reporting this.
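
For context, a minimal sketch of the pattern (the handler and message names
mirror, but may not exactly match, the MFT stubs): a gRPC service method
should reply through the StreamObserver and must not also delegate to the
generated base class, whose default implementation closes the call with
UNIMPLEMENTED.

public class SecretServiceHandler extends SecretServiceGrpc.SecretServiceImplBase {
    @Override
    public void getS3Secret(S3SecretGetRequest request,
                            StreamObserver<S3Secret> responseObserver) {
        S3Secret secret = loadSecret(request); // assumed backend lookup
        responseObserver.onNext(secret);       // client receives its data here
        responseObserver.onCompleted();        // the call is now closed
        // Do NOT also call super.getS3Secret(request, responseObserver):
        // the base method replies UNIMPLEMENTED on the already-closed call,
        // which surfaces as the server-side error in the screenshot.
    }
}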

Dimuthu

On Tue, Apr 21, 2020 at 9:11 PM Rajvanshi, Akshay <ak...@iu.edu> wrote:

> Hello,
>
>
>
> In addition to Aravind's previous thread regarding the error, we tested
> the implementation from the Apache repository directly, without making any
> of our own changes, ran tests with other protocols, and faced the same
> problem.
>
>
>
> Kind Regards
>
> Akshay Rajvanshi
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Tuesday, April 21, 2020 at 20:58
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Hello,
>
>
>
> While testing, we noticed an error in the SecretServiceApplication; it
> seems to be a problem with the gRPC calls to the service.
>
>
>
> I have attached the screenshot for your reference.
>
>
>
> Could you please help us with this?
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
>
>
>
>
> On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi Dimuthu,
>
>
>
> Thank you for the review. We will look into the changes asap.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Hi Aravind,
>
>
>
> I reviewed the PR and submitted my review comments. Please have a look at
> them. I didn't thoroughly go through optimizations in the code, as there
> are some templating fixes and cleanup required. Once you fix them, I will
> do a thorough review. Make sure to rebase the PR next time, as there are
> conflicts with other commits. Thanks for your contributions.
>
>
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
>
>
> We have raised a Pull Request [12].
>
>
>
> We look forward to your feedback.
>
>
>
> [12] https://github.com/apache/airavata-mft/pull/6
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Sounds good. Please send a PR once it is done.
>
>
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
>
>
> Thank you Sudhakar and Dimuthu. We figured it out.
>
>
>
> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
> Md5Hash; once we converted it to hex, it matched the S3 hash.
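>
> For reference, a minimal sketch of that conversion (variable names follow
> the snippets earlier in this thread):
>
> // Decode the Base64-encoded Md5Hash from GCS and render it as hex,
> // matching the format of the S3 ETag.
> byte[] md5Bytes = java.util.Base64.getDecoder().decode(gcsMetadata.getMd5Hash());
> StringBuilder hex = new StringBuilder();
> for (byte b : md5Bytes) {
>     hex.append(String.format("%02x", b));
> }
> metadata.setMd5sum(hex.toString());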
>
>
>
> We have successfully tested from S3 to GCS and back. We are yet to test
> with other protocols.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
> wrote:
>
> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
> help?
>
>
>
> Thanks,
>
> Sudhakar.
>
>
>
> *From: *DImuthu Upeksha <di...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Sunday, April 19, 2020 at 4:46 PM
> *To: *Airavata Dev <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Aravind,
>
>
>
> Can you send a PR for what you have done so far so that I can provide
> feedback? One thing you have to make sure of is that the GCS Metadata
> collector returns the correct md5 for the file. You can download the file
> and run "md5sum <file name>" locally to get the actual md5 value for that
> file and compare it with what you see in the GCS implementation.
>
>
>
> In S3, the ETag is the right property from which to fetch the md5 of the
> target resource. I'm not sure what the right method is for GCS. You have
> to try it locally and verify.
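>
> (The same check can also be scripted in Java instead of md5sum; a small
> sketch, with the local file path as a placeholder:)
>
> byte[] content = java.nio.file.Files.readAllBytes(
>         java.nio.file.Paths.get("/tmp/downloaded-file"));
> byte[] digest = java.security.MessageDigest.getInstance("MD5").digest(content);
> StringBuilder md5 = new StringBuilder();
> for (byte b : digest) {
>     md5.append(String.format("%02x", b));
> }
> System.out.println(md5); // compare with the metadata collector's value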
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi Dimuthu,
>
>
>
> We are working on GCS and have certain parts working, but after a
> transfer completes we are facing errors with the metadata checks.
>
>
>
> <image001.png>
>
>
>
> We are currently testing S3 to GCS. We noticed that the S3 implementation
> sets the ETag as the md5sum. In our case we tried using both the ETag and
> the Md5Hash, but both threw the above error.
>
>
>
> //S3 implementation
>
> metadata.setMd5sum(s3Metadata.getETag());
>
> //GCS implementation
>
> metadata.setMd5sum(gcsMetadata.getEtag());
>
> or
>
> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>
>
>
> We are confused at this point; could you please guide us?
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Hi Aravind,
>
>
>
> You don't need the file to be present in the GCS example I sent. It needs
> an InputStream to read the content. You can use the same approach I used
> in the S3 transport [9]. It's straightforward: replace the file input
> stream with context.getStreamBuffer().getInputStream().
>
>
>
> Akshay,
>
>
>
> You can't assume that the file is on the machine. It should be provided
> by the secret service. I found this example in [10]
>
> Storage storage = StorageOptions.newBuilder()
>
>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>
>     .build()
>
>     .getService();
>
>
>
> It accepts an InputStream of JSON. You can programmatically load the
> content of that JSON into a Java String through the secret service and
> convert that string to an InputStream as shown in [11]
>
>
>
> [9]
> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>
> [10] https://github.com/googleapis/google-cloud-java
>
> [11] https://www.baeldung.com/convert-string-to-input-stream
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
> Hello,
>
>
>
> We were researching how to use the Google APIs to send files, and the
> first step they require is authentication. For that, the GCP API requires
> a credentials.json file to be present on the system.
>
>
>
> Is it fine if, for now, we design the GCS transport feature such that the
> file is already present on the system?
>
>
>
> Kind Regards
>
> Akshay
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Friday, April 17, 2020 at 00:30
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> This message was sent from a non-IU address. Please exercise caution when
> clicking links or opening attachments from external sources.
>
>
> Hello,
>
>
>
> Wouldn't the whole file have to be present in this example, converted
> into a single stream and uploaded at once?
>
> We had understood that MFT expects a chunk-by-chunk upload without the
> entire file having to be present.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Streaming is supported in GCS java client. Have a look at here [8]
>
>
>
> [8]
> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> As a follow-up, we explored GCS in detail. We are faced with a small
> dilemma. We found that although GCS has Java support, the functionality
> does not seem to extend to stream-based upload and download.
>
> The documentation says streaming is currently done with the gsutil command
> line tool [7], hence we are unsure whether we would be able to proceed
> with the GCS integration.
>
>
>
> Could you please give us any suggestions? We were also wondering if we
> could take up Box integration or some other provider if GCS proves
> infeasible for now.
>
>
>
> [7] https://cloud.google.com/storage/docs/streaming
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> We had just started looking into Azure and GCS. Since Azure is done, we
> will take up and explore GCS.
>
>
>
> Thank you for the update.
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> I'm not sure whether you have made any progress on Azure transport yet. I
> got a chance to look into that [6]. Let me know if you are working on GCS
> or any other so that I can plan ahead. Next I will be focusing on Box
> transport.
>
>
>
> [6]
> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi  Dimuthu,
>
>
>
> Thank you for the update. We will look into it and get an idea of how the
> system works.
>
> We were hoping to try an implementation for GCS; we will also look into
> Azure.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Here [2] is the complete commit for the S3 transport implementation, but
> don't get confused by the amount of changes, as it includes both the
> transport implementation and the service backend implementations. If you
> need to implement a new transport, you need to implement a Receiver, a
> Sender and a MetadataCollector like this [3]. Then you need to add that
> resource support to the Resource service and the Secret service [4] [5].
> You can similarly do that for Azure. A sample SCP -> S3 transfer request
> looks like the one below. Hope that helps.
>
>
>
> String sourceId = *"remote-ssh-resource"*;
> String sourceToken = *"local-ssh-cred"*;
> String sourceType = *"SCP"*;
> String destId = *"s3-file"*;
> String destToken = *"s3-cred"*;
> String destType = *"S3"*;
>
> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>         .setSourceId(sourceId)
>         .setSourceToken(sourceToken)
>         .setSourceType(sourceType)
>         .setDestinationId(destId)
>         .setDestinationToken(destToken)
>         .setDestinationType(destType)
>         .setAffinityTransfer(*false*).build();
>
>
>
> [2]
> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>
> [3]
> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>
> [4]
> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>
> [5]
> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>
>
>
> Thanks
>
> Dimuthu
>
>
>
>
>
> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> There is a working S3 transport in my local copy. I will commit it once I
> test it out properly. You can follow the same pattern for any cloud
> provider whose clients support streaming IO. Streaming among different
> transfer protocols inside an Agent is discussed in the last part of this
> document [1]. Try to get the conceptual idea from that and reverse
> engineer the SCP transport.
>
>
>
> [1]
> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>
>
>
> Dimuthu
>
>
>
> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
> We were looking at the existing code in the project. We could find
> implementations only for local copy and SCP.
> We were unsure how to go about supporting an external provider like S3 or
> Azure, since it would require integrating with their respective clients.
>
> Thank you
> Aravind Ramalingam
>
> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
> >
> > Hi Aravind,
> >
> > I have to catch up with the code, but you may want to look at the S3
> implementation and extend it to Azure, GCP or other cloud services like
> Box, Dropbox and so on.
> >
> > There could be many use cases, here is an idea:
> >
> > * Compute a job on a supercomputer with SCP access and push the outputs
> to a Cloud storage.
> >
> > Suresh
> >
> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> We set up the MFT project on local system and tested out SCP transfer
> between JetStream VMs, we were wondering how the support can be extended
> for AWS/GCS.
> >>
> >> As per our understanding, the current implementation has support for
> two protocols i.e. local-transport and scp-transport. Would we have to
> modify/add to the code base to extend support for AWS/GCS clients?
> >>
> >> Could you please provide suggestions for this use case.
> >>
> >> Thank you
> >> Aravind Ramalingam
> >
>
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Rajvanshi, Akshay" <ak...@iu.edu>.
Hello,

In addition to Aravind's previous thread regarding the error, we tested the implementation from the Apache repository directly, without making any of our own changes, ran tests with other protocols, and faced the same problem.

Kind Regards
Akshay Rajvanshi

From: Aravind Ramalingam <po...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Tuesday, April 21, 2020 at 20:58
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Hello,

While testing, we noticed an error in the SecretServiceApplication; it seems to be a problem with the gRPC calls to the service.

I have attached the screenshot for your reference.

Could you please help us with this?

Thank you
Aravind Ramalingam



On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

Thank you for the review. We will look into the changes asap.

Thank you
Aravind Ramalingam


On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

I reviewed the PR and submitted my review comments. Please have a look at them. I didn't thoroughly go through optimizations in the code, as there are some templating fixes and cleanup required. Once you fix them, I will do a thorough review. Make sure to rebase the PR next time, as there are conflicts with other commits. Thanks for your contributions.

Dimuthu

On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We have raised a Pull Request [12].

We look forward to your feedback.

[12] https://github.com/apache/airavata-mft/pull/6

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Sounds good. Please send a PR once it is done.

Dimuthu

On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

Thank you Sudhakar and Dimuthu. We figured it out.

As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded Md5Hash; once we converted it to hex, it matched the S3 hash.

We have successfully tested from S3 to GCS and back. We are yet to test with other protocols.

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>> wrote:
https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?

Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Sunday, April 19, 2020 at 4:46 PM
To: Airavata Dev <de...@airavata.apache.org>>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind,

Can you send a PR for what you have done so far so that I can provide feedback? One thing you have to make sure of is that the GCS Metadata collector returns the correct md5 for the file. You can download the file and run "md5sum <file name>" locally to get the actual md5 value for that file and compare it with what you see in the GCS implementation.

In S3, the ETag is the right property from which to fetch the md5 of the target resource. I'm not sure what the right method is for GCS. You have to try it locally and verify.

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

We are working on GCS and have certain parts working, but after a transfer completes we are facing errors with the metadata checks.

<image001.png>

We are currently testing S3 to GCS. We noticed that the S3 implementation sets the ETag as the md5sum. In our case we tried using both the ETag and the Md5Hash, but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());



We are confused at this point; could you please guide us?



Thank you

Aravind Ramalingam

On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

You don't need the file to be present in the GCS example I sent. It needs an InputStream to read the content. You can use the same approach I used in the S3 transport [9]. It's straightforward: replace the file input stream with context.getStreamBuffer().getInputStream().

Akshay,

You can't assume that the file is on the machine. It should be provided by the secret service. I found this example in [10]

Storage storage = StorageOptions.newBuilder()

    .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))

    .build()

    .getService();

It accepts an InputStream of JSON. You can programmatically load the content of that JSON into a Java String through the secret service and convert that string to an InputStream as shown in [11]

[9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

We were researching how to use the Google APIs to send files, and the first step they require is authentication. For that, the GCP API requires a credentials.json file to be present on the system.

Is it fine if, for now, we design the GCS transport feature such that the file is already present on the system?

Kind Regards
Akshay

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Friday, April 17, 2020 at 00:30
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support

This message was sent from a non-IU address. Please exercise caution when clicking links or opening attachments from external sources.

Hello,

Wouldn't the whole file have to be present in this example, converted into a single stream and uploaded at once?
We had understood that MFT expects a chunk-by-chunk upload without the entire file having to be present.

Thank you
Aravind Ramalingam

On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Streaming is supported in GCS java client. Have a look at here [8]

[8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

As a follow-up, we explored GCS in detail. We are faced with a small dilemma. We found that although GCS has Java support, the functionality does not seem to extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command-line tool [7], hence we are unsure whether we would be able to proceed with the GCS integration.

Could you please give us any suggestions? We were also wondering if we could take up Box integration or some other provider if GCS proves infeasible for now.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam

On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

I'm not sure whether you have made any progress on Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other so that I can plan ahead. Next I will be focusing on Box transport.

[6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi  Dimuthu,

Thank you for the update. We will look into it and get an idea of how the system works.
We were hoping to try an implementation for GCS; we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Here [2] is the complete commit for the S3 transport implementation, but don't get confused by the amount of changes, as it includes both the transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, a Sender and a MetadataCollector like this [3]. Then you need to add that resource support to the Resource service and the Secret service [4] [5]. You can similarly do that for Azure. A sample SCP -> S3 transfer request looks like the one below. Hope that helps.


String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com>> wrote:
There is a working S3 transport in my local copy. I will commit it once I test it out properly. You can follow the same pattern for any cloud provider whose clients support streaming IO. Streaming among different transfer protocols inside an Agent is discussed in the last part of this document [1]. Try to get the conceptual idea from that and reverse engineer the SCP transport.

[1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
We were unsure how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello,

While testing, we noticed an error in the SecretServiceApplication; it seems
to be a problem with the gRPC calls to the service.

I have attached the screenshot for your reference.

Could you please help us with this?

Thank you
Aravind Ramalingam



On Mon, Apr 20, 2020 at 10:59 PM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hi Dimuthu,
>
> Thank you for the review. We will look into the changes asap.
>
> Thank you
> Aravind Ramalingam
>
> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> 
> Hi Aravind,
>
> I reviewed the PR and submitted my review comments. Please have a look at
> them. I didn't thoroughly go through optimizations in the code, as there
> are some templating fixes and cleanup required. Once you fix them, I will
> do a thorough review. Make sure to rebase the PR next time, as there are
> conflicts with other commits. Thanks for your contributions.
>
> Dimuthu
>
> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
>> Hello,
>>
>> We have raised a Pull Request [12].
>>
>> We look forward to your feedback.
>>
>> [12] https://github.com/apache/airavata-mft/pull/6
>>
>> Thank you
>> Aravind Ramalingam
>>
>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>>> Sounds good. Please send a PR once it is done.
>>>
>>> Dimuthu
>>>
>>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Thank you Sudhakar and Dimuthu. We figured it out.
>>>>
>>>> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
>>>> Md5Hash; once we converted it to hex, it matched the S3 hash.
>>>>
>>>> We have successfully tested from S3 to GCS and back. We are yet to
>>>> test with other protocols.
>>>>
>>>> Thank you
>>>> Aravind Ramalingam
>>>>
>>>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <
>>>> pamidigs@iu.edu> wrote:
>>>>
>>>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>>>>> help?
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sudhakar.
>>>>>
>>>>>
>>>>>
>>>>> *From: *DImuthu Upeksha <di...@gmail.com>
>>>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>>>>> *To: *Airavata Dev <de...@airavata.apache.org>
>>>>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>>
>>>>>
>>>>>
>>>>> Aravind,
>>>>>
>>>>>
>>>>>
>>>>> Can you send a PR for what you have done so far so that I can provide
>>>>> feedback? One thing you have to make sure of is that the GCS Metadata
>>>>> collector returns the correct md5 for the file. You can download the file
>>>>> and run "md5sum <file name>" locally to get the actual md5 value for that
>>>>> file and compare it with what you see in the GCS implementation.
>>>>>
>>>>>
>>>>>
>>>>> In S3, the ETag is the right property from which to fetch the md5 of the
>>>>> target resource. I'm not sure what the right method is for GCS. You have
>>>>> to try it locally and verify.
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi Dimuthu,
>>>>>
>>>>>
>>>>>
>>>>> We are working on GCS and have certain parts working, but after a
>>>>> transfer completes we are facing errors with the metadata checks.
>>>>>
>>>>>
>>>>>
>>>>> <image001.png>
>>>>>
>>>>>
>>>>>
>>>>> We are currently testing S3 to GCS. We noticed that the S3
>>>>> implementation sets the ETag as the md5sum. In our case we tried using
>>>>> both the ETag and the Md5Hash, but both threw the above error.
>>>>>
>>>>>
>>>>>
>>>>> //S3 implementation
>>>>>
>>>>> metadata.setMd5sum(s3Metadata.getETag());
>>>>>
>>>>> //GCS implementation
>>>>>
>>>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>>>
>>>>> or
>>>>>
>>>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>>>
>>>>>
>>>>>
>>>>> We are confused at this point; could you please guide us?
>>>>>
>>>>>
>>>>>
>>>>> Thank you
>>>>>
>>>>> Aravind Ramalingam
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>>
>>>>> Hi Aravind,
>>>>>
>>>>>
>>>>>
>>>>> You don't need the file to be present in the GCS example I sent. It
>>>>> needs an InputStream to read the content. You can use the same approach I
>>>>> used in the S3 transport [9]. It's straightforward: replace the file
>>>>> input stream with context.getStreamBuffer().getInputStream().
>>>>>
>>>>>
>>>>>
>>>>> Akshay,
>>>>>
>>>>>
>>>>>
>>>>> You can't assume that the file is on the machine. It should be provided
>>>>> by the secret service. I found this example in [10]
>>>>>
>>>>> Storage storage = StorageOptions.newBuilder()
>>>>>
>>>>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>>>>
>>>>>     .build()
>>>>>
>>>>>     .getService();
>>>>>
>>>>>
>>>>>
>>>>> It accepts an InputStream of JSON. You can programmatically load the
>>>>> content of that JSON into a Java String through the secret service and
>>>>> convert that string to an InputStream as shown in [11]
>>>>>
>>>>>
>>>>>
>>>>> [9]
>>>>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>>>
>>>>> [10] https://github.com/googleapis/google-cloud-java
>>>>>
>>>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>>
>>>>>
>>>>> We were researching how to use the Google APIs to send files, and the
>>>>> first step they require is authentication. For that, the GCP API
>>>>> requires a credentials.json file to be present on the system.
>>>>>
>>>>>
>>>>>
>>>>> Is it fine if, for now, we design the GCS transport feature such that
>>>>> the file is already present on the system?
>>>>>
>>>>>
>>>>>
>>>>> Kind Regards
>>>>>
>>>>> Akshay
>>>>>
>>>>>
>>>>>
>>>>> *From: *Aravind Ramalingam <po...@gmail.com>
>>>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> *Date: *Friday, April 17, 2020 at 00:30
>>>>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>>>>
>>>>>
>>>>>
>>>>> This message was sent from a non-IU address. Please exercise caution
>>>>> when clicking links or opening attachments from external sources.
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>>
>>>>>
>>>>> Wouldn't the whole file have to be present in this example, converted
>>>>> into a single stream and uploaded at once?
>>>>>
>>>>> We had understood that MFT expects a chunk-by-chunk upload without the
>>>>> entire file having to be present.
>>>>>
>>>>>
>>>>>
>>>>> Thank you
>>>>>
>>>>> Aravind Ramalingam
>>>>>
>>>>>
>>>>>
>>>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>>
>>>>>
>>>>> Streaming is supported in GCS java client. Have a look at here [8]
>>>>>
>>>>>
>>>>>
>>>>> [8]
>>>>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello Dimuthu,
>>>>>
>>>>>
>>>>>
>>>>> As a follow-up, we explored GCS in detail. We are faced with a small
>>>>> dilemma. We found that although GCS has Java support, the functionality
>>>>> does not seem to extend to stream-based upload and download.
>>>>>
>>>>> The documentation says streaming is currently done with the gsutil
>>>>> command line tool [7], hence we are unsure whether we would be able to
>>>>> proceed with the GCS integration.
>>>>>
>>>>>
>>>>>
>>>>> Could you please give us any suggestions? We were also wondering if we
>>>>> could take up Box integration or some other provider if GCS proves
>>>>> infeasible for now.
>>>>>
>>>>>
>>>>>
>>>>> [7] https://cloud.google.com/storage/docs/streaming
>>>>>
>>>>>
>>>>>
>>>>> Thank you
>>>>>
>>>>> Aravind Ramalingam
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello Dimuthu,
>>>>>
>>>>>
>>>>>
>>>>> We had just started looking into Azure and GCS. Since Azure is done, we
>>>>> will take up and explore GCS.
>>>>>
>>>>>
>>>>>
>>>>> Thank you for the update.
>>>>>
>>>>> Thank you
>>>>>
>>>>> Aravind Ramalingam
>>>>>
>>>>>
>>>>>
>>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>>
>>>>>
>>>>> I'm not sure whether you have made any progress on Azure transport
>>>>> yet. I got a chance to look into that [6]. Let me know if you are
>>>>> working on GCS or any other so that I can plan ahead. Next I will be
>>>>> focusing on Box transport.
>>>>>
>>>>>
>>>>>
>>>>> [6]
>>>>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi  Dimuthu,
>>>>>
>>>>>
>>>>>
>>>>> Thank you for the update. We will look into it and get an idea of how
>>>>> the system works.
>>>>>
>>>>> We were hoping to try an implementation for GCS; we will also look
>>>>> into Azure.
>>>>>
>>>>>
>>>>>
>>>>> Thank you
>>>>>
>>>>> Aravind Ramalingam
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>>
>>>>> Aravind,
>>>>>
>>>>>
>>>>>
>>>>> Here [2] is the complete commit for the S3 transport implementation,
>>>>> but don't get confused by the amount of changes, as it includes both the
>>>>> transport implementation and the service backend implementations. If you
>>>>> need to implement a new transport, you need to implement a Receiver, a
>>>>> Sender and a MetadataCollector like this [3]. Then you need to add that
>>>>> resource support to the Resource service and the Secret service [4] [5].
>>>>> You can similarly do that for Azure. A sample SCP -> S3 transfer request
>>>>> looks like the one below. Hope that helps.
>>>>>
>>>>>
>>>>>
>>>>> String sourceId = *"remote-ssh-resource"*;
>>>>> String sourceToken = *"local-ssh-cred"*;
>>>>> String sourceType = *"SCP"*;
>>>>> String destId = *"s3-file"*;
>>>>> String destToken = *"s3-cred"*;
>>>>> String destType = *"S3"*;
>>>>>
>>>>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>>>>         .setSourceId(sourceId)
>>>>>         .setSourceToken(sourceToken)
>>>>>         .setSourceType(sourceType)
>>>>>         .setDestinationId(destId)
>>>>>         .setDestinationToken(destToken)
>>>>>         .setDestinationType(destType)
>>>>>         .setAffinityTransfer(*false*).build();
>>>>>
>>>>>
>>>>>
>>>>> [2]
>>>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>>
>>>>> [3]
>>>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>>
>>>>> [4]
>>>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>>
>>>>> [5]
>>>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>>
>>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>>
>>>>> There is a working S3 transport in my local copy. I will commit it
>>>>> once I test it out properly. You can follow the same pattern for any cloud
>>>>> provider whose clients support streaming IO. Streaming among different
>>>>> transfer protocols inside an Agent is discussed in the last part of
>>>>> this document [1]. Try to get the conceptual idea from that and reverse
>>>>> engineer the SCP transport.
>>>>>
>>>>>
>>>>>
>>>>> [1]
>>>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>>
>>>>>
>>>>>
>>>>> Dimuthu
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> We were looking at the existing code in the project. We could find
>>>>> implementations only for local copy and SCP.
>>>>> We were unsure how to go about supporting an external provider like S3
>>>>> or Azure, since it would require integrating with their respective clients.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>>> >
>>>>> > Hi Aravind,
>>>>> >
>>>>> > I have to catch up with the code, but you may want to look at the S3
>>>>> implementation and extend it to Azure, GCP or other cloud services like
>>>>> Box, Dropbox and so on.
>>>>> >
>>>>> > There could be many use cases, here is an idea:
>>>>> >
>>>>> > * Compute a job on a supercomputer with SCP access and push the
>>>>> outputs to a Cloud storage.
>>>>> >
>>>>> > Suresh
>>>>> >
>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Hello,
>>>>> >>
>>>>> >> We set up the MFT project on local system and tested out SCP
>>>>> transfer between JetStream VMs, we were wondering how the support can be
>>>>> extended for AWS/GCS.
>>>>> >>
>>>>> >> As per our understanding, the current implementation has support
>>>>> for two protocols i.e. local-transport and scp-transport. Would we have to
>>>>> modify/add to the code base to extend support for AWS/GCS clients?
>>>>> >>
>>>>> >> Could you please provide suggestions for this use case.
>>>>> >>
>>>>> >> Thank you
>>>>> >> Aravind Ramalingam
>>>>> >
>>>>>
>>>>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hi Dimuthu,

Thank you for the review. We will look into the changes asap. 

Thank you
Aravind Ramalingam

> On Apr 20, 2020, at 22:42, DImuthu Upeksha <di...@gmail.com> wrote:
> 
> 
> Hi Aravind,
> 
> I reviewed the PR and submitted my review comments. Please have a look at them. I didn't thoroughly go through optimizations in the code, as there are some templating fixes and cleanup required. Once you fix them, I will do a thorough review. Make sure to rebase the PR next time, as there are conflicts with other commits. Thanks for your contributions.
> 
> Dimuthu
> 
>> On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com> wrote:
>> Hello,
>> 
>> We have raised a Pull Request [12]. 
>> 
>> We look forward to your feedback. 
>> 
>> [12] https://github.com/apache/airavata-mft/pull/6
>> 
>> Thank you
>> Aravind Ramalingam
>> 
>>> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <di...@gmail.com> wrote:
>>> Sounds good. Please send a PR once it is done.
>>> 
>>> Dimuthu
>>> 
>>>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>> Hello,
>>>> 
>>>> Thank you Sudhakar and Dimuthu. We figured it out.
>>>> 
>>>> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded Md5Hash; once we converted it to hex, it matched the S3 hash.
>>>> 
>>>> We have successfully tested from S3 to GCS and back. We are yet to test with other protocols.
>>>> 
>>>> Thank you
>>>> Aravind Ramalingam  
>>>> 
>>>>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu> wrote:
>>>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Sudhakar.
>>>>> 
>>>>>  
>>>>> 
>>>>> From: DImuthu Upeksha <di...@gmail.com>
>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> Date: Sunday, April 19, 2020 at 4:46 PM
>>>>> To: Airavata Dev <de...@airavata.apache.org>
>>>>> Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>> 
>>>>>  
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Can you send a PR for what you have done so far so that I can provide feedback? One thing you have to make sure of is that the GCS Metadata collector returns the correct md5 for the file. You can download the file and run "md5sum <file name>" locally to get the actual md5 value for that file and compare it with what you see in the GCS implementation.
>>>>> 
>>>>>  
>>>>> 
>>>>> In S3, the ETag is the right property from which to fetch the md5 of the target resource. I'm not sure what the right method is for GCS. You have to try it locally and verify.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> 
>>>>> Hi Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> We are working on GCS and have certain parts working, but after a transfer completes we are facing errors with the metadata checks.
>>>>> 
>>>>>  
>>>>> 
>>>>> <image001.png>
>>>>> 
>>>>>  
>>>>> 
>>>>> We are currently testing S3 to GCS. We noticed that the S3 implementation sets the ETag as the md5sum. In our case we tried using both the ETag and the Md5Hash, but both threw the above error.
>>>>> 
>>>>>  
>>>>> 
>>>>> //S3 implementation
>>>>> 
>>>>> metadata.setMd5sum(s3Metadata.getETag());
>>>>> //GCS implementation
>>>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>>> or
>>>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>>>  
>>>>> We are confused at this point; could you please guide us?
>>>>>  
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>>> Hi Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> You don't need the file to be present in the GCS example I sent. It needs an InputStream to read the content. You can use the same approach I used in the S3 transport [9]. It's straightforward: replace the file input stream with context.getStreamBuffer().getInputStream().
>>>>> 
>>>>>  
>>>>> 
>>>>> Akshay,
>>>>> 
>>>>>  
>>>>> 
>>>>> You can't assume that the file is on the machine. It should be provided by the secret service. I found this example in [10]
>>>>> 
>>>>> Storage storage = StorageOptions.newBuilder()
>>>>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>>>>     .build()
>>>>>     .getService();
>>>>>  
>>>>> 
>>>>> It accepts an InputStream of JSON. You can programmatically load the content of that JSON into a Java String through the secret service and convert that string to an InputStream as shown in [11]
>>>>> 
>>>>>  
>>>>> 
>>>>> [9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>>> 
>>>>> [10] https://github.com/googleapis/google-cloud-java
>>>>> 
>>>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>>  
>>>>> 
>>>>> We were researching how to use the Google APIs to send files, and the first step they require is authentication. For that, the GCP API requires a credentials.json file to be present on the system.
>>>>> 
>>>>>  
>>>>> 
>>>>> Is it fine if, for now, we design the GCS transport feature such that the file is already present on the system?
>>>>> 
>>>>>  
>>>>> 
>>>>> Kind Regards
>>>>> 
>>>>> Akshay
>>>>> 
>>>>>  
>>>>> 
>>>>> From: Aravind Ramalingam <po...@gmail.com>
>>>>> Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> Date: Friday, April 17, 2020 at 00:30
>>>>> To: "dev@airavata.apache.org" <de...@airavata.apache.org>
>>>>> Subject: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>> 
>>>>>  
>>>>> 
>>>>> This message was sent from a non-IU address. Please exercise caution when clicking links or opening attachments from external sources.
>>>>> 
>>>>> 
>>>>> Hello,
>>>>> 
>>>>>  
>>>>> 
>>>>> Wouldn't the whole file have to be present in this example, converted into a single stream and uploaded at once?
>>>>> 
>>>>> We had understood that MFT expects a chunk-by-chunk upload without the entire file having to be present.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Streaming is supported in GCS java client. Have a look at here [8]
>>>>> 
>>>>>  
>>>>> 
>>>>> [8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> 
>>>>> Hello Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> As a follow-up, we explored GCS in detail. We are faced with a small dilemma. We found that although GCS has Java support, the functionality does not seem to extend to stream-based upload and download.
>>>>> 
>>>>> The documentation says streaming is currently done with the gsutil command-line tool [7], hence we are unsure whether we would be able to proceed with the GCS integration.
>>>>> 
>>>>>  
>>>>> 
>>>>> Could you please give us any suggestions? We were also wondering if we could take up Box integration or some other provider if GCS proves infeasible for now.
>>>>> 
>>>>>  
>>>>> 
>>>>> [7] https://cloud.google.com/storage/docs/streaming 
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> 
>>>>> Hello Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you for the update.
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> I'm not sure whether you have made any progress on Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other so that I can plan ahead. Next I will be focusing on Box transport.
>>>>> 
>>>>>  
>>>>> 
>>>>> [6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> 
>>>>> Hi  Dimuthu,
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you for the update. We will look into it and get an idea of how the system works.
>>>>> 
>>>>> We were hoping to try an implementation for GCS; we will also look into Azure.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thank you
>>>>> 
>>>>> Aravind Ramalingam
>>>>> 
>>>>>  
>>>>> 
>>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>>> Aravind,
>>>>> 
>>>>>  
>>>>> 
>>>>> Here [2] is the complete commit for the S3 transport implementation, but don't get confused by the amount of changes, as it includes both the transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, a Sender and a MetadataCollector like this [3]. Then you need to add that resource support to the Resource service and the Secret service [4] [5]. You can similarly do that for Azure. A sample SCP -> S3 transfer request looks like the one below. Hope that helps.
>>>>> 
>>>>>  
>>>>> 
>>>>> String sourceId = "remote-ssh-resource";
>>>>> String sourceToken = "local-ssh-cred";
>>>>> String sourceType = "SCP";
>>>>> String destId = "s3-file";
>>>>> String destToken = "s3-cred";
>>>>> String destType = "S3";
>>>>> 
>>>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>>>         .setSourceId(sourceId)
>>>>>         .setSourceToken(sourceToken)
>>>>>         .setSourceType(sourceType)
>>>>>         .setDestinationId(destId)
>>>>>         .setDestinationToken(destToken)
>>>>>         .setDestinationType(destType)
>>>>>         .setAffinityTransfer(false).build();
>>>>>  
>>>>> 
>>>>> [2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>> 
>>>>> [3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>> 
>>>>> [4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>> 
>>>>> [5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>>> There is a working S3 transport in my local copy. I will commit it once I test it out properly. You can follow the same pattern for any cloud provider whose clients support streaming IO. Streaming among different transfer protocols inside an Agent is discussed in the last part of this document [1]. Try to get the conceptual idea from that and reverse engineer the SCP transport.
>>>>> 
>>>>>  
>>>>> 
>>>>> [1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>> 
>>>>>  
>>>>> 
>>>>> Dimuthu
>>>>> 
>>>>>  
>>>>> 
>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> 
>>>>> Hello, 
>>>>> 
>>>>> We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
>>>>> We were unsure how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective clients.
>>>>> 
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>> 
>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>>> > 
>>>>> > Hi Aravind,
>>>>> > 
>>>>> > I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on. 
>>>>> > 
>>>>> > There could be many use cases, here is an idea:
>>>>> > 
>>>>> > * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage. 
>>>>> > 
>>>>> > Suresh
>>>>> > 
>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> >> 
>>>>> >> Hello,
>>>>> >> 
>>>>> >> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>>>> >> 
>>>>> >> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>>>> >> 
>>>>> >> Could you please provide suggestions for this use case. 
>>>>> >> 
>>>>> >> Thank you
>>>>> >> Aravind Ramalingam
>>>>> >

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Hi Aravind,

I reviewed the PR and submitted my review comments. Please have a look at
them. I didn't thoroughly go through the optimizations in the code, as there
are some templating fixes and clean-up required first. Once you fix those, I
will do a thorough review. Also, make sure to rebase the PR next time, as
there are conflicts from other commits. Thanks for your contributions.

Dimuthu

On Sun, Apr 19, 2020 at 10:13 PM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hello,
>
> We have raised a Pull Request [12].
>
> We look forward to your feedback.
>
> [12] https://github.com/apache/airavata-mft/pull/6
>
> Thank you
> Aravind Ramalingam
>
> On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
>> Sounds good. Please send a PR once it is done.
>>
>> Dimuthu
>>
>> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> Thank you Sudhakar and Dimuthu. We figured it out.
>>>
>>> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
>>> Md5Hash; we had to convert it to hex, after which it matched the S3 hash.
>>>
>>> We have successfully tested transfers from S3 to GCS and back; we are yet
>>> to test with other protocols.
>>>
>>> Thank you
>>> Aravind Ramalingam
>>>
>>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
>>> wrote:
>>>
>>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>>>> help?
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Sudhakar.
>>>>
>>>>
>>>>
>>>> *From: *DImuthu Upeksha <di...@gmail.com>
>>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>>>> *To: *Airavata Dev <de...@airavata.apache.org>
>>>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>>
>>>>
>>>>
>>>> Aravind,
>>>>
>>>>
>>>>
>>>> Can you send a PR for what you have done so far so that I can provide
>>>> feedback? One thing you have to make sure of is that the GCS
>>>> MetadataCollector returns the correct md5 for the file. You can download
>>>> the file and run "md5sum <file name>" locally to get the actual md5
>>>> value for that file and compare it with what you see in the GCS
>>>> implementation.
>>>>
>>>>
>>>>
>>>> In S3, the ETag is the right property for fetching the md5 of the target
>>>> resource. I'm not sure what the right method is for GCS; you have to try
>>>> it locally and verify.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Dimuthu,
>>>>
>>>>
>>>>
>>>> We are working on GCS and have certain parts working, but after a
>>>> transfer is complete we are facing errors with the metadata checks.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> We are currently testing S3 to GCS. We noticed in the S3 implementation
>>>> that Etag was set as the Md5sum. In our case we tried using both Etag and
>>>> Md5Hash, but both threw the above error.
>>>>
>>>>
>>>>
>>>> //S3 implementation
>>>>
>>>> metadata.setMd5sum(s3Metadata.getETag());
>>>>
>>>> //GCS implementation
>>>>
>>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>>
>>>> or
>>>>
>>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>>
>>>>
>>>>
>>>> We are confused at this point; could you please guide us?
>>>>
>>>>
>>>>
>>>> Thank you
>>>>
>>>> Aravind Ramalingam
>>>>
>>>>
>>>>
>>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>
>>>> Hi Aravind,
>>>>
>>>>
>>>>
>>>> You don't need the file to be present in the GCS example I sent. It
>>>> needs an InputStream to read the content from. You can use the same
>>>> approach I have used in the S3 [9] transport to do that. It's
>>>> straightforward: replace the file input stream with
>>>> context.getStreamBuffer().getInputStream().
>>>>
>>>>
>>>>
>>>> Akshay,
>>>>
>>>>
>>>>
>>>> You can't assume that the file is on the machine. It should be provided
>>>> by the secret service. I found this example in [10]:
>>>>
>>>> Storage storage = StorageOptions.newBuilder()
>>>>
>>>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>>>
>>>>     .build()
>>>>
>>>>     .getService();
>>>>
>>>>
>>>>
>>>> It accepts an InputStream of JSON. You can programmatically load the
>>>> content of that JSON into a Java String through the secret service and
>>>> convert that string to an InputStream as shown in [11].
>>>>
>>>>
>>>>
>>>> [9]
>>>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>>
>>>> [10] https://github.com/googleapis/google-cloud-java
>>>>
>>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>>
>>>>
>>>> We were researching how to use the Google APIs to send files, and the
>>>> required first step is authentication. For that, the GCP API requires a
>>>> credentials.json file to be present on the system.
>>>>
>>>>
>>>>
>>>> Is it fine if we currently design the GCS transport feature such that
>>>> the file is already present on the system?
>>>>
>>>>
>>>>
>>>> Kind Regards
>>>>
>>>> Akshay
>>>>
>>>>
>>>>
>>>> *From: *Aravind Ramalingam <po...@gmail.com>
>>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>> *Date: *Friday, April 17, 2020 at 00:30
>>>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Hello,
>>>>
>>>>
>>>>
>>>> Wouldn't the whole file have to be present in this example, converted
>>>> into a single stream and uploaded at once?
>>>>
>>>> We had understood that MFT expects a chunk-by-chunk upload, without the
>>>> entire file having to be present.
>>>>
>>>>
>>>>
>>>> Thank you
>>>>
>>>> Aravind Ramalingam
>>>>
>>>>
>>>>
>>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>>>> wrote:
>>>>
>>>> Aravind,
>>>>
>>>>
>>>>
>>>> Streaming is supported in the GCS Java client. Have a look here [8]
>>>>
>>>>
>>>>
>>>> [8]
>>>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>> Hello Dimuthu,
>>>>
>>>>
>>>>
>>>> As a follow-up, we explored GCS in detail. We are faced with a small
>>>> dilemma. We found that though GCS has Java support, the functionality
>>>> does not seem to extend to stream-based upload and download.
>>>>
>>>> The documentation says streaming is currently done with the gsutil
>>>> command line tool [7], hence we are unsure whether we can proceed with
>>>> the GCS integration.
>>>>
>>>>
>>>>
>>>> Could you please give us some suggestions? Also, we were wondering if
>>>> we could take up Box integration or some other provider if GCS proves
>>>> infeasible for now.
>>>>
>>>>
>>>>
>>>> [7] https://cloud.google.com/storage/docs/streaming
>>>>
>>>>
>>>>
>>>> Thank you
>>>>
>>>> Aravind Ramalingam
>>>>
>>>>
>>>>
>>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>> Hello Dimuthu,
>>>>
>>>>
>>>>
>>>> We had just started looking into Azure and GCS. Since Azure is done, we
>>>> will take up and explore GCS.
>>>>
>>>>
>>>>
>>>> Thank you for the update.
>>>>
>>>> Thank you
>>>>
>>>> Aravind Ramalingam
>>>>
>>>>
>>>>
>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>>>> wrote:
>>>>
>>>> Aravind,
>>>>
>>>>
>>>>
>>>> I'm not sure whether you have made any progress on Azure transport yet.
>>>> I got a chance to look into that [6]. Let me know if you are working on GCS
>>>> or any other so that I can plan ahead. Next I will be focusing on Box
>>>> transport.
>>>>
>>>>
>>>>
>>>> [6]
>>>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi  Dimuthu,
>>>>
>>>>
>>>>
>>>> Thank you for the update. We will look into it and get an idea of how
>>>> the system works.
>>>>
>>>> We were hoping to try an implementation for GCS; we will also look into
>>>> Azure.
>>>>
>>>>
>>>>
>>>> Thank you
>>>>
>>>> Aravind Ramalingam
>>>>
>>>>
>>>>
>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>
>>>> Aravind,
>>>>
>>>>
>>>>
>>>> Here [2] is the complete commit for the S3 transport implementation, but
>>>> don't get confused by the amount of changes, as it includes both the
>>>> transport implementation and the service backend implementations. If you
>>>> need to implement a new transport, you need to implement a Receiver, a
>>>> Sender, and a MetadataCollector like this [3]. Then you need to add
>>>> support for that resource to the Resource service and the Secret service
>>>> [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer
>>>> request looks like the one below. Hope that helps.
>>>>
>>>>
>>>>
>>>> String sourceId = *"remote-ssh-resource"*;
>>>> String sourceToken = *"local-ssh-cred"*;
>>>> String sourceType = *"SCP"*;
>>>> String destId = *"s3-file"*;
>>>> String destToken = *"s3-cred"*;
>>>> String destType = *"S3"*;
>>>>
>>>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>>>         .setSourceId(sourceId)
>>>>         .setSourceToken(sourceToken)
>>>>         .setSourceType(sourceType)
>>>>         .setDestinationId(destId)
>>>>         .setDestinationToken(destToken)
>>>>         .setDestinationType(destType)
>>>>         .setAffinityTransfer(*false*).build();
>>>>
>>>>
>>>>
>>>> [2]
>>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>
>>>> [3]
>>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>
>>>> [4]
>>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>
>>>> [5]
>>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>
>>>> There is a working S3 transport in my local copy. Will commit it once I
>>>> test it out properly. You can follow the same pattern for any cloud
>>>> provider that has clients with streaming IO. Streaming among different
>>>> transfer protocols inside an Agent has been discussed in the last part
>>>> of this [1] document. Try to get the conceptual idea from that and
>>>> reverse engineer the SCP transport.
>>>>
>>>>
>>>>
>>>> [1]
>>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>
>>>>
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>> Hello,
>>>>
>>>> We were looking at the existing code in the project. We could find
>>>> implementations only for local copy and SCP.
>>>> We were confused about how to go about supporting an external provider
>>>> like S3 or Azure, since it would require integrating with their
>>>> respective clients.
>>>>
>>>> Thank you
>>>> Aravind Ramalingam
>>>>
>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>> >
>>>> > Hi Aravind,
>>>> >
>>>> > I have to catch up with the code, but you may want to look at the S3
>>>> implementation and extend it to Azure, GCP or other cloud services like
>>>> Box, Dropbox and so on.
>>>> >
>>>> > There could be many use cases, here is an idea:
>>>> >
>>>> > * Compute a job on a supercomputer with SCP access and push the
>>>> outputs to a Cloud storage.
>>>> >
>>>> > Suresh
>>>> >
>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> Hello,
>>>> >>
>>>> >> We set up the MFT project on local system and tested out SCP
>>>> transfer between JetStream VMs, we were wondering how the support can be
>>>> extended for AWS/GCS.
>>>> >>
>>>> >> As per our understanding, the current implementation has support for
>>>> two protocols i.e. local-transport and scp-transport. Would we have to
>>>> modify/add to the code base to extend support for AWS/GCS clients?
>>>> >>
>>>> >> Could you please provide suggestions for this use case.
>>>> >>
>>>> >> Thank you
>>>> >> Aravind Ramalingam
>>>> >
>>>>
>>>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello,

We have raised a Pull Request [12].

We look forward to your feedback.

[12] https://github.com/apache/airavata-mft/pull/6

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 8:32 PM DImuthu Upeksha <di...@gmail.com>
wrote:

> Sounds good. Please send a PR once it is done.
>
> Dimuthu
>
> On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
>> Hello,
>>
>> Thank you Sudhakar and Dimuthu. We figured it out.
>>
>> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
>> Md5Hash; we had to convert it to hex, after which it matched the S3 hash.
>>
>> We have successfully tested transfers from S3 to GCS and back; we are yet
>> to test with other protocols.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
>> wrote:
>>
>>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>>> help?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Sudhakar.
>>>
>>>
>>>
>>> *From: *DImuthu Upeksha <di...@gmail.com>
>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>>> *To: *Airavata Dev <de...@airavata.apache.org>
>>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>>
>>>
>>>
>>> Aravind,
>>>
>>>
>>>
>>> Can you send a PR for what you have done so far so that I can provide
>>> feedback? One thing you have to make sure of is that the GCS
>>> MetadataCollector returns the correct md5 for the file. You can download
>>> the file and run "md5sum <file name>" locally to get the actual md5
>>> value for that file and compare it with what you see in the GCS
>>> implementation.
>>>
>>>
>>>
>>> In S3, the ETag is the right property for fetching the md5 of the target
>>> resource. I'm not sure what the right method is for GCS; you have to try
>>> it locally and verify.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hi Dimuthu,
>>>
>>>
>>>
>>> We are working on GCS and have certain parts working, but after a
>>> transfer is complete we are facing errors with the metadata checks.
>>>
>>>
>>>
>>>
>>>
>>> We are currently testing S3 to GCS. We noticed in the S3 implementation
>>> that Etag was set as the Md5sum. In our case we tried using both Etag and
>>> Md5Hash, but both threw the above error.
>>>
>>>
>>>
>>> //S3 implementation
>>>
>>> metadata.setMd5sum(s3Metadata.getETag());
>>>
>>> //GCS implementation
>>>
>>> metadata.setMd5sum(gcsMetadata.getEtag());
>>>
>>> or
>>>
>>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>>
>>>
>>>
>>> We are confused at this point; could you please guide us?
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>> Hi Aravind,
>>>
>>>
>>>
>>> You don't need the file to be present in the GCS example I sent. It
>>> needs an InputStream to read the content from. You can use the same
>>> approach I have used in the S3 [9] transport to do that. It's
>>> straightforward: replace the file input stream with
>>> context.getStreamBuffer().getInputStream().
>>>
>>>
>>>
>>> Akshay,
>>>
>>>
>>>
>>> You can't assume that the file is on the machine. It should be provided
>>> by the secret service. I found this example in [10]:
>>>
>>> Storage storage = StorageOptions.newBuilder()
>>>
>>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>>
>>>     .build()
>>>
>>>     .getService();
>>>
>>>
>>>
>>> It accepts an InputStream of JSON. You can programmatically load the
>>> content of that JSON into a Java String through the secret service and
>>> convert that string to an InputStream as shown in [11].
>>>
>>>
>>>
>>> [9]
>>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>>
>>> [10] https://github.com/googleapis/google-cloud-java
>>>
>>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>>> wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> We were researching how to use the Google APIs to send files, and the
>>> required first step is authentication. For that, the GCP API requires a
>>> credentials.json file to be present on the system.
>>>
>>>
>>>
>>> Is it fine if we currently design the GCS transport feature such that
>>> the file is already present on the system?
>>>
>>>
>>>
>>> Kind Regards
>>>
>>> Akshay
>>>
>>>
>>>
>>> *From: *Aravind Ramalingam <po...@gmail.com>
>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>> *Date: *Friday, April 17, 2020 at 00:30
>>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>>
>>>
>>>
>>>
>>>
>>> Hello,
>>>
>>>
>>>
>>> Wouldn't the whole file have to be present in this example, converted
>>> into a single stream and uploaded at once?
>>>
>>> We had understood that MFT expects a chunk-by-chunk upload, without the
>>> entire file having to be present.
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>>> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> Streaming is supported in the GCS Java client. Have a look here [8]
>>>
>>>
>>>
>>> [8]
>>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello Dimuthu,
>>>
>>>
>>>
>>> As a follow-up, we explored GCS in detail. We are faced with a small
>>> dilemma. We found that though GCS has Java support, the functionality
>>> does not seem to extend to stream-based upload and download.
>>>
>>> The documentation says streaming is currently done with the gsutil
>>> command line tool [7], hence we are unsure whether we can proceed with
>>> the GCS integration.
>>>
>>>
>>>
>>> Could you please give us some suggestions? Also, we were wondering if
>>> we could take up Box integration or some other provider if GCS proves
>>> infeasible for now.
>>>
>>>
>>>
>>> [7] https://cloud.google.com/storage/docs/streaming
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello Dimuthu,
>>>
>>>
>>>
>>> We had just started looking into Azure and GCS. Since Azure is done, we
>>> will take up and explore GCS.
>>>
>>>
>>>
>>> Thank you for the update.
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>>> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> I'm not sure whether you have made any progress on Azure transport yet.
>>> I got a chance to look into that [6]. Let me know if you are working on GCS
>>> or any other so that I can plan ahead. Next I will be focusing on Box
>>> transport.
>>>
>>>
>>>
>>> [6]
>>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hi  Dimuthu,
>>>
>>>
>>>
>>> Thank you for the update. We will look into it and get an idea of how
>>> the system works.
>>>
>>> We were hoping to try an implementation for GCS; we will also look into
>>> Azure.
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> Here [2] is the complete commit for the S3 transport implementation, but
>>> don't get confused by the amount of changes, as it includes both the
>>> transport implementation and the service backend implementations. If you
>>> need to implement a new transport, you need to implement a Receiver, a
>>> Sender, and a MetadataCollector like this [3]. Then you need to add
>>> support for that resource to the Resource service and the Secret service
>>> [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer
>>> request looks like the one below. Hope that helps.
>>>
>>>
>>>
>>> String sourceId = *"remote-ssh-resource"*;
>>> String sourceToken = *"local-ssh-cred"*;
>>> String sourceType = *"SCP"*;
>>> String destId = *"s3-file"*;
>>> String destToken = *"s3-cred"*;
>>> String destType = *"S3"*;
>>>
>>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>>         .setSourceId(sourceId)
>>>         .setSourceToken(sourceToken)
>>>         .setSourceType(sourceType)
>>>         .setDestinationId(destId)
>>>         .setDestinationToken(destToken)
>>>         .setDestinationType(destType)
>>>         .setAffinityTransfer(*false*).build();
>>>
>>>
>>>
>>> [2]
>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>
>>> [3]
>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>
>>> [4]
>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>
>>> [5]
>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>> There is a working S3 transport in my local copy. Will commit it once I
>>> test it out properly. You can follow the same pattern for any cloud
>>> provider that has clients with streaming IO. Streaming among different
>>> transfer protocols inside an Agent has been discussed in the last part
>>> of this [1] document. Try to get the conceptual idea from that and
>>> reverse engineer the SCP transport.
>>>
>>>
>>>
>>> [1]
>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>
>>>
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello,
>>>
>>> We were looking at the existing code in the project. We could find
>>> implementations only for local copy and SCP.
>>> We were confused about how to go about supporting an external provider
>>> like S3 or Azure, since it would require integrating with their
>>> respective clients.
>>>
>>> Thank you
>>> Aravind Ramalingam
>>>
>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>> >
>>> > Hi Aravind,
>>> >
>>> > I have to catch up with the code, but you may want to look at the S3
>>> implementation and extend it to Azure, GCP or other cloud services like
>>> Box, Dropbox and so on.
>>> >
>>> > There could be many use cases, here is an idea:
>>> >
>>> > * Compute a job on a supercomputer with SCP access and push the
>>> outputs to a Cloud storage.
>>> >
>>> > Suresh
>>> >
>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> We set up the MFT project on local system and tested out SCP transfer
>>> between JetStream VMs, we were wondering how the support can be extended
>>> for AWS/GCS.
>>> >>
>>> >> As per our understanding, the current implementation has support for
>>> two protocols i.e. local-transport and scp-transport. Would we have to
>>> modify/add to the code base to extend support for AWS/GCS clients?
>>> >>
>>> >> Could you please provide suggestions for this use case.
>>> >>
>>> >> Thank you
>>> >> Aravind Ramalingam
>>> >
>>>
>>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Sounds good. Please send a PR once it is done.

Dimuthu

On Sun, Apr 19, 2020 at 7:23 PM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hello,
>
> Thank you Sudhakar and Dimuthu. We figured it out.
>
> As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
> Md5Hash; we had to convert it to hex, after which it matched the S3 hash.
>
> We have successfully tested transfers from S3 to GCS and back; we are yet
> to test with other protocols.
>
> Thank you
> Aravind Ramalingam
>
> On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
> wrote:
>
>> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
>> help?
>>
>>
>>
>> Thanks,
>>
>> Sudhakar.
>>
>>
>>
>> *From: *DImuthu Upeksha <di...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Sunday, April 19, 2020 at 4:46 PM
>> *To: *Airavata Dev <de...@airavata.apache.org>
>> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> Aravind,
>>
>>
>>
>> Can you send a PR for what you have done so far so that I can provide
>> feedback? One thing you have to make sure of is that the GCS
>> MetadataCollector returns the correct md5 for the file. You can download
>> the file and run "md5sum <file name>" locally to get the actual md5
>> value for that file and compare it with what you see in the GCS
>> implementation.
>>
>>
>>
>> In S3, the ETag is the right property for fetching the md5 of the target
>> resource. I'm not sure what the right method is for GCS; you have to try
>> it locally and verify.
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi Dimuthu,
>>
>>
>>
>> We are working on GCS and have certain parts working, but after a
>> transfer is complete we are facing errors with the metadata checks.
>>
>>
>>
>>
>>
>> We are currently testing S3 to GCS. We noticed in the S3 implementation
>> that Etag was set as the Md5sum. In our case we tried using both Etag and
>> Md5Hash, but both threw the above error.
>>
>>
>>
>> //S3 implementation
>>
>> metadata.setMd5sum(s3Metadata.getETag());
>>
>> //GCS implementation
>>
>> metadata.setMd5sum(gcsMetadata.getEtag());
>>
>> or
>>
>> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>>
>>
>>
>> We are confused at this point; could you please guide us?
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Hi Aravind,
>>
>>
>>
>> You don't need the file to be present in the GCS example I sent. It
>> needs an InputStream to read the content from. You can use the same
>> approach I have used in the S3 [9] transport to do that. It's
>> straightforward: replace the file input stream with
>> context.getStreamBuffer().getInputStream().
>>
>>
>>
>> Akshay,
>>
>>
>>
>> You can't assume that the file is on the machine. It should be provided
>> by the secret service. I found this example in [10]:
>>
>> Storage storage = StorageOptions.newBuilder()
>>
>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>
>>     .build()
>>
>>     .getService();
>>
>>
>>
>> It accepts an InputStream of JSON. You can programmatically load the
>> content of that JSON into a Java String through the secret service and
>> convert that string to an InputStream as shown in [11].
>>
>>
>>
>> [9]
>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>>
>> [10] https://github.com/googleapis/google-cloud-java
>>
>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>> wrote:
>>
>> Hello,
>>
>>
>>
>> We were researching how to use the Google APIs to send files, and the
>> required first step is authentication. For that, the GCP API requires a
>> credentials.json file to be present on the system.
>>
>>
>>
>> Is it fine if we currently design the GCS transport feature such that
>> the file is already present on the system?
>>
>>
>>
>> Kind Regards
>>
>> Akshay
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Friday, April 17, 2020 at 00:30
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>>
>>
>> Hello,
>>
>>
>>
>> Wouldn't the whole file have to be present in this example, converted
>> into a single stream and uploaded at once?
>>
>> We had understood that MFT expects a chunk-by-chunk upload, without the
>> entire file having to be present.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> Streaming is supported in the GCS Java client. Have a look here [8]
>>
>>
>>
>> [8]
>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> As a follow-up, we explored GCS in detail. We are faced with a small
>> dilemma. We found that though GCS has Java support, the functionality
>> does not seem to extend to stream-based upload and download.
>>
>> The documentation says streaming is currently done with the gsutil
>> command line tool [7], hence we are unsure whether we can proceed with
>> the GCS integration.
>>
>>
>>
>> Could you please give us some suggestions? Also, we were wondering if
>> we could take up Box integration or some other provider if GCS proves
>> infeasible for now.
>>
>>
>>
>> [7] https://cloud.google.com/storage/docs/streaming
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> We had just started looking into Azure and GCS. Since Azure is done, we
>> will take up and explore GCS.
>>
>>
>>
>> Thank you for the update.
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> I'm not sure whether you have made any progress on Azure transport yet. I
>> got a chance to look into that [6]. Let me know if you are working on GCS
>> or any other so that I can plan ahead. Next I will be focusing on Box
>> transport.
>>
>>
>>
>> [6]
>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi  Dimuthu,
>>
>>
>>
>> Thank you for the update. We will look into it and get an idea of how
>> the system works.
>>
>> We were hoping to try an implementation for GCS; we will also look into
>> Azure.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Aravind,
>>
>>
>>
>> Here [2] is the complete commit for the S3 transport implementation, but
>> don't get confused by the amount of changes, as it includes both the
>> transport implementation and the service backend implementations. If you
>> need to implement a new transport, you need to implement a Receiver, a
>> Sender, and a MetadataCollector like this [3]. Then you need to add
>> support for that resource to the Resource service and the Secret service
>> [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer
>> request looks like the one below. Hope that helps.
>>
>>
>>
>> String sourceId = *"remote-ssh-resource"*;
>> String sourceToken = *"local-ssh-cred"*;
>> String sourceType = *"SCP"*;
>> String destId = *"s3-file"*;
>> String destToken = *"s3-cred"*;
>> String destType = *"S3"*;
>>
>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>         .setSourceId(sourceId)
>>         .setSourceToken(sourceToken)
>>         .setSourceType(sourceType)
>>         .setDestinationId(destId)
>>         .setDestinationToken(destToken)
>>         .setDestinationType(destType)
>>         .setAffinityTransfer(*false*).build();
>>
>>
>>
>> [2]
>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>
>> [3]
>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>
>> [4]
>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>
>> [5]
>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>>
>>
>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> There is a working S3 transport in my local copy. Will commit it once I
>> test it out properly. You can follow the same pattern for any cloud
>> provider that has clients with streaming IO. Streaming among different
>> transfer protocols inside an Agent has been discussed in the last part
>> of this [1] document. Try to get the conceptual idea from that and
>> reverse engineer the SCP transport.
>>
>>
>>
>> [1]
>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> We were looking at the existing code in the project. We could find
>> implementations only for local copy and SCP.
>> We were confused about how to go about supporting an external provider
>> like S3 or Azure, since it would require integrating with their
>> respective clients.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>> >
>> > Hi Aravind,
>> >
>> > I have to catch up with the code, but you may want to look at the S3
>> implementation and extend it to Azure, GCP or other cloud services like
>> Box, Dropbox and so on.
>> >
>> > There could be many use cases, here is an idea:
>> >
>> > * Compute a job on a supercomputer with SCP access and push the outputs
>> to a Cloud storage.
>> >
>> > Suresh
>> >
>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>> wrote:
>> >>
>> >> Hello,
>> >>
>> >> We set up the MFT project on local system and tested out SCP transfer
>> between JetStream VMs, we were wondering how the support can be extended
>> for AWS/GCS.
>> >>
>> >> As per our understanding, the current implementation has support for
>> two protocols i.e. local-transport and scp-transport. Would we have to
>> modify/add to the code base to extend support for AWS/GCS clients?
>> >>
>> >> Could you please provide suggestions for this use case.
>> >>
>> >> Thank you
>> >> Aravind Ramalingam
>> >
>>
>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello,

Thank you Sudhakar and Dimuthu. We figured it out.

As Sudhakar pointed out with the issue link, GCS returns a Base64-encoded
Md5Hash; we had to convert it to hex, after which it matched the S3 hash.
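For reference, a minimal sketch of the conversion we ended up doing (the
variable names here are illustrative, not the exact code from the PR):

// Requires java.util.Base64.
// GCS reports the MD5 as a Base64-encoded string, while S3's ETag is hex,
// so decode the Base64 value and re-encode the raw bytes as lowercase hex.
byte[] md5Bytes = Base64.getDecoder().decode(gcsMetadata.getMd5Hash());
StringBuilder hex = new StringBuilder();
for (byte b : md5Bytes) {
    hex.append(String.format("%02x", b));
}
metadata.setMd5sum(hex.toString());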

We have successfully tested transfers from S3 to GCS and back; we are yet to
test with other protocols.

Thank you
Aravind Ramalingam

On Sun, Apr 19, 2020 at 4:58 PM Pamidighantam, Sudhakar <pa...@iu.edu>
wrote:

> https://github.com/googleapis/google-cloud-java/issues/4117 Does this
> help?
>
>
>
> Thanks,
>
> Sudhakar.
>
>
>
> *From: *DImuthu Upeksha <di...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Sunday, April 19, 2020 at 4:46 PM
> *To: *Airavata Dev <de...@airavata.apache.org>
> *Subject: *Re: [External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> Aravind,
>
>
>
> Can you send a PR for what you have done so far so that I can provide
> feedback? One thing you have to make sure of is that the GCS
> MetadataCollector returns the correct md5 for the file. You can download
> the file and run "md5sum <file name>" locally to get the actual md5
> value for that file and compare it with what you see in the GCS
> implementation.
>
>
>
> In S3, the ETag is the right property for fetching the md5 of the target
> resource. I'm not sure what the right method is for GCS; you have to try
> it locally and verify.
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi Dimuthu,
>
>
>
> We are working on GCS and have certain parts working, but after a
> transfer is complete we are facing errors with the metadata checks.
>
>
>
>
>
> We are currently testing S3 to GCS. We noticed in the S3 implementation
> that Etag was set as the Md5sum. In our case we tried using both Etag and
> Md5Hash, but both threw the above error.
>
>
>
> //S3 implementation
>
> metadata.setMd5sum(s3Metadata.getETag());
>
> //GCS implementation
>
> metadata.setMd5sum(gcsMetadata.getEtag());
>
> or
>
> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>
>
>
> We are confused at this point; could you please guide us?
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> Hi Aravind,
>
>
>
> You don't need the file to be present in the GCS example I sent. It
> needs an InputStream to read the content from. You can use the same
> approach I have used in the S3 [9] transport to do that. It's
> straightforward: replace the file input stream with
> context.getStreamBuffer().getInputStream().
>
>
>
> Akshay,
>
>
>
> You can't assume that the file is on the machine. It should be provided
> by the secret service. I found this example in [10]:
>
> Storage storage = StorageOptions.newBuilder()
>
>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>
>     .build()
>
>     .getService();
>
>
>
> It accepts an InputStream of JSON. You can programmatically load the
> content of that JSON into a Java String through the secret service and
> convert that string to an InputStream as shown in [11].
>
>
>
> [9]
> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>
> [10] https://github.com/googleapis/google-cloud-java
>
> [11] https://www.baeldung.com/convert-string-to-input-stream
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
> Hello,
>
>
>
> We were researching how to use the Google APIs to send files, and the
> required first step is authentication. For that, the GCP API requires a
> credentials.json file to be present on the system.
>
>
>
> Is it fine if we currently design the GCS transport feature such that
> the file is already present on the system?
>
>
>
> Kind Regards
>
> Akshay
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Friday, April 17, 2020 at 00:30
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
>
>
> Hello,
>
>
>
> Wouldn't the whole file have to be present in this example, converted
> into a single stream and uploaded at once?
>
> We had understood that MFT expects a chunk-by-chunk upload, without the
> entire file having to be present.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Streaming is supported in the GCS Java client. Have a look here [8]
>
>
>
> [8]
> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> As a follow-up, we explored GCS in detail. We are faced with a small
> dilemma. We found that though GCS has Java support, the functionality
> does not seem to extend to stream-based upload and download.
>
> The documentation says streaming is currently done with the gsutil
> command line tool [7], hence we are unsure whether we can proceed with
> the GCS integration.
>
>
>
> Could you please give us some suggestions? Also, we were wondering if
> we could take up Box integration or some other provider if GCS proves
> infeasible for now.
>
>
>
> [7] https://cloud.google.com/storage/docs/streaming
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> We had just started looking into Azure and GCS. Since Azure is done, we
> will take up and explore GCS.
>
>
>
> Thank you for the update.
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> I'm not sure whether you have made any progress on Azure transport yet. I
> got a chance to look into that [6]. Let me know if you are working on GCS
> or any other so that I can plan ahead. Next I will be focusing on Box
> transport.
>
>
>
> [6]
> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi  Dimuthu,
>
>
>
> Thank you for the update. We will look into it and get an idea of how
> the system works.
>
> We were hoping to try an implementation for GCS; we will also look into
> Azure.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Here [2] is the complete commit for the S3 transport implementation, but
> don't get confused by the amount of changes, as it includes both the
> transport implementation and the service backend implementations. If you
> need to implement a new transport, you need to implement a Receiver, a
> Sender, and a MetadataCollector like this [3]. Then you need to add
> support for that resource to the Resource service and the Secret service
> [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer
> request looks like the one below. Hope that helps.
>
>
>
> String sourceId = *"remote-ssh-resource"*;
> String sourceToken = *"local-ssh-cred"*;
> String sourceType = *"SCP"*;
> String destId = *"s3-file"*;
> String destToken = *"s3-cred"*;
> String destType = *"S3"*;
>
> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>         .setSourceId(sourceId)
>         .setSourceToken(sourceToken)
>         .setSourceType(sourceType)
>         .setDestinationId(destId)
>         .setDestinationToken(destToken)
>         .setDestinationType(destType)
>         .setAffinityTransfer(*false*).build();
>
>
>
> [2]
> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>
> [3]
> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>
> [4]
> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>
> [5]
> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>
>
>
> Thanks
>
> Dimuthu
>
>
>
>
>
> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> There is a working S3 transport in my local copy. Will commit it once I
> test it out properly. You can follow the same pattern for any cloud
> provider that has clients with streaming IO. Streaming among different
> transfer protocols inside an Agent has been discussed in the last part
> of this [1] document. Try to get the conceptual idea from that and
> reverse engineer the SCP transport.
>
>
>
> [1]
> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>
>
>
> Dimuthu
>
>
>
> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
> We were looking at the existing code in the project. We could find
> implementations only for local copy and SCP.
> We were confused about how to go about supporting an external provider
> like S3 or Azure, since it would require integrating with their
> respective clients.
>
> Thank you
> Aravind Ramalingam
>
> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
> >
> > Hi Aravind,
> >
> > I have to catch up with the code, but you may want to look at the S3
> implementation and extend it to Azure, GCP or other cloud services like
> Box, Dropbox and so on.
> >
> > There could be many use cases, here is an idea:
> >
> > * Compute a job on a supercomputer with SCP access and push the outputs
> to a Cloud storage.
> >
> > Suresh
> >
> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> We set up the MFT project on local system and tested out SCP transfer
> between JetStream VMs, we were wondering how the support can be extended
> for AWS/GCS.
> >>
> >> As per our understanding, the current implementation has support for
> two protocols i.e. local-transport and scp-transport. Would we have to
> modify/add to the code base to extend support for AWS/GCS clients?
> >>
> >> Could you please provide suggestions for this use case.
> >>
> >> Thank you
> >> Aravind Ramalingam
> >
>
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Pamidighantam, Sudhakar" <pa...@iu.edu>.
https://github.com/googleapis/google-cloud-java/issues/4117 Does this help?

Thanks,
Sudhakar.

From: DImuthu Upeksha <di...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Sunday, April 19, 2020 at 4:46 PM
To: Airavata Dev <de...@airavata.apache.org>
Subject: Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Aravind,

Can you send a PR for what you have done so far so that I can provide feedback? One thing you have to make sure of is that the GCS MetadataCollector returns the correct md5 for the file. You can download the file and run "md5sum <file name>" locally to get the actual md5 value for that file and compare it with what you see in the GCS implementation.

In S3, the ETag is the right property for fetching the md5 of the target resource. I'm not sure what the right method is for GCS; you have to try it locally and verify.

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi Dimuthu,

We are working on GCS and have certain parts working, but after a transfer is complete we are facing errors with the metadata checks.

[image: image.png]

We are currently testing S3 to GCS. We noticed in the S3 implementation that Etag was set as the Md5sum. In our case we tried using both Etag and Md5Hash, but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());



We are confused at this point; could you please guide us?



Thank you

Aravind Ramalingam

On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com>> wrote:
Hi Aravind,

You don't need the file to be present in the GCS example I sent. It needs an InputStream to read the content from. You can use the same approach I have used in the S3 [9] transport to do that. It's straightforward: replace the file input stream with context.getStreamBuffer().getInputStream().
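As a rough sketch of what a GCSSender could then look like (bucketName,
resourcePath, and the storage client here are assumptions, not the final
implementation):

// Requires com.google.cloud.WriteChannel, com.google.cloud.storage.*,
// java.io.InputStream and java.nio.ByteBuffer.
InputStream in = context.getStreamBuffer().getInputStream();
BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of(bucketName, resourcePath)).build();
try (WriteChannel writer = storage.writer(blobInfo)) {
    byte[] buf = new byte[8 * 1024];
    int read;
    while ((read = in.read(buf)) != -1) {
        // Stream the MFT buffer into GCS chunk by chunk; no local file needed.
        writer.write(ByteBuffer.wrap(buf, 0, read));
    }
}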

Akshay,

You can't assume that the file is on the machine. It should be provided by the secret service. I found this example in [10]:

Storage storage = StorageOptions.newBuilder()

    .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))

    .build()

    .getService();

It accepts an InputStream of JSON. You can programmatically load the content of that JSON into a Java String through the secret service and convert that string to an InputStream as shown in [11]

[9] https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream
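
A minimal sketch of that wiring (the getCredentialsJson() accessor is
hypothetical; use whatever field the secret service actually exposes):

// Requires java.io.ByteArrayInputStream, java.nio.charset.StandardCharsets
// and com.google.auth.oauth2.ServiceAccountCredentials.
String credentialJson = gcsSecret.getCredentialsJson(); // loaded via the secret service
Storage storage = StorageOptions.newBuilder()
    .setCredentials(ServiceAccountCredentials.fromStream(
        new ByteArrayInputStream(credentialJson.getBytes(StandardCharsets.UTF_8))))
    .build()
    .getService();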

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>> wrote:
Hello,

We were researching how to use the Google APIs to send files, and the required first step is authentication. For that, the GCP API requires a credentials.json file to be present on the system.

Is it fine if we currently design the GCS transport feature such that the file is already present on the system?

Kind Regards
Akshay

From: Aravind Ramalingam <po...@gmail.com>>
Reply-To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Date: Friday, April 17, 2020 at 00:30
To: "dev@airavata.apache.org<ma...@airavata.apache.org>" <de...@airavata.apache.org>>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support


Hello,

Wouldn't the whole file have to be present in this example, converted into a single stream and uploaded at once?
We had understood that MFT expects a chunk-by-chunk upload, without the entire file having to be present.

Thank you
Aravind Ramalingam

On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Streaming is supported in the GCS Java client. Have a look here [8]

[8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
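
The core of that sample, roughly (client, bucketName and objectMetadata are
set up earlier in the linked file):

// Wrap any InputStream in an InputStreamContent; the client then streams
// the content in chunks instead of requiring the whole file up front.
InputStreamContent mediaContent = new InputStreamContent(contentType, inputStream);
Storage.Objects.Insert insertObject =
    client.objects().insert(bucketName, objectMetadata, mediaContent);
insertObject.execute();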

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

As a follow-up, we explored GCS in detail. We are faced with a small dilemma. We found that though GCS has Java support, the functionality does not seem to extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command line tool [7], hence we are unsure whether we can proceed with the GCS integration.

Could you please give us some suggestions? Also, we were wondering if we could take up Box integration or some other provider if GCS proves infeasible for now.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam

On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

I'm not sure whether you have made any progress on Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other so that I can plan ahead. Next I will be focusing on Box transport.

[6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi  Dimuthu,

Thank you for the update. We will look into it and get an idea of how the system works.
We were hoping to try an implementation for GCS; we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Here [2] is the complete commit for the S3 transport implementation, but don't get confused by the amount of changes, as it includes both the transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, a Sender, and a MetadataCollector like this [3]. Then you need to add support for that resource to the Resource service and the Secret service [4] [5]. You can do the same for Azure. A sample SCP -> S3 transfer request looks like the one below. Hope that helps.


String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com>> wrote:
There is a working S3 transport in my local copy. Will commit it once I test it out properly. You can follow the same pattern for any cloud provider that has clients with streaming IO. Streaming among different transfer protocols inside an Agent has been discussed in the last part of this [1] document. Try to get the conceptual idea from that and reverse engineer the SCP transport.

[1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
We were confused about how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Aravind,

Can you send a PR for what you have done so far so that I can provide
feedback? One thing you have to make sure of is that the GCS metadata
collector returns the correct md5 for the file. You can download the file
and run "md5sum <file name>" locally to get the actual md5 value for that
file and compare it with what you see in the GCS implementation.

In S3, the ETag is the right property for fetching the md5 of the target
resource. I'm not sure what the right method is for GCS. You have to try it
locally and verify.
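
One detail worth checking (an observation from the GCS JSON API documentation, not something verified against this code): GCS reports md5Hash base64-encoded, while "md5sum" prints hex, so a direct string comparison of the two will never match. A minimal conversion sketch, assuming gcsMetadata exposes getMd5Hash():

// Hypothetical: decode the base64 md5Hash into the hex form md5sum prints.
byte[] raw = java.util.Base64.getDecoder().decode(gcsMetadata.getMd5Hash());
StringBuilder hex = new StringBuilder();
for (byte b : raw) {
    hex.append(String.format("%02x", b));
}
metadata.setMd5sum(hex.toString());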

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 3:32 PM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hi Dimuthu,
>
> We are working on GCS and we got certain parts working, but after a
> transfer is compete we are facing errors with the metadata checks.
>
> [image: image.png]
>
> We are currently testing S3 to GCS. We noticed in the S3 implementation
> that Etag was set as the Md5sum. In our case we tried using both Etag and
> Md5Hash, but both threw the above error.
>
> //S3 implementation
>
> metadata.setMd5sum(s3Metadata.getETag());
>
> //GCS implementation
>
> metadata.setMd5sum(gcsMetadata.getEtag());
>
> or
>
> metadata.setMd5sum(gcsMetadata.getMd5Hash());
>
>
> We are confused at this point, could you please guide us?
>
>
> Thank you
>
> Aravind Ramalingam
>
>
> On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
>> Hi Aravind,
>>
>> You don't need the file to be present in the gcs example I sent. It needs
>> an Input Stream to read the content. You can use the same approach I have
>> done in S3 [9] transport to do that. It's straightforward. Replace file
>> input stream with context.getStreamBuffer().getInputStream().
>>
>> Akshay,
>>
>> You can't assume that file is on the machine. It should be provided from
>> the secret service. I found this example in [10]
>>
>> Storage storage = StorageOptions.newBuilder()
>>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>>     .build()
>>     .getService();
>>
>>
>> It accepts a InputStream of json. You can programmatically load the
>> content of that json into a java String through secret service and convert
>> that string to a Input Stream as shown in [11]
>>
>> [9]
>> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
>> [10] https://github.com/googleapis/google-cloud-java
>> [11] https://www.baeldung.com/convert-string-to-input-stream
>>
>> Thanks
>> Dimuthu
>>
>> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu>
>> wrote:
>>
>>> Hello,
>>>
>>>
>>>
>>> We were searching about how to use google API’s to send files, but it’s
>>> required the first steps to be authentication. In that, the GCP API
>>> requires a credentials.json file to be present in the system.
>>>
>>>
>>>
>>> Is it fine if we currently design the GCS transport feature such that
>>> the file is already present in the system ?
>>>
>>>
>>>
>>> Kind Regards
>>>
>>> Akshay
>>>
>>>
>>>
>>> *From: *Aravind Ramalingam <po...@gmail.com>
>>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>> *Date: *Friday, April 17, 2020 at 00:30
>>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>>
>>>
>>>
>>> This message was sent from a non-IU address. Please exercise caution
>>> when clicking links or opening attachments from external sources.
>>>
>>>
>>> Hello,
>>>
>>>
>>>
>>> Wouldn't it be that in this example the whole file has to be present and
>>> converted into a single stream and uploaded at once?
>>>
>>> We had understood that MFT expects it to be chunk by chunk upload
>>> without having to have the entire file present.
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>>> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> Streaming is supported in GCS java client. Have a look at here [8]
>>>
>>>
>>>
>>> [8]
>>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello Dimuthu,
>>>
>>>
>>>
>>> As a followup, we explored GCS in detail. We are faced with a small
>>> dilemma. We found that though GCS has a Java support, but the functionality
>>> does not seem to extend to a stream based upload and download.
>>>
>>> The documentation says it is currently done with a gsutil command line
>>> library [7], hence we are confused if we would be able to proceed the GCS
>>> integration.
>>>
>>>
>>>
>>> Could you please give us any suggestions? Also we were wondering if we
>>> could maybe take up Box integration or some other provider if GCS proves
>>> not possible currently.
>>>
>>>
>>>
>>> [7] https://cloud.google.com/storage/docs/streaming
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello Dimuthu,
>>>
>>>
>>>
>>> We had just started looking into Azure and GCS. Since Azure is done we
>>> will take up and explore GCS.
>>>
>>>
>>>
>>> Thank you for the update.
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>>> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> I'm not sure whether you have made any progress on Azure transport yet.
>>> I got a chance to look into that [6]. Let me know if you are working on GCS
>>> or any other so that I can plan ahead. Next I will be focusing on Box
>>> transport.
>>>
>>>
>>>
>>> [6]
>>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hi  Dimuthu,
>>>
>>>
>>>
>>> Thank you for the update. We look into it and get an idea about how the
>>> system works.
>>>
>>> We were hoping to try an implementation for GCS, we will also look into
>>> Azure.
>>>
>>>
>>>
>>> Thank you
>>>
>>> Aravind Ramalingam
>>>
>>>
>>>
>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>> Aravind,
>>>
>>>
>>>
>>> Here [2] is the complete commit for S3 transport implementation but
>>> don't get confused by the amount of changes as this includes both transport
>>> implementation and the service backend implementations. If you need to
>>> implement a new transport, you need to implement a Receiver, Sender and a
>>> MetadataCollector like this [3]. Then you need to add that resource support
>>> to Resource service and Secret service [4] [5]. You can similarly do that
>>> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
>>> helps.
>>>
>>>
>>>
>>> String sourceId = *"remote-ssh-resource"*;
>>> String sourceToken = *"local-ssh-cred"*;
>>> String sourceType = *"SCP"*;
>>> String destId = *"s3-file"*;
>>> String destToken = *"s3-cred"*;
>>> String destType = *"S3"*;
>>>
>>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>>         .setSourceId(sourceId)
>>>         .setSourceToken(sourceToken)
>>>         .setSourceType(sourceType)
>>>         .setDestinationId(destId)
>>>         .setDestinationToken(destToken)
>>>         .setDestinationType(destType)
>>>         .setAffinityTransfer(*false*).build();
>>>
>>>
>>>
>>> [2]
>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>
>>> [3]
>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>
>>> [4]
>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>
>>> [5]
>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>
>>>
>>>
>>> Thanks
>>>
>>> Dimuthu
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>> There is a working on S3 transport in my local copy. Will commit it once
>>> I test it out properly. You can follow the same pattern for any cloud
>>> provider which has clients with streaming IO. Streaming among different
>>> transfer protocols inside an Agent has been discussed in the last part of
>>> this [1] document. Try to get the conceptual idea from that and reverse
>>> engineer SCP transport.
>>>
>>>
>>>
>>> [1]
>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>
>>>
>>>
>>> Dimuthu
>>>
>>>
>>>
>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>>
>>> Hello,
>>>
>>> We were looking at the existing code in the project. We could find
>>> implementations only for local copy and SCP.
>>> We were confused on how to go about with an external provider like S3 or
>>> Azure? Since it would require integrating with their respective clients.
>>>
>>> Thank you
>>> Aravind Ramalingam
>>>
>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>> >
>>> > Hi Aravind,
>>> >
>>> > I have to catch up with the code, but you may want to look at the S3
>>> implementation and extend it to Azure, GCP or other cloud services like
>>> Box, Dropbox and so on.
>>> >
>>> > There could be many use cases, here is an idea:
>>> >
>>> > * Compute a job on a supercomputer with SCP access and push the
>>> outputs to a Cloud storage.
>>> >
>>> > Suresh
>>> >
>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>> wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> We set up the MFT project on local system and tested out SCP transfer
>>> between JetStream VMs, we were wondering how the support can be extended
>>> for AWS/GCS.
>>> >>
>>> >> As per our understanding, the current implementation has support for
>>> two protocols i.e. local-transport and scp-transport. Would we have to
>>> modify/add to the code base to extend support for AWS/GCS clients?
>>> >>
>>> >> Could you please provide suggestions for this use case.
>>> >>
>>> >> Thank you
>>> >> Aravind Ramalingam
>>> >
>>>
>>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hi Dimuthu,

We are working on GCS and have certain parts working, but after a
transfer is complete we are facing errors with the metadata checks.

[image: image.png]

We are currently testing S3 to GCS. We noticed that in the S3 implementation
the ETag was set as the md5sum. In our case we tried using both the ETag and
the Md5Hash, but both threw the above error.

//S3 implementation

metadata.setMd5sum(s3Metadata.getETag());

//GCS implementation

metadata.setMd5sum(gcsMetadata.getEtag());

or

metadata.setMd5sum(gcsMetadata.getMd5Hash());


We are confused at this point, could you please guide us?


Thank you

Aravind Ramalingam


On Sun, Apr 19, 2020 at 11:28 AM DImuthu Upeksha <di...@gmail.com>
wrote:

> Hi Aravind,
>
> You don't need the file to be present in the gcs example I sent. It needs
> an Input Stream to read the content. You can use the same approach I have
> done in S3 [9] transport to do that. It's straightforward. Replace file
> input stream with context.getStreamBuffer().getInputStream().
>
> Akshay,
>
> You can't assume that file is on the machine. It should be provided from
> the secret service. I found this example in [10]
>
> Storage storage = StorageOptions.newBuilder()
>     .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
>     .build()
>     .getService();
>
>
> It accepts a InputStream of json. You can programmatically load the
> content of that json into a java String through secret service and convert
> that string to a Input Stream as shown in [11]
>
> [9]
> https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
> [10] https://github.com/googleapis/google-cloud-java
> [11] https://www.baeldung.com/convert-string-to-input-stream
>
> Thanks
> Dimuthu
>
> On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:
>
>> Hello,
>>
>>
>>
>> We were searching about how to use google API’s to send files, but it’s
>> required the first steps to be authentication. In that, the GCP API
>> requires a credentials.json file to be present in the system.
>>
>>
>>
>> Is it fine if we currently design the GCS transport feature such that the
>> file is already present in the system ?
>>
>>
>>
>> Kind Regards
>>
>> Akshay
>>
>>
>>
>> *From: *Aravind Ramalingam <po...@gmail.com>
>> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Date: *Friday, April 17, 2020 at 00:30
>> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
>> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>>
>>
>>
>> This message was sent from a non-IU address. Please exercise caution when
>> clicking links or opening attachments from external sources.
>>
>>
>> Hello,
>>
>>
>>
>> Wouldn't it be that in this example the whole file has to be present and
>> converted into a single stream and uploaded at once?
>>
>> We had understood that MFT expects it to be chunk by chunk upload without
>> having to have the entire file present.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> Streaming is supported in GCS java client. Have a look at here [8]
>>
>>
>>
>> [8]
>> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> As a followup, we explored GCS in detail. We are faced with a small
>> dilemma. We found that though GCS has a Java support, but the functionality
>> does not seem to extend to a stream based upload and download.
>>
>> The documentation says it is currently done with a gsutil command line
>> library [7], hence we are confused if we would be able to proceed the GCS
>> integration.
>>
>>
>>
>> Could you please give us any suggestions? Also we were wondering if we
>> could maybe take up Box integration or some other provider if GCS proves
>> not possible currently.
>>
>>
>>
>> [7] https://cloud.google.com/storage/docs/streaming
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello Dimuthu,
>>
>>
>>
>> We had just started looking into Azure and GCS. Since Azure is done we
>> will take up and explore GCS.
>>
>>
>>
>> Thank you for the update.
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> Aravind,
>>
>>
>>
>> I'm not sure whether you have made any progress on Azure transport yet. I
>> got a chance to look into that [6]. Let me know if you are working on GCS
>> or any other so that I can plan ahead. Next I will be focusing on Box
>> transport.
>>
>>
>>
>> [6]
>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hi  Dimuthu,
>>
>>
>>
>> Thank you for the update. We look into it and get an idea about how the
>> system works.
>>
>> We were hoping to try an implementation for GCS, we will also look into
>> Azure.
>>
>>
>>
>> Thank you
>>
>> Aravind Ramalingam
>>
>>
>>
>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> Aravind,
>>
>>
>>
>> Here [2] is the complete commit for S3 transport implementation but don't
>> get confused by the amount of changes as this includes both transport
>> implementation and the service backend implementations. If you need to
>> implement a new transport, you need to implement a Receiver, Sender and a
>> MetadataCollector like this [3]. Then you need to add that resource support
>> to Resource service and Secret service [4] [5]. You can similarly do that
>> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
>> helps.
>>
>>
>>
>> String sourceId = *"remote-ssh-resource"*;
>> String sourceToken = *"local-ssh-cred"*;
>> String sourceType = *"SCP"*;
>> String destId = *"s3-file"*;
>> String destToken = *"s3-cred"*;
>> String destType = *"S3"*;
>>
>> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>>         .setSourceId(sourceId)
>>         .setSourceToken(sourceToken)
>>         .setSourceType(sourceType)
>>         .setDestinationId(destId)
>>         .setDestinationToken(destToken)
>>         .setDestinationType(destType)
>>         .setAffinityTransfer(*false*).build();
>>
>>
>>
>> [2]
>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>
>> [3]
>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>
>> [4]
>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>
>> [5]
>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>
>>
>>
>> Thanks
>>
>> Dimuthu
>>
>>
>>
>>
>>
>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>> There is a working on S3 transport in my local copy. Will commit it once
>> I test it out properly. You can follow the same pattern for any cloud
>> provider which has clients with streaming IO. Streaming among different
>> transfer protocols inside an Agent has been discussed in the last part of
>> this [1] document. Try to get the conceptual idea from that and reverse
>> engineer SCP transport.
>>
>>
>>
>> [1]
>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>
>>
>>
>> Dimuthu
>>
>>
>>
>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>> Hello,
>>
>> We were looking at the existing code in the project. We could find
>> implementations only for local copy and SCP.
>> We were confused on how to go about with an external provider like S3 or
>> Azure? Since it would require integrating with their respective clients.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>> >
>> > Hi Aravind,
>> >
>> > I have to catch up with the code, but you may want to look at the S3
>> implementation and extend it to Azure, GCP or other cloud services like
>> Box, Dropbox and so on.
>> >
>> > There could be many use cases, here is an idea:
>> >
>> > * Compute a job on a supercomputer with SCP access and push the outputs
>> to a Cloud storage.
>> >
>> > Suresh
>> >
>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>> wrote:
>> >>
>> >> Hello,
>> >>
>> >> We set up the MFT project on local system and tested out SCP transfer
>> between JetStream VMs, we were wondering how the support can be extended
>> for AWS/GCS.
>> >>
>> >> As per our understanding, the current implementation has support for
>> two protocols i.e. local-transport and scp-transport. Would we have to
>> modify/add to the code base to extend support for AWS/GCS clients?
>> >>
>> >> Could you please provide suggestions for this use case.
>> >>
>> >> Thank you
>> >> Aravind Ramalingam
>> >
>>
>>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Hi Aravind,

You don't need the file to be present in the GCS example I sent. It needs
an InputStream to read the content. You can use the same approach I have
used in the S3 [9] transport to do that. It's straightforward: replace the
file input stream with context.getStreamBuffer().getInputStream().
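
As a rough illustration of that swap (hypothetical snippet; InputStreamContent is the class used in the GCS sample [8], and resourceSize is an assumed variable):

// Instead of reading a local file, stream the agent's buffer into GCS.
InputStream source = context.getStreamBuffer().getInputStream();
InputStreamContent mediaContent = new InputStreamContent("application/octet-stream", source);
mediaContent.setLength(resourceSize); // the sample in [8] sets the length when it is known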

Akshay,

You can't assume that the file is on the machine. It should be provided
through the secret service. I found this example in [10]:

Storage storage = StorageOptions.newBuilder()
    .setCredentials(ServiceAccountCredentials.fromStream(new FileInputStream("/path/to/my/key.json")))
    .build()
    .getService();


It accepts an InputStream of the JSON. You can programmatically load the
content of that JSON into a Java String through the secret service and
convert that string to an InputStream as shown in [11].
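
Putting the two together, a minimal sketch (gcsSecret and its accessor are assumed names for whatever the secret service returns):

// Needs java.io.ByteArrayInputStream and java.nio.charset.StandardCharsets.
String credentialsJson = gcsSecret.getCredentialsJson(); // hypothetical accessor
InputStream credentialStream =
        new ByteArrayInputStream(credentialsJson.getBytes(StandardCharsets.UTF_8));
Storage storage = StorageOptions.newBuilder()
        .setCredentials(ServiceAccountCredentials.fromStream(credentialStream))
        .build()
        .getService();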

[9]
https://github.com/apache/airavata-mft/blob/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3/S3Sender.java#L73
[10] https://github.com/googleapis/google-cloud-java
[11] https://www.baeldung.com/convert-string-to-input-stream

Thanks
Dimuthu

On Sun, Apr 19, 2020 at 2:03 AM Rajvanshi, Akshay <ak...@iu.edu> wrote:

> Hello,
>
>
>
> We were searching about how to use google API’s to send files, but it’s
> required the first steps to be authentication. In that, the GCP API
> requires a credentials.json file to be present in the system.
>
>
>
> Is it fine if we currently design the GCS transport feature such that the
> file is already present in the system ?
>
>
>
> Kind Regards
>
> Akshay
>
>
>
> *From: *Aravind Ramalingam <po...@gmail.com>
> *Reply-To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Date: *Friday, April 17, 2020 at 00:30
> *To: *"dev@airavata.apache.org" <de...@airavata.apache.org>
> *Subject: *[External] Re: Apache Airavata MFT - AWS/GCS support
>
>
>
> This message was sent from a non-IU address. Please exercise caution when
> clicking links or opening attachments from external sources.
>
>
> Hello,
>
>
>
> Wouldn't it be that in this example the whole file has to be present and
> converted into a single stream and uploaded at once?
>
> We had understood that MFT expects it to be chunk by chunk upload without
> having to have the entire file present.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Streaming is supported in GCS java client. Have a look at here [8]
>
>
>
> [8]
> https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> As a followup, we explored GCS in detail. We are faced with a small
> dilemma. We found that though GCS has a Java support, but the functionality
> does not seem to extend to a stream based upload and download.
>
> The documentation says it is currently done with a gsutil command line
> library [7], hence we are confused if we would be able to proceed the GCS
> integration.
>
>
>
> Could you please give us any suggestions? Also we were wondering if we
> could maybe take up Box integration or some other provider if GCS proves
> not possible currently.
>
>
>
> [7] https://cloud.google.com/storage/docs/streaming
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello Dimuthu,
>
>
>
> We had just started looking into Azure and GCS. Since Azure is done we
> will take up and explore GCS.
>
>
>
> Thank you for the update.
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> I'm not sure whether you have made any progress on Azure transport yet. I
> got a chance to look into that [6]. Let me know if you are working on GCS
> or any other so that I can plan ahead. Next I will be focusing on Box
> transport.
>
>
>
> [6]
> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>
>
>
> Thanks
>
> Dimuthu
>
>
>
> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hi  Dimuthu,
>
>
>
> Thank you for the update. We look into it and get an idea about how the
> system works.
>
> We were hoping to try an implementation for GCS, we will also look into
> Azure.
>
>
>
> Thank you
>
> Aravind Ramalingam
>
>
>
> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> Aravind,
>
>
>
> Here [2] is the complete commit for S3 transport implementation but don't
> get confused by the amount of changes as this includes both transport
> implementation and the service backend implementations. If you need to
> implement a new transport, you need to implement a Receiver, Sender and a
> MetadataCollector like this [3]. Then you need to add that resource support
> to Resource service and Secret service [4] [5]. You can similarly do that
> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
> helps.
>
>
>
> String sourceId = *"remote-ssh-resource"*;
> String sourceToken = *"local-ssh-cred"*;
> String sourceType = *"SCP"*;
> String destId = *"s3-file"*;
> String destToken = *"s3-cred"*;
> String destType = *"S3"*;
>
> TransferApiRequest request = TransferApiRequest.*newBuilder*()
>         .setSourceId(sourceId)
>         .setSourceToken(sourceToken)
>         .setSourceType(sourceType)
>         .setDestinationId(destId)
>         .setDestinationToken(destToken)
>         .setDestinationType(destType)
>         .setAffinityTransfer(*false*).build();
>
>
>
> [2]
> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>
> [3]
> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>
> [4]
> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>
> [5]
> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>
>
>
> Thanks
>
> Dimuthu
>
>
>
>
>
> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
> dimuthu.upeksha2@gmail.com> wrote:
>
> There is a working on S3 transport in my local copy. Will commit it once I
> test it out properly. You can follow the same pattern for any cloud
> provider which has clients with streaming IO. Streaming among different
> transfer protocols inside an Agent has been discussed in the last part of
> this [1] document. Try to get the conceptual idea from that and reverse
> engineer SCP transport.
>
>
>
> [1]
> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>
>
>
> Dimuthu
>
>
>
> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
> Hello,
>
> We were looking at the existing code in the project. We could find
> implementations only for local copy and SCP.
> We were confused on how to go about with an external provider like S3 or
> Azure? Since it would require integrating with their respective clients.
>
> Thank you
> Aravind Ramalingam
>
> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
> >
> > Hi Aravind,
> >
> > I have to catch up with the code, but you may want to look at the S3
> implementation and extend it to Azure, GCP or other cloud services like
> Box, Dropbox and so on.
> >
> > There could be many use cases, here is an idea:
> >
> > * Compute a job on a supercomputer with SCP access and push the outputs
> to a Cloud storage.
> >
> > Suresh
> >
> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
> wrote:
> >>
> >> Hello,
> >>
> >> We set up the MFT project on local system and tested out SCP transfer
> between JetStream VMs, we were wondering how the support can be extended
> for AWS/GCS.
> >>
> >> As per our understanding, the current implementation has support for
> two protocols i.e. local-transport and scp-transport. Would we have to
> modify/add to the code base to extend support for AWS/GCS clients?
> >>
> >> Could you please provide suggestions for this use case.
> >>
> >> Thank you
> >> Aravind Ramalingam
> >
>
>

Re: [External] Re: Apache Airavata MFT - AWS/GCS support

Posted by "Rajvanshi, Akshay" <ak...@iu.edu>.
Hello,

We were looking into how to use the Google APIs to send files, and the first required step is authentication. For that, the GCP API requires a credentials.json file to be present on the system.

Is it fine if, for now, we design the GCS transport feature such that the file is already present on the system?

Kind Regards
Akshay

From: Aravind Ramalingam <po...@gmail.com>
Reply-To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Date: Friday, April 17, 2020 at 00:30
To: "dev@airavata.apache.org" <de...@airavata.apache.org>
Subject: [External] Re: Apache Airavata MFT - AWS/GCS support

This message was sent from a non-IU address. Please exercise caution when clicking links or opening attachments from external sources.

Hello,

Wouldn't it be that in this example the whole file has to be present and converted into a single stream and uploaded at once?
We had understood that MFT expects it to be chunk by chunk upload without having to have the entire file present.

Thank you
Aravind Ramalingam


On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com> wrote:
Aravind,

Streaming is supported in GCS java client. Have a look at here [8]

[8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

As a followup, we explored GCS in detail. We are faced with a small dilemma. We found that though GCS has a Java support, but the functionality does not seem to extend to a stream based upload and download.
The documentation says it is currently done with a gsutil command line library [7], hence we are confused if we would be able to proceed the GCS integration.

Could you please give us any suggestions? Also we were wondering if we could maybe take up Box integration or some other provider if GCS proves not possible currently.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done we will take up and explore GCS.

Thank you for the update.
Thank you
Aravind Ramalingam


On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

I'm not sure whether you have made any progress on Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other so that I can plan ahead. Next I will be focusing on Box transport.

[6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hi  Dimuthu,

Thank you for the update. We look into it and get an idea about how the system works.
We were hoping to try an implementation for GCS, we will also look into Azure.

Thank you
Aravind Ramalingam

On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com>> wrote:
Aravind,

Here [2] is the complete commit for S3 transport implementation but don't get confused by the amount of changes as this includes both transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, Sender and a MetadataCollector like this [3]. Then you need to add that resource support to Resource service and Secret service [4] [5]. You can similarly do that for Azure. A sample SCP -> S3 transfer request is like below. Hope that helps.


String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();

[2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45

Thanks
Dimuthu


On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com>> wrote:
There is a working on S3 transport in my local copy. Will commit it once I test it out properly. You can follow the same pattern for any cloud provider which has clients with streaming IO. Streaming among different transfer protocols inside an Agent has been discussed in the last part of this [1] document. Try to get the conceptual idea from that and reverse engineer SCP transport.

[1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo

Dimuthu

On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>> wrote:
Hello,

We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
We were confused on how to go about with an external provider like S3 or Azure? Since it would require integrating with their respective clients.

Thank you
Aravind Ramalingam

> On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org>> wrote:
>
> Hi Aravind,
>
> I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on.
>
> There could be many use cases, here is an idea:
>
> * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage.
>
> Suresh
>
>> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>> wrote:
>>
>> Hello,
>>
>> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>
>> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients?
>>
>> Could you please provide suggestions for this use case.
>>
>> Thank you
>> Aravind Ramalingam
>

Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello,

Wouldn't the whole file have to be present in this example, converted into a single stream, and uploaded at once?
We had understood that MFT expects a chunk-by-chunk upload without the entire file having to be present.

Thank you
Aravind Ramalingam

> On Apr 17, 2020, at 00:07, DImuthu Upeksha <di...@gmail.com> wrote:
> 
> 
> Aravind,
> 
> Streaming is supported in GCS java client. Have a look at here [8]
> 
> [8] https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
> 
> Thanks
> Dimuthu
> 
>> On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com> wrote:
>> Hello Dimuthu,
>> 
>> As a followup, we explored GCS in detail. We are faced with a small dilemma. We found that though GCS has a Java support, but the functionality does not seem to extend to a stream based upload and download. 
>> The documentation says it is currently done with a gsutil command line library [7], hence we are confused if we would be able to proceed the GCS integration.
>> 
>> Could you please give us any suggestions? Also we were wondering if we could maybe take up Box integration or some other provider if GCS proves not possible currently.
>> 
>> [7] https://cloud.google.com/storage/docs/streaming 
>> 
>> Thank you
>> Aravind Ramalingam
>> 
>>> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com> wrote:
>>> Hello Dimuthu,
>>> 
>>> We had just started looking into Azure and GCS. Since Azure is done we will take up and explore GCS.
>>> 
>>> Thank you for the update.
>>> 
>>> Thank you
>>> Aravind Ramalingam
>>> 
>>>>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com> wrote:
>>>>> 
>>>> 
>>>> Aravind,
>>>> 
>>>> I'm not sure whether you have made any progress on Azure transport yet. I got a chance to look into that [6]. Let me know if you are working on GCS or any other so that I can plan ahead. Next I will be focusing on Box transport.
>>>> 
>>>> [6] https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>>> 
>>>> Thanks
>>>> Dimuthu
>>>> 
>>>>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>> Hi  Dimuthu,
>>>>> 
>>>>> Thank you for the update. We look into it and get an idea about how the system works.
>>>>> We were hoping to try an implementation for GCS, we will also look into Azure.
>>>>> 
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>> 
>>>>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <di...@gmail.com> wrote:
>>>>>> Aravind,
>>>>>> 
>>>>>> Here [2] is the complete commit for S3 transport implementation but don't get confused by the amount of changes as this includes both transport implementation and the service backend implementations. If you need to implement a new transport, you need to implement a Receiver, Sender and a MetadataCollector like this [3]. Then you need to add that resource support to Resource service and Secret service [4] [5]. You can similarly do that for Azure. A sample SCP -> S3 transfer request is like below. Hope that helps.
>>>>>> 
>>>>>> String sourceId = "remote-ssh-resource";
>>>>>> String sourceToken = "local-ssh-cred";
>>>>>> String sourceType = "SCP";
>>>>>> String destId = "s3-file";
>>>>>> String destToken = "s3-cred";
>>>>>> String destType = "S3";
>>>>>> 
>>>>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>>>>         .setSourceId(sourceId)
>>>>>>         .setSourceToken(sourceToken)
>>>>>>         .setSourceType(sourceType)
>>>>>>         .setDestinationId(destId)
>>>>>>         .setDestinationToken(destToken)
>>>>>>         .setDestinationType(destType)
>>>>>>         .setAffinityTransfer(false).build();
>>>>>> 
>>>>>> [2] https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>>>> [3] https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>>>> [4] https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>>>> [5] https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>>> 
>>>>>> Thanks
>>>>>> Dimuthu
>>>>>> 
>>>>>> 
>>>>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <di...@gmail.com> wrote:
>>>>>>> There is a working on S3 transport in my local copy. Will commit it once I test it out properly. You can follow the same pattern for any cloud provider which has clients with streaming IO. Streaming among different transfer protocols inside an Agent has been discussed in the last part of this [1] document. Try to get the conceptual idea from that and reverse engineer SCP transport. 
>>>>>>> 
>>>>>>> [1] https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>>>> 
>>>>>>> Dimuthu
>>>>>>> 
>>>>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com> wrote:
>>>>>>>> Hello, 
>>>>>>>> 
>>>>>>>> We were looking at the existing code in the project. We could find implementations only for local copy and SCP.
>>>>>>>> We were confused on how to go about with an external provider like S3 or Azure? Since it would require integrating with their respective clients. 
>>>>>>>> 
>>>>>>>> Thank you
>>>>>>>> Aravind Ramalingam
>>>>>>>> 
>>>>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>>>>>> > 
>>>>>>>> > Hi Aravind,
>>>>>>>> > 
>>>>>>>> > I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP or other cloud services like Box, Dropbox and so on. 
>>>>>>>> > 
>>>>>>>> > There could be many use cases, here is an idea:
>>>>>>>> > 
>>>>>>>> > * Compute a job on a supercomputer with SCP access and push the outputs to a Cloud storage. 
>>>>>>>> > 
>>>>>>>> > Suresh
>>>>>>>> > 
>>>>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com> wrote:
>>>>>>>> >> 
>>>>>>>> >> Hello,
>>>>>>>> >> 
>>>>>>>> >> We set up the MFT project on local system and tested out SCP transfer between JetStream VMs, we were wondering how the support can be extended for AWS/GCS.
>>>>>>>> >> 
>>>>>>>> >> As per our understanding, the current implementation has support for two protocols i.e. local-transport and scp-transport. Would we have to modify/add to the code base to extend support for AWS/GCS clients? 
>>>>>>>> >> 
>>>>>>>> >> Could you please provide suggestions for this use case. 
>>>>>>>> >> 
>>>>>>>> >> Thank you
>>>>>>>> >> Aravind Ramalingam
>>>>>>>> > 

Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Aravind,

Streaming is supported in the GCS Java client. Have a look here [8]

[8]
https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/storage/json-api/src/main/java/StorageSample.java#L104
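
For completeness, the google-cloud-storage client also exposes a WriteChannel for chunk-by-chunk writes; a minimal sketch (storage and inputStream are assumed to be already set up; the API names are from the library documentation, untested against MFT):

// Needs java.nio.ByteBuffer plus com.google.cloud.WriteChannel and
// com.google.cloud.storage.{BlobId, BlobInfo, Storage}.
BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of("some-bucket", "some-object")).build();
try (WriteChannel writer = storage.writer(blobInfo)) {
    byte[] chunk = new byte[8 * 1024];
    int read;
    while ((read = inputStream.read(chunk)) != -1) {
        writer.write(ByteBuffer.wrap(chunk, 0, read));
    }
}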

Thanks
Dimuthu

On Thu, Apr 16, 2020 at 9:56 PM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hello Dimuthu,
>
> As a followup, we explored GCS in detail. We are faced with a small
> dilemma. We found that though GCS has a Java support, but the functionality
> does not seem to extend to a stream based upload and download.
> The documentation says it is currently done with a gsutil command line
> library [7], hence we are confused if we would be able to proceed the GCS
> integration.
>
> Could you please give us any suggestions? Also we were wondering if we
> could maybe take up Box integration or some other provider if GCS proves
> not possible currently.
>
> [7] https://cloud.google.com/storage/docs/streaming
>
> Thank you
> Aravind Ramalingam
>
> On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
>> Hello Dimuthu,
>>
>> We had just started looking into Azure and GCS. Since Azure is done we
>> will take up and explore GCS.
>>
>> Thank you for the update.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
>> wrote:
>>
>> 
>> Aravind,
>>
>> I'm not sure whether you have made any progress on Azure transport yet. I
>> got a chance to look into that [6]. Let me know if you are working on GCS
>> or any other so that I can plan ahead. Next I will be focusing on Box
>> transport.
>>
>> [6]
>> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>>
>> Thanks
>> Dimuthu
>>
>> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
>> wrote:
>>
>>> Hi  Dimuthu,
>>>
>>> Thank you for the update. We look into it and get an idea about how the
>>> system works.
>>> We were hoping to try an implementation for GCS, we will also look into
>>> Azure.
>>>
>>> Thank you
>>> Aravind Ramalingam
>>>
>>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>>> Aravind,
>>>>
>>>> Here [2] is the complete commit for S3 transport implementation but
>>>> don't get confused by the amount of changes as this includes both transport
>>>> implementation and the service backend implementations. If you need to
>>>> implement a new transport, you need to implement a Receiver, Sender and a
>>>> MetadataCollector like this [3]. Then you need to add that resource support
>>>> to Resource service and Secret service [4] [5]. You can similarly do that
>>>> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
>>>> helps.
>>>>
>>>> String sourceId = "remote-ssh-resource";
>>>> String sourceToken = "local-ssh-cred";
>>>> String sourceType = "SCP";
>>>> String destId = "s3-file";
>>>> String destToken = "s3-cred";
>>>> String destType = "S3";
>>>>
>>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>>         .setSourceId(sourceId)
>>>>         .setSourceToken(sourceToken)
>>>>         .setSourceType(sourceType)
>>>>         .setDestinationId(destId)
>>>>         .setDestinationToken(destToken)
>>>>         .setDestinationType(destType)
>>>>         .setAffinityTransfer(false).build();
>>>>
>>>>
>>>> [2]
>>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>>> [3]
>>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>>> [4]
>>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>>> [5]
>>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>>
>>>> Thanks
>>>> Dimuthu
>>>>
>>>>
>>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>>> dimuthu.upeksha2@gmail.com> wrote:
>>>>
>>>>> There is a working on S3 transport in my local copy. Will commit it
>>>>> once I test it out properly. You can follow the same pattern for any cloud
>>>>> provider which has clients with streaming IO. Streaming among different
>>>>> transfer protocols inside an Agent has been discussed in the last part of
>>>>> this [1] document. Try to get the conceptual idea from that and reverse
>>>>> engineer SCP transport.
>>>>>
>>>>> [1]
>>>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>>
>>>>> Dimuthu
>>>>>
>>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We were looking at the existing code in the project. We could find
>>>>>> implementations only for local copy and SCP.
>>>>>> We were confused on how to go about with an external provider like S3
>>>>>> or Azure? Since it would require integrating with their respective clients.
>>>>>>
>>>>>> Thank you
>>>>>> Aravind Ramalingam
>>>>>>
>>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>>>> >
>>>>>> > Hi Aravind,
>>>>>> >
>>>>>> > I have to catch up with the code, but you may want to look at the
>>>>>> S3 implementation and extend it to Azure, GCP or other cloud services like
>>>>>> Box, Dropbox and so on.
>>>>>> >
>>>>>> > There could be many use cases, here is an idea:
>>>>>> >
>>>>>> > * Compute a job on a supercomputer with SCP access and push the
>>>>>> outputs to a Cloud storage.
>>>>>> >
>>>>>> > Suresh
>>>>>> >
>>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Hello,
>>>>>> >>
>>>>>> >> We set up the MFT project on local system and tested out SCP
>>>>>> transfer between JetStream VMs, we were wondering how the support can be
>>>>>> extended for AWS/GCS.
>>>>>> >>
>>>>>> >> As per our understanding, the current implementation has support
>>>>>> for two protocols i.e. local-transport and scp-transport. Would we have to
>>>>>> modify/add to the code base to extend support for AWS/GCS clients?
>>>>>> >>
>>>>>> >> Could you please provide suggestions for this use case.
>>>>>> >>
>>>>>> >> Thank you
>>>>>> >> Aravind Ramalingam
>>>>>> >
>>>>>>
>>>>>

Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello Dimuthu,

As a follow-up, we explored GCS in detail and are faced with a small
dilemma. We found that though GCS has Java support, the functionality does
not seem to extend to stream-based upload and download.
The documentation says streaming is currently done with the gsutil command
line tool [7], hence we are unsure whether we can proceed with the GCS
integration.

Could you please give us some suggestions? We were also wondering if we
could take up Box integration or some other provider if GCS proves
infeasible for now.

[7] https://cloud.google.com/storage/docs/streaming

Thank you
Aravind Ramalingam

On Thu, Apr 16, 2020 at 12:45 AM Aravind Ramalingam <po...@gmail.com>
wrote:

> Hello Dimuthu,
>
> We had just started looking into Azure and GCS. Since Azure is done we
> will take up and explore GCS.
>
> Thank you for the update.
>
> Thank you
> Aravind Ramalingam
>
> On Apr 16, 2020, at 00:30, DImuthu Upeksha <di...@gmail.com>
> wrote:
>
> 
> Aravind,
>
> I'm not sure whether you have made any progress on Azure transport yet. I
> got a chance to look into that [6]. Let me know if you are working on GCS
> or any other so that I can plan ahead. Next I will be focusing on Box
> transport.
>
> [6]
> https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd
>
> Thanks
> Dimuthu
>
> On Mon, Apr 6, 2020 at 5:19 PM Aravind Ramalingam <po...@gmail.com>
> wrote:
>
>> Hi  Dimuthu,
>>
>> Thank you for the update. We look into it and get an idea about how the
>> system works.
>> We were hoping to try an implementation for GCS, we will also look into
>> Azure.
>>
>> Thank you
>> Aravind Ramalingam
>>
>> On Mon, Apr 6, 2020 at 4:44 PM DImuthu Upeksha <
>> dimuthu.upeksha2@gmail.com> wrote:
>>
>>> Aravind,
>>>
>>> Here [2] is the complete commit for S3 transport implementation but
>>> don't get confused by the amount of changes as this includes both transport
>>> implementation and the service backend implementations. If you need to
>>> implement a new transport, you need to implement a Receiver, Sender and a
>>> MetadataCollector like this [3]. Then you need to add that resource support
>>> to Resource service and Secret service [4] [5]. You can similarly do that
>>> for Azure. A sample SCP -> S3 transfer request is like below. Hope that
>>> helps.
>>>
>>> String sourceId = "remote-ssh-resource";
>>> String sourceToken = "local-ssh-cred";
>>> String sourceType = "SCP";
>>> String destId = "s3-file";
>>> String destToken = "s3-cred";
>>> String destType = "S3";
>>>
>>> TransferApiRequest request = TransferApiRequest.newBuilder()
>>>         .setSourceId(sourceId)
>>>         .setSourceToken(sourceToken)
>>>         .setSourceType(sourceType)
>>>         .setDestinationId(destId)
>>>         .setDestinationToken(destToken)
>>>         .setDestinationType(destType)
>>>         .setAffinityTransfer(false).build();
>>>
>>>
>>> [2]
>>> https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
>>> [3]
>>> https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
>>> [4]
>>> https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
>>> [5]
>>> https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
>>>
>>> Thanks
>>> Dimuthu
>>>
>>>
>>> On Sun, Apr 5, 2020 at 12:10 AM DImuthu Upeksha <
>>> dimuthu.upeksha2@gmail.com> wrote:
>>>
>>>> There is a working on S3 transport in my local copy. Will commit it
>>>> once I test it out properly. You can follow the same pattern for any cloud
>>>> provider which has clients with streaming IO. Streaming among different
>>>> transfer protocols inside an Agent has been discussed in the last part of
>>>> this [1] document. Try to get the conceptual idea from that and reverse
>>>> engineer SCP transport.
>>>>
>>>> [1]
>>>> https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
>>>>
>>>> Dimuthu
>>>>
>>>> On Sat, Apr 4, 2020 at 9:22 PM Aravind Ramalingam <po...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We were looking at the existing code in the project. We could find
>>>>> implementations only for local copy and SCP.
>>>>> We were confused on how to go about with an external provider like S3
>>>>> or Azure? Since it would require integrating with their respective clients.
>>>>>
>>>>> Thank you
>>>>> Aravind Ramalingam
>>>>>
>>>>> > On Apr 4, 2020, at 21:15, Suresh Marru <sm...@apache.org> wrote:
>>>>> >
>>>>> > Hi Aravind,
>>>>> >
>>>>> > I have to catch up with the code, but you may want to look at the S3
>>>>> implementation and extend it to Azure, GCP or other cloud services like
>>>>> Box, Dropbox and so on.
>>>>> >
>>>>> > There could be many use cases, here is an idea:
>>>>> >
>>>>> > * Compute a job on a supercomputer with SCP access and push the
>>>>> outputs to a Cloud storage.
>>>>> >
>>>>> > Suresh
>>>>> >
>>>>> >> On Apr 4, 2020, at 8:09 PM, Aravind Ramalingam <po...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Hello,
>>>>> >>
>>>>> >> We set up the MFT project on local system and tested out SCP
>>>>> transfer between JetStream VMs, we were wondering how the support can be
>>>>> extended for AWS/GCS.
>>>>> >>
>>>>> >> As per our understanding, the current implementation has support
>>>>> for two protocols i.e. local-transport and scp-transport. Would we have to
>>>>> modify/add to the code base to extend support for AWS/GCS clients?
>>>>> >>
>>>>> >> Could you please provide suggestions for this use case.
>>>>> >>
>>>>> >> Thank you
>>>>> >> Aravind Ramalingam
>>>>> >
>>>>>
>>>>

Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello Dimuthu,

We had just started looking into Azure and GCS. Since Azure is done, we will take up and explore GCS.

Thank you for the update.

Thank you
Aravind Ramalingam

Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Aravind,

I'm not sure whether you have made any progress on the Azure transport yet.
I got a chance to look into it [6]. Let me know if you are working on GCS
or any other transport so that I can plan ahead. Next I will be focusing on
the Box transport.

[6]
https://github.com/apache/airavata-mft/commit/013ed494eb958990d0a6f90186a53103e1237bcd

Thanks
Dimuthu

Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hi Dimuthu,

Thank you for the update. We will look into it and get an idea of how the
system works.
We were hoping to try an implementation for GCS; we will also look into
Azure.

Thank you
Aravind Ramalingam

Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
Aravind,

Here [2] is the complete commit for the S3 transport implementation, but
don't be put off by the amount of changes: it includes both the transport
implementation and the service backend implementations. To implement a new
transport, you need to implement a Receiver, a Sender, and a
MetadataCollector like this [3]. Then you need to add support for that
resource type to the Resource service and the Secret service [4] [5]. You
can do the same for Azure. A sample SCP -> S3 transfer request is shown
below. Hope that helps.

String sourceId = "remote-ssh-resource";
String sourceToken = "local-ssh-cred";
String sourceType = "SCP";
String destId = "s3-file";
String destToken = "s3-cred";
String destType = "S3";

TransferApiRequest request = TransferApiRequest.newBuilder()
        .setSourceId(sourceId)
        .setSourceToken(sourceToken)
        .setSourceType(sourceType)
        .setDestinationId(destId)
        .setDestinationToken(destToken)
        .setDestinationType(destType)
        .setAffinityTransfer(false).build();
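
For completeness, here is a rough sketch of how such a request could be
submitted. The channel address, stub class, and response accessors below
are assumed placeholders modeled on typical gRPC-generated Java code, not
names verified against the MFT codebase:

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Hypothetical submission sketch; the stub class, method, and port are
// assumptions, not taken from the MFT API service.
ManagedChannel channel = ManagedChannelBuilder
        .forAddress("localhost", 7004) // assumed MFT API host and port
        .usePlaintext()
        .build();

TransferServiceGrpc.TransferServiceBlockingStub client =
        TransferServiceGrpc.newBlockingStub(channel);

TransferApiResponse response = client.submitTransfer(request);
System.out.println("Submitted transfer: " + response.getTransferId());

channel.shutdown();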


[2]
https://github.com/apache/airavata-mft/commit/62fae3d0ab2921fa8bf0bea7970e233f842e6948
[3]
https://github.com/apache/airavata-mft/tree/master/transport/s3-transport/src/main/java/org/apache/airavata/mft/transport/s3
[4]
https://github.com/apache/airavata-mft/blob/master/services/resource-service/stub/src/main/proto/ResourceService.proto#L90
[5]
https://github.com/apache/airavata-mft/blob/master/services/secret-service/stub/src/main/proto/SecretService.proto#L45
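
To make the Receiver / Sender / MetadataCollector split above concrete,
here is an illustrative sketch of the responsibility of each piece. These
interfaces are stand-ins written for this explanation only; the actual
contracts are defined in the MFT core and the S3 transport [3], so treat
those as the ground truth:

import java.io.InputStream;
import java.io.OutputStream;

// Illustrative stand-ins for the three pieces a new transport needs.
// These are NOT MFT's real interfaces; see [3] for the actual contracts.

// Resolves metadata (e.g. file size) so the Agent can plan the transfer.
interface MetadataCollector {
    long getFileSize(String resourceId, String credentialToken) throws Exception;
}

// Pulls bytes from the source resource and writes them to a stream.
interface Receiver {
    void receive(String resourceId, String credentialToken, OutputStream out) throws Exception;
}

// Reads bytes from a stream and pushes them to the destination resource.
interface Sender {
    void send(String resourceId, String credentialToken, InputStream in) throws Exception;
}

A GCS implementation of Receiver and Sender would then likely wrap the
google-cloud-storage client's reader and writer channels, in the same way
the S3 transport wraps the AWS SDK.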

Thanks
Dimuthu


Re: Apache Airavata MFT - AWS/GCS support

Posted by DImuthu Upeksha <di...@gmail.com>.
There is a working S3 transport in my local copy. I will commit it once I
have tested it properly. You can follow the same pattern for any cloud
provider whose clients support streaming IO. Streaming among different
transfer protocols inside an Agent is discussed in the last part of this
[1] document. Try to get the conceptual idea from that, then reverse
engineer the SCP transport.

[1]
https://docs.google.com/document/d/1zrO4Z1dn7ENhm1RBdVCw-dDpWiebaZEWy66ceTWoOlo
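
To illustrate the conceptual idea in code, here is a minimal,
self-contained sketch of the piped-stream pattern that the document
describes. The two threads stand in for a source-protocol Receiver and a
destination-protocol Sender; the byte array and System.out are
placeholders, not MFT code:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// The receiver thread writes bytes pulled from the source protocol into
// one end of a pipe while the sender thread pushes the same bytes to the
// destination protocol, so nothing is staged on disk in between.
public class StreamingSketch {
    public static void main(String[] args) throws Exception {
        PipedOutputStream fromSource = new PipedOutputStream();
        PipedInputStream toDestination = new PipedInputStream(fromSource);

        Thread receiver = new Thread(() -> {
            try (OutputStream out = fromSource) {
                out.write("file content from SCP".getBytes()); // stand-in for an SCP read
            } catch (IOException e) {
                e.printStackTrace();
            }
        });

        Thread sender = new Thread(() -> {
            try (InputStream in = toDestination) {
                in.transferTo(System.out); // stand-in for an S3/GCS upload
            } catch (IOException e) {
                e.printStackTrace();
            }
        });

        receiver.start();
        sender.start();
        receiver.join();
        sender.join();
    }
}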

Dimuthu

Re: Apache Airavata MFT - AWS/GCS support

Posted by Aravind Ramalingam <po...@gmail.com>.
Hello, 

We were looking at the existing code in the project and could find implementations only for local copy and SCP.
We were unsure how to go about supporting an external provider like S3 or Azure, since it would require integrating with their respective client libraries.

Thank you
Aravind Ramalingam

Re: Apache Airavata MFT - AWS/GCS support

Posted by Suresh Marru <sm...@apache.org>.
Hi Aravind,

I have to catch up with the code, but you may want to look at the S3 implementation and extend it to Azure, GCP, or other cloud services like Box, Dropbox, and so on.

There could be many use cases; here is one idea:

* Run a job on a supercomputer with SCP access and push the outputs to cloud storage.

Suresh
