Posted to dev@airavata.apache.org by "Christie, Marcus Aaron" <ma...@iu.edu> on 2023/04/06 14:14:02 UTC

Re: Cybershuttle Replica Catalog API

Hi Jayan,

Thanks for sharing. One question: the airavata-data-catalog already has a DATA_PRODUCT table and a way to store a data product's metadata. Could that be used instead of adding a new table?

Or, more generally, how does this replica catalog API relate to the data catalog API and data model?

Thanks,

Marcus

> On Mar 31, 2023, at 4:11 PM, Jayan Vidanapathirana <jc...@gmail.com> wrote:
> 
> Hi All,
> 
> I have implemented the basic flow (simple create and retrieve) of the replica catalog and drafted a pull request [1] that adds it to the Airavata data catalog as a new module. Based on that implementation, I arrived at the following database structure for the replica catalog, and I would greatly appreciate your thoughts and feedback on the design [2]. At this stage only the S3 storage type was considered, as a sample.
> 
> <Replica Catalog V2.drawio (1).png>
> 
> Please also refer to the following Google Doc [3] to review the implemented APIs.
> 
> [1] https://github.com/apache/airavata-data-catalog/pull/28
> [2] https://drive.google.com/file/d/1KP-8IWdvpPvjSWUG2t41K7WQXW2f9_qN/view?usp=sharing
> [3] https://docs.google.com/document/d/1U-ok1ICt_EmjjxR9UuACV6g6YYkgoADfm0ECJ9ZDI3k/edit?usp=sharing
> 
> Thank you.
> 
> On Mon, Mar 20, 2023 at 2:20 AM Suresh Marru <sm...@apache.org> wrote:
> Hi Jayan,
> 
> Can you contribute a PR to the data catalog repo so we can keep the feedback on that issue?
> 
> Thanks for your contribution,
> Suresh
> 
>> On Mar 19, 2023, at 12:55 PM, Jayan Vidanapathirana <jc...@gmail.com> wrote:
>> 
>> Hi All,
>> 
>> I have updated the draft code base [1] with a simple workflow for adding data to the replica catalog. The services are not yet finalized and will be enhanced along with the workflow.
>> 
>> [1] https://github.com/Jayancv/airavata-replica-catalog
>> 
>> Thanks.
>> 
>> On Sat, Feb 25, 2023 at 4:21 PM Jayan Vidanapathirana <jc...@gmail.com> wrote:
>> Hi Dimuthu and Marcus,
>> 
>> Thank you both for checking my PoC and providing valuable feedback.
>> 
>> Dimuthu,
>> 	• I agree with you regarding replica location categories. That should be a data catalog level attribute.
>> 	• To manage replica data access permissions, don't we need user information at the replica catalog level? I'm a bit confused about the permission management side of this catalog.
>> 	• ReplicaListEntry was added to expose the list of DataReplicaLocation entries, with basic details, in AllDataReplicaGetResponse, which provides all the replica items for the given product_id. However, I was not considering the hierarchical structure there. ReplicaGroupEntry is actually a single product replica that holds the file structure of the replica data. Following your suggestion, we can model AllDataReplicaGetResponse as follows:
>> message AllDataReplicaGetResponse {
>>   string data_product_id = 1;  // field type was omitted in the draft; string is assumed
>>   repeated ReplicaGroupEntry replica_list = 2;
>> }
>> 
>> message ReplicaGroupEntry {
>>   string replica_group_id = 1;
>>   repeated ReplicaGroupEntry directories = 2;
>>   repeated DataReplicaLocation files = 3;
>> }
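
The nested ReplicaGroupEntry structure above can be read as a directory tree. A minimal sketch of walking it, written in plain Java with hypothetical stand-in classes (the real types would be generated from ReplicaCatalogAPI.proto, and the "name" field, the sample() data, and the flatten() helper are assumptions made only for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical plain-Java mirror of the proposed messages, for illustration only.
final class ReplicaGroupSketch {

    // Stand-in for DataReplicaLocation; only a file name is modelled here.
    static final class DataReplicaLocation {
        final String name;

        DataReplicaLocation(String name) {
            this.name = name;
        }
    }

    // A group behaves like a directory: it holds sub-groups and files.
    static final class ReplicaGroupEntry {
        final String replicaGroupId;
        final List<ReplicaGroupEntry> directories;
        final List<DataReplicaLocation> files;

        ReplicaGroupEntry(String replicaGroupId,
                          List<ReplicaGroupEntry> directories,
                          List<DataReplicaLocation> files) {
            this.replicaGroupId = replicaGroupId;
            this.directories = directories;
            this.files = files;
        }
    }

    // Depth-first walk that returns every file as "group/subgroup/file".
    static List<String> flatten(ReplicaGroupEntry group) {
        List<String> out = new ArrayList<>();
        collect(group, "", out);
        return out;
    }

    private static void collect(ReplicaGroupEntry group, String prefix, List<String> out) {
        String path = prefix + group.replicaGroupId + "/";
        for (DataReplicaLocation file : group.files) {
            out.add(path + file.name);
        }
        for (ReplicaGroupEntry dir : group.directories) {
            collect(dir, path, out);
        }
    }

    // Made-up sample: one top-level group with one file and one sub-group.
    static ReplicaGroupEntry sample() {
        ReplicaGroupEntry raw = new ReplicaGroupEntry(
                "raw",
                Collections.<ReplicaGroupEntry>emptyList(),
                Arrays.asList(new DataReplicaLocation("a.dat"), new DataReplicaLocation("b.dat")));
        return new ReplicaGroupEntry(
                "run-1",
                Collections.singletonList(raw),
                Collections.singletonList(new DataReplicaLocation("summary.txt")));
    }

    public static void main(String[] args) {
        for (String path : flatten(sample())) {
            System.out.println(path);
        }
    }
}
```

Because a group's files are listed before its sub-groups, the walk visits a directory's own files first and then descends; a real client might prefer a different order.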
>> 
>> Marcus, 
>> 	• Yes, I will remove replica_id from the data catalog diagram.
>> 	• I added parent_data_product_id to the replica data with the full relationship between the replica catalog and the data catalog in mind. Within the replica catalog context, however, there is no such parent product relationship, so we can rename it to data_product_id. Thanks for pointing this out.
>> 
>> Thanks.
>> 
>> On Thu, Feb 23, 2023 at 2:48 AM Christie, Marcus Aaron <ma...@iu.edu> wrote:
>> Hi Jayan,
>> 
>> I would like to echo Dimuthu and say that this looks great, and I appreciate the effort you put into pulling this all together. I have some feedback to share.
>> 
>> The high-level architecture diagram shows the replica id being stored in the data catalog. That was an initial idea that we had, but we decided that the replica catalog would store the data product id. That seems reflected in your API design so I think you already know this, but I wanted to point it out since the diagram might be a little confusing for others.
>> 
>> In the ReplicaCatalogAPI.proto the name of the data product id field is "parent_data_product_id". I would suggest calling it "data_product_id" instead. "parent_data_product_id" means "the id of the parent data product of this data product" in the data catalog. It might be confusing to use the same name in ReplicaCatalogAPI.proto.
>> 
>> 
>> Thanks,
>> 
>> Marcus
>> 
>> > On Feb 18, 2023, at 3:09 PM, Jayan Vidanapathirana <jc...@gmail.com> wrote:
>> > 
>> > Hi All, 
>> > 
>> > As a new contributor to the Cybershuttle project, I have been actively involved in implementing the Data Replica Catalog. This new catalog is designed to interface with both the Apache Airavata Data Catalog [1] and Airavata MFT [2]. The replica catalog should be able to store the storage details of each replica resource, along with the secret/credential details specific to the storage type. The proposed high-level architecture is as follows:
>> > 
>> > 
>> > 
>> > I will mainly work on the highlighted area (red box) and, as an initial step, have started defining the APIs that communicate with the Replica Catalog. These will be gRPC APIs, and the following methods will be implemented:
>> > 
>> > Replica Registration
>> > 
>> >       • registerReplicaLocation(DataReplicaCreateRequest createRequest)
>> >       • updateReplicaLocation(DataReplicaCreateRequest updateRequest)
>> >       • DataReplicaLocationModel getReplicaLocation(DataReplicaGetRequest getReplicaRequest)
>> >       • removeReplicaLocation(DataReplicaDeleteRequest deleteReplicaRequest)
>> >       • getAllReplicaLocations(AllDataReplicaGetRequest allDataGetRequest)
>> >       • removeAllReplicaLocations(AllDataReplicaDeleteRequest allDataDeleteRequest)
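
To make the intent of the replica registration calls listed above concrete, here is a minimal in-memory sketch in Java. It is not the proposed gRPC service: the request and response messages are replaced by plain strings, and the class name, the replica id, and the URI-style locations are all assumptions made for illustration. The one point it does reflect from the thread is that replicas are keyed by data product id.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the replica registration API.
final class InMemoryReplicaCatalog {

    // data product id -> (replica id -> storage location)
    private final Map<String, Map<String, String>> replicas = new LinkedHashMap<>();

    // Registers a replica; calling it again with the same ids acts as an update.
    void registerReplicaLocation(String dataProductId, String replicaId, String location) {
        replicas.computeIfAbsent(dataProductId, k -> new LinkedHashMap<>())
                .put(replicaId, location);
    }

    // Returns the location of one replica, or null if it is not registered.
    String getReplicaLocation(String dataProductId, String replicaId) {
        Map<String, String> byReplica = replicas.get(dataProductId);
        return byReplica == null ? null : byReplica.get(replicaId);
    }

    // Returns every registered location for a data product, in registration order.
    List<String> getAllReplicaLocations(String dataProductId) {
        Map<String, String> byReplica = replicas.get(dataProductId);
        return byReplica == null
                ? Collections.<String>emptyList()
                : new ArrayList<>(byReplica.values());
    }

    // Removes one replica; returns true if something was removed.
    boolean removeReplicaLocation(String dataProductId, String replicaId) {
        Map<String, String> byReplica = replicas.get(dataProductId);
        return byReplica != null && byReplica.remove(replicaId) != null;
    }

    // Removes all replicas of a data product; returns how many were removed.
    int removeAllReplicaLocations(String dataProductId) {
        Map<String, String> removed = replicas.remove(dataProductId);
        return removed == null ? 0 : removed.size();
    }

    public static void main(String[] args) {
        InMemoryReplicaCatalog catalog = new InMemoryReplicaCatalog();
        catalog.registerReplicaLocation("dp-1", "r-1", "s3://bucket-a/run1.dat");
        catalog.registerReplicaLocation("dp-1", "r-2", "gs://bucket-b/run1.dat");
        System.out.println(catalog.getAllReplicaLocations("dp-1"));
    }
}
```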
>> > 
>> > Storage Registration
>> > 
>> > registerSecretForStorage(SecretForStorage request)
>> > deleteSecretsForStorage(SecretForStorageDeleteRequest request)
>> > getSecretForStorage(SecretForStorageGetRequest request)
>> > searchStorages(StorageSearchRequest request)
>> > listStorages(StorageListRequest request)
>> > resolveStorageType(StorageTypeResolveRequest request)
>> > 
>> > Storage - Internal APIs
>> > 
>> > S3StorageListResponse listS3Storage(S3StorageListRequest request) 
>> > Optional<S3Storage> getS3Storage(S3StorageGetRequest request) 
>> > S3Storage createS3Storage(S3StorageCreateRequest request) 
>> > boolean updateS3Storage(S3StorageUpdateRequest request) 
>> > boolean deleteS3Storage(S3StorageDeleteRequest request) 
>> > 
>> > AzureStorageListResponse listAzureStorage(AzureStorageListRequest request) 
>> > Optional<AzureStorage> getAzureStorage(AzureStorageGetRequest request) 
>> > AzureStorage createAzureStorage(AzureStorageCreateRequest request) 
>> > boolean updateAzureStorage(AzureStorageUpdateRequest request) 
>> > boolean deleteAzureStorage(AzureStorageDeleteRequest request) 
>> > 
>> > GCSStorageListResponse listGCSStorage(GCSStorageListRequest request) 
>> > Optional<GCSStorage> getGCSStorage(GCSStorageGetRequest request) 
>> > GCSStorage createGCSStorage(GCSStorageCreateRequest request) 
>> > boolean updateGCSStorage(GCSStorageUpdateRequest request) 
>> > boolean deleteGCSStorage(GCSStorageDeleteRequest request) 
>> > 
>> > Secret Registration
>> > 
>> > registerSecret(SecretRegistrationRequest request)
>> > deleteSecret(SecretDeleteRequest request)
>> > resolveStorageType(StorageTypeResolveRequest request)
>> > 
>> > Secret  - Internal APIs
>> > 
>> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request) 
>> > S3Secret createS3Secret(S3SecretCreateRequest request) 
>> > boolean updateS3Secret(S3SecretUpdateRequest request) 
>> > boolean deleteS3Secret(S3SecretDeleteRequest request) 
>> > 
>> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request) 
>> > AzureSecret createAzureSecret(AzureSecretCreateRequest request) 
>> > boolean updateAzureSecret(AzureSecretUpdateRequest request) 
>> > boolean deleteAzureSecret(AzureSecretDeleteRequest request) 
>> > 
>> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request) 
>> > GCSSecret createGCSSecret(GCSSecretCreateRequest request) 
>> > boolean updateGCSSecret(GCSSecretUpdateRequest request) 
>> > boolean deleteGCSSecret(GCSSecretDeleteRequest request) 
>> > 
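
The thread does not spell out how resolveStorageType relates to the type-specific internal APIs listed above. One plausible reading, sketched here in Java purely as an assumption, is that a generic call first resolves the storage type for a storage id and is then routed to the matching internal method. The class name, the recordStorage and internalGetterFor helpers, and the use of plain string ids are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of resolving a storage type and routing on it.
final class StorageTypeResolverSketch {

    // The three storage types named in the API outline.
    enum StorageType { S3, AZURE, GCS }

    private final Map<String, StorageType> typeByStorageId = new HashMap<>();

    // Remembers which type a storage id belongs to (assumed bookkeeping).
    void recordStorage(String storageId, StorageType type) {
        typeByStorageId.put(storageId, type);
    }

    // Returns the type for a storage id, or empty if the id is unknown.
    Optional<StorageType> resolveStorageType(String storageId) {
        return Optional.ofNullable(typeByStorageId.get(storageId));
    }

    // Names the internal getter a generic lookup would be routed to.
    String internalGetterFor(String storageId) {
        Optional<StorageType> type = resolveStorageType(storageId);
        if (!type.isPresent()) {
            return "unknown storage";
        }
        switch (type.get()) {
            case S3:
                return "getS3Storage";
            case AZURE:
                return "getAzureStorage";
            case GCS:
                return "getGCSStorage";
            default:
                throw new IllegalStateException("unhandled storage type");
        }
    }

    public static void main(String[] args) {
        StorageTypeResolverSketch resolver = new StorageTypeResolverSketch();
        resolver.recordStorage("storage-1", StorageType.S3);
        System.out.println(resolver.internalGetterFor("storage-1"));
    }
}
```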
>> > 
>> > PoC [3]: https://github.com/Jayancv/airavata-replica-catalog (defining API calls)
>> > Draft APIs: refer to the attachment replicaCatalogAPIsDocumentation.html [4], which was generated from the PoC [3]
>> > 
>> > I greatly appreciate your thoughts and feedback on the designs[5], as they can help us improve and adopt a more generalized approach. Additionally, I would like to identify any other factors that we should take into account to minimize potential issues in the future. Are there any other considerations that we should keep in mind? 
>> > 
>> > 
>> > [1] - https://github.com/apache/airavata-data-catalog
>> > [2] - https://github.com/apache/airavata-mft
>> > [3] - https://github.com/Jayancv/airavata-replica-catalog 
>> > [4] - https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
>> > [5] - https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
>> > 
>> > Thanks.
>> > -- 
>> > Best Regards
>> > 
>> > Jayan Vidanapathirana
>> > 
>> > <replicaCatalogAPIsDocumentation.html>


Re: Cybershuttle Replica Catalog API

Posted by "Christie, Marcus Aaron" <ma...@iu.edu>.
Hi Jayan,

This looks really good. Thanks for putting together this test case. It makes it very clear how the API should be used.

Thanks,

Marcus

> On Apr 20, 2023, at 1:44 PM, Jayan Vidanapathirana <jc...@gmail.com> wrote:
> 
> Hi All, 
> 
> A new test case [1] has been incorporated into the replica catalog, enabling the addition and retrieval of replica data through the data catalog pull request [2]. The current implementation allows users to operate with S3 and Google Cloud Storage credentials and storage.
> 
> Ongoing implementation: how to model these replica locations as a hierarchical system that supports replica groups.
> 
> [1] org.apache.airavata.ReplicaCatalogAPIClientTest#testCase1
> [2] https://github.com/apache/airavata-data-catalog/pull/28
> 
> Thanks. 
> 
> On Sat, Apr 8, 2023 at 3:07 AM Christie, Marcus Aaron <ma...@iu.edu> wrote:
> Hi Jayan,
>
> I think we can keep them loosely coupled. When a client is searching for files, it will search the Data Catalog. That will yield data product ids. The client then needs to resolve those data product ids to replica locations, which means that the Replica Catalog needs to record the data product id associated with each replica location.
>
> For deployment plans, I haven't thought much about it, but I think we'll deploy them as separate services, that is, separate JVM processes. They will likely connect to the same backend database out of convenience, but I don't think they have to; they could have separate databases. Does that answer your question?
>
> Thanks,
>
> Marcus
> 
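
The loosely coupled lookup described above (search the Data Catalog, get data product ids, resolve them in the Replica Catalog) can be sketched in Java as two independent interfaces chained by the client. The interface names, method names, and sample values below are hypothetical and only illustrate the flow; they are not the actual Data Catalog or Replica Catalog APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-step lookup across the two catalogs.
final class LooselyCoupledLookupSketch {

    // Step 1: the Data Catalog answers searches with data product ids.
    interface DataCatalog {
        List<String> searchDataProductIds(String query);
    }

    // Step 2: the Replica Catalog maps a data product id to replica locations.
    interface ReplicaCatalog {
        List<String> replicaLocationsFor(String dataProductId);
    }

    // The client chains the two services; neither service calls the other.
    static List<String> findReplicaLocations(DataCatalog dataCatalog,
                                             ReplicaCatalog replicaCatalog,
                                             String query) {
        List<String> locations = new ArrayList<>();
        for (String dataProductId : dataCatalog.searchDataProductIds(query)) {
            locations.addAll(replicaCatalog.replicaLocationsFor(dataProductId));
        }
        return locations;
    }

    // Wires up in-memory fakes with made-up ids and locations.
    static List<String> demo() {
        Map<String, List<String>> replicasByProduct = new HashMap<>();
        replicasByProduct.put("dp-1",
                Arrays.asList("s3://bucket-a/run1.dat", "gs://bucket-b/run1.dat"));

        DataCatalog dataCatalog = query -> Collections.singletonList("dp-1");
        ReplicaCatalog replicaCatalog = id ->
                replicasByProduct.getOrDefault(id, Collections.<String>emptyList());

        return findReplicaLocations(dataCatalog, replicaCatalog, "run1");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The only thing the two services share in this sketch is the data product id, which is why they could run as separate processes with separate databases.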
> From: Jayan Vidanapathirana <jc...@gmail.com>
> Date: Friday, April 7, 2023 at 2:00 PM
> To: dev@airavata.apache.org <de...@airavata.apache.org>
> Cc: Dimuthu Upeksha Wannipurage <di...@gmail.com>, Christie, Marcus Aaron <ma...@iu.edu>
> Subject: Re: Cybershuttle Replica Catalog API
> 
> Hi Marcus,
>
> I sincerely appreciate you taking the time to review my changes. When I started the API definition, I assumed these two services needed to operate independently.
> If we couple the two services, users can simply use the Data Catalog's data product APIs and data models, and I can remove the data product APIs and models from the replica catalog.
>
> Is there an overall deployment plan for these two catalogs?
>
> Thanks.
>
> 


Re: Cybershuttle Replica Catalog API

Posted by Jayan Vidanapathirana <jc...@gmail.com>.
Hi All,

A new test case [1] has been incorporated into the replica catalog,
enabling the addition and retrieval of replica data through the data
catalog pull request [2]. The current implementation allows users to
operate with S3 and Google Cloud Storage credentials and storage.

Ongoing implementation: how to model these replica locations as a
hierarchical system that supports replica groups.

[1] org.apache.airavata.ReplicaCatalogAPIClientTest#testCase1
[2] https://github.com/apache/airavata-data-catalog/pull/28

Thanks.

> >> >
> >> > registerSecret(SecretRegistrationRequest request)
> >> > deleteSecret(SecretDeleteRequest request)
> >> > resolveStorageType (StorageTypeResolveRequest request)
> >> >
> >> > Secret  - Internal APIs
> >> >
> >> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request)
> >> > S3Secret createS3Secret(S3SecretCreateRequest request)
> >> > boolean updateS3Secret(S3SecretUpdateRequest request)
> >> > boolean deleteS3Secret(S3SecretDeleteRequest request)
> >> >
> >> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request)
> >> > AzureSecret createAzureSecret(AzureSecretCreateRequest request)
> >> > boolean updateAzureSecret(AzureSecretUpdateRequest request)
> >> > boolean deleteAzureSecret(AzureSecretDeleteRequest request)
> >> >
> >> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request)
> >> > GCSSecret createGCSSecret(GCSSecretCreateRequest request)
> >> > boolean updateGCSSecret(GCSSecretUpdateRequest request)
> >> > boolean deleteGCSSecret(GCSSecretDeleteRequest request)
> >> >
> >> >
> >> > Poc[3] : https://github.com/Jayancv/airavata-replica-catalog
> (Defining API calls)
> >> > Draft APIs: refer to the attachment
> replicaCatalogAPIsDocumentation.html[4], which was generated using the PoC [3]
> >> >
> >> > I greatly appreciate your thoughts and feedback on the designs[5], as
> they can help us improve and adopt a more generalized approach.
> Additionally, I would like to identify any other factors that we should
> take into account to minimize potential issues in the future. Are there any
> other considerations that we should keep in mind?
> >> >
> >> >
> >> > [1] - https://github.com/apache/airavata-data-catalog
> >> > [2] - https://github.com/apache/airavata-mft
> >> > [3] - https://github.com/Jayancv/airavata-replica-catalog
> >> > [4] -
> https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
> >> > [5] -
> https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
> >> >
> >> > Thanks.
> >> > --
> >> > Best Regards
> >> >
> >> > Jayan Vidanapathirana
> >> >
> >> > <replicaCatalogAPIsDocumentation.html>
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jayan Vidanapathirana
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jayan Vidanapathirana
> >>
> >
> >
> >
> > --
> > Best Regards
> >
> > Jayan Vidanapathirana
> >
>
>
>
>
> --
>
> Best Regards
>
>
>
> Jayan Vidanapathirana
>
>
>


-- 
Best Regards

Jayan Vidanapathirana

Re: Cybershuttle Replica Catalog API

Posted by "Christie, Marcus Aaron" <ma...@iu.edu>.
Hi Jayan, 

I think we can keep them loosely coupled. When a client is searching for files, it will search the Data Catalog. That will yield data product ids. Then the client needs to resolve those data product ids to replica locations. So that means that the Replica Catalog needs to record the data product id associated with each replica location. 
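
A minimal sketch of that lookup flow, assuming two hypothetical in-memory stand-ins for the catalogs instead of the real gRPC services. The class names and simplified signatures below are illustrative assumptions and are not taken from airavata-data-catalog; the method names registerReplicaLocation and getAllReplicaLocations only echo the names proposed earlier in this thread. The point is that the data product id is the only thing the two catalogs share:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Data Catalog: a search yields data product ids.
class DataCatalogSketch {
    private final Map<String, List<String>> idsByKeyword = new HashMap<>();

    void index(String keyword, String dataProductId) {
        idsByKeyword.computeIfAbsent(keyword, k -> new ArrayList<>()).add(dataProductId);
    }

    List<String> search(String keyword) {
        return idsByKeyword.getOrDefault(keyword, Collections.emptyList());
    }
}

// Hypothetical stand-in for the Replica Catalog: every replica location is
// recorded against the data product id it belongs to.
class ReplicaCatalogSketch {
    private final Map<String, List<String>> locationsByDataProductId = new HashMap<>();

    void registerReplicaLocation(String dataProductId, String location) {
        locationsByDataProductId
                .computeIfAbsent(dataProductId, k -> new ArrayList<>())
                .add(location);
    }

    List<String> getAllReplicaLocations(String dataProductId) {
        return locationsByDataProductId.getOrDefault(dataProductId, Collections.emptyList());
    }
}

class ReplicaLookupExample {
    // Step 1: search the Data Catalog. Step 2: resolve each returned
    // data product id to its replica locations in the Replica Catalog.
    static Map<String, List<String>> resolve(DataCatalogSketch dataCatalog,
                                             ReplicaCatalogSketch replicaCatalog,
                                             String keyword) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String dataProductId : dataCatalog.search(keyword)) {
            result.put(dataProductId, replicaCatalog.getAllReplicaLocations(dataProductId));
        }
        return result;
    }

    public static void main(String[] args) {
        DataCatalogSketch dataCatalog = new DataCatalogSketch();
        ReplicaCatalogSketch replicaCatalog = new ReplicaCatalogSketch();
        dataCatalog.index("climate", "dp-1");                      // made-up id
        replicaCatalog.registerReplicaLocation("dp-1", "s3://example-bucket/run1.nc"); // made-up location
        System.out.println(resolve(dataCatalog, replicaCatalog, "climate"));
    }
}
```

Because neither stand-in holds a reference to the other, the same flow would work whether the two services share a database or not.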

For deployment plans, I haven’t thought much about it, but I think we’ll deploy them as separate services, that is, as separate JVM processes. They will likely connect to the same backend database out of convenience, but they wouldn’t have to; they could have separate databases. Does that answer your question? 

Thanks, 

Marcus 

From: Jayan Vidanapathirana <jc...@gmail.com>
Date: Friday, April 7, 2023 at 2:00 PM
To: dev@airavata.apache.org <de...@airavata.apache.org>
Cc: Dimuthu Upeksha Wannipurage <di...@gmail.com>, Christie, Marcus Aaron <ma...@iu.edu>
Subject: Re: Cybershuttle Replica Catalog API 

Hi Marcus, 

I sincerely appreciate you taking the time to review my changes. Actually, I thought these two services needed to operate independently when I started the API definition. 
If we couple both services, then users can simply use the Data-Catalog Data Product APIs and data models, and I can remove the data product APIs and models from the replica catalog. 


Is there an overall deployment plan for these two catalogs?



Thanks. 



On Thu, Apr 6, 2023 at 7:44 PM Christie, Marcus Aaron <machrist@iu.edu> wrote:

Hi Jayan,

Thanks for sharing. One question, the airavata-data-catalog already has a DATA_PRODUCT table and a way to store a data product's metadata. Could that be used instead of adding a new table?

Or more generally my question is how does this replica catalog API relate to the data catalog API/data model?

Thanks,

Marcus

> On Mar 31, 2023, at 4:11 PM, Jayan Vidanapathirana <jcvidanapathirana@gmail.com> wrote:
> 
> Hi All,
> 
> I have implemented a basic flow (simple create and retrieve) of the replica catalog and drafted a pull request[1] to the Airavata data catalog as a new module. Based on that implementation, I have arrived at the following database structure for the replica catalog, and I would greatly appreciate your thoughts and feedback on the designs[2]. At this stage, only the S3 storage type was considered, as a sample. 
> 
> <Replica Catalog V2.drawio (1).png>
> 
> Also please refer to the following google doc[3] to review the implemented APIs.
> 
> [1] https://github.com/apache/airavata-data-catalog/pull/28
> [2] https://drive.google.com/file/d/1KP-8IWdvpPvjSWUG2t41K7WQXW2f9_qN/view?usp=sharing
> [3] https://docs.google.com/document/d/1U-ok1ICt_EmjjxR9UuACV6g6YYkgoADfm0ECJ9ZDI3k/edit?usp=sharing
> 
> Thank you.
> 
> On Mon, Mar 20, 2023 at 2:20 AM Suresh Marru <smarru@apache.org> wrote:
> Hi Jayan,
> 
> Can you contribute a PR to the data catalog repo so we can keep the feedback on that issue?
> 
> Thanks for your contribution,
> Suresh
> 
>> On Mar 19, 2023, at 12:55 PM, Jayan Vidanapathirana <jcvidanapathirana@gmail.com> wrote:
>> 
>> Hi All,
>> 
>> I have updated the draft code base[1] with a simple workflow for adding data to the replica catalog. The services are not yet finalized and will be enhanced along with the workflow. 
>> 
>> [1] https://github.com/Jayancv/airavata-replica-catalog
>> 
>> Thanks.
>> 
>> On Sat, Feb 25, 2023 at 4:21 PM Jayan Vidanapathirana <jcvidanapathirana@gmail.com> wrote:
>> Hi Dimuthu and Marcus,
>> 
>> Thank you both for checking my PoC and providing valuable feedback.
>> 
>> Dimuthu,
>> • I agree with you regarding replica location categories. It should be a data catalog level attribute. 
>> • To manage replica data access permissions, don't we need user information at the replica catalog level? I'm a bit confused about the permission management side of this catalog. 
>> • ReplicaListEntry - Added to expose the list of DataReplicaLocations with basic details in AllDataReplicaGetResponse, which provides all the replica items for the given product_id. However, I was not considering the hierarchical structure here. ReplicaGroupEntry is actually a single product replica that holds the file structure of the replica data. Following your suggestion, we can model AllDataReplicaGetResponse as follows:
>> message AllDataReplicaGetResponse {
>>   string data_product_id = 1;
>>   repeated ReplicaGroupEntry replica_list = 2;
>> }
>> 
>> message ReplicaGroupEntry {
>>   string replica_group_id = 1;
>>   repeated ReplicaGroupEntry directories = 2;
>>   repeated DataReplicaLocation files = 3;
>> }
>> 
>> Marcus, 
>> • Yes, I will remove replica_id from the data catalog diagram. 
>> • I added parent_data_product_id to the replica data with the full context of the replica catalog and data catalog relationship in mind. Within the replica catalog context, however, there is no such parent product relationship, so we can rename it to data_product_id. Thanks for pointing this out. 
>> 
>> Thanks.
>> 
>> On Thu, Feb 23, 2023 at 2:48 AM Christie, Marcus Aaron <machrist@iu.edu> wrote:
>> Hi Jayan,
>> 
>> I would like to echo Dimuthu and say that this looks great and I appreciate the effort in your pulling this all together. I have some feedback to share.
>> 
>> The high-level architecture diagram shows the replica id being stored in the data catalog. That was an initial idea that we had, but we decided that the replica catalog would store the data product id. That seems reflected in your API design so I think you already know this, but I wanted to point it out since the diagram might be a little confusing for others.
>> 
>> In the ReplicaCatalogAPI.proto the name of the data product id field is "parent_data_product_id". I would suggest calling it "data_product_id" instead. "parent_data_product_id" means "the id of the parent data product of this data product" in the data catalog. It might be confusing to use the same name in ReplicaCatalogAPI.proto.
>> 
>> 
>> Thanks,
>> 
>> Marcus
>> 
>> > On Feb 18, 2023, at 3:09 PM, Jayan Vidanapathirana <jcvidanapathirana@gmail.com> wrote:
>> > 
>> > Hi All, 
>> > 
>> > As a new contributor to the Cybershuttle project, I have been actively involved in implementing the Data Replica Catalog. This new catalog is designed to interface with both the Apache Airavata Data Catalog [1] and Airavata MFT[2]. This replica catalog should be able to store each replica resource storage details and secret/credential details specific to the storage type. The proposed high-level architecture will be as follows:
>> > 
>> > 
>> > 
>> > I will mainly work on the highlighted area (red box) and, as an initial step, have started defining the APIs that communicate with the Replica Catalog. These API calls will be gRPC APIs, and the following methods will be implemented:
>> > 
>> > Replica Registration
>> > 
>> > • registerReplicaLocation(DataReplicaCreateRequest createRequest)
>> > • updateReplicaLocation(DataReplicaCreateRequest updateRequest)
>> > • DataReplicaLocationModel getReplicaLocation(DataReplicaGetRequest getReplicaRequest)
>> > • removeReplicaLocation(DataReplicaDeleteRequest deleteReplicaRequest)
>> > • getAllReplicaLocations(AllDataReplicaGetRequest allDataGetRequest)
>> > • removeAllReplicaLocations(AllDataReplicaDeleteRequest allDataDeleteRequest)
>> > 
>> > Storage Registration
>> > 
>> > registerSecretForStorage(SecretForStorage request)
>> > deleteSecretsForStorage(SecretForStorageDeleteRequest request)
>> > getSecretForStorage(SecretForStorageGetRequest request)
>> > searchStorages(StorageSearchRequest request)
>> > listStorages(StorageListRequest request)
>> > resolveStorageType (StorageTypeResolveRequest request)
>> > 
>> > Storage - Internal APIs
>> > 
>> > S3StorageListResponse listS3Storage(S3StorageListRequest request) 
>> > Optional<S3Storage> getS3Storage(S3StorageGetRequest request) 
>> > S3Storage createS3Storage(S3StorageCreateRequest request) 
>> > boolean updateS3Storage(S3StorageUpdateRequest request) 
>> > boolean deleteS3Storage(S3StorageDeleteRequest request) 
>> > 
>> > AzureStorageListResponse listAzureStorage(AzureStorageListRequest request) 
>> > Optional<AzureStorage> getAzureStorage(AzureStorageGetRequest request) 
>> > AzureStorage createAzureStorage(AzureStorageCreateRequest request) 
>> > boolean updateAzureStorage(AzureStorageUpdateRequest request) 
>> > boolean deleteAzureStorage(AzureStorageDeleteRequest request) 
>> > 
>> > GCSStorageListResponse listGCSStorage(GCSStorageListRequest request) 
>> > Optional<GCSStorage> getGCSStorage(GCSStorageGetRequest request) 
>> > GCSStorage createGCSStorage(GCSStorageCreateRequest request) 
>> > boolean updateGCSStorage(GCSStorageUpdateRequest request) 
>> > boolean deleteGCSStorage(GCSStorageDeleteRequest request) 
>> > 
>> > Secret Registration
>> > 
>> > registerSecret(SecretRegistrationRequest request)
>> > deleteSecret(SecretDeleteRequest request)
>> > resolveStorageType (StorageTypeResolveRequest request)
>> > 
>> > Secret - Internal APIs
>> > 
>> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request) 
>> > S3Secret createS3Secret(S3SecretCreateRequest request) 
>> > boolean updateS3Secret(S3SecretUpdateRequest request) 
>> > boolean deleteS3Secret(S3SecretDeleteRequest request) 
>> > 
>> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request) 
>> > AzureSecret createAzureSecret(AzureSecretCreateRequest request) 
>> > boolean updateAzureSecret(AzureSecretUpdateRequest request) 
>> > boolean deleteAzureSecret(AzureSecretDeleteRequest request) 
>> > 
>> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request) 
>> > GCSSecret createGCSSecret(GCSSecretCreateRequest request) 
>> > boolean updateGCSSecret(GCSSecretUpdateRequest request) 
>> > boolean deleteGCSSecret(GCSSecretDeleteRequest request) 
>> > 
>> > 
>> > Poc[3] : https://github.com/Jayancv/airavata-replica-catalog (Defining API calls)
>> > Draft APIs: refer to the attachment replicaCatalogAPIsDocumentation.html[4], which was generated using the PoC [3]
>> > 
>> > I greatly appreciate your thoughts and feedback on the designs[5], as they can help us improve and adopt a more generalized approach. Additionally, I would like to identify any other factors that we should take into account to minimize potential issues in the future. Are there any other considerations that we should keep in mind? 
>> > 
>> > 
>> > [1] - https://github.com/apache/airavata-data-catalog
>> > [2] - https://github.com/apache/airavata-mft
>> > [3] - https://github.com/Jayancv/airavata-replica-catalog
>> > [4] - https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
>> > [5] - https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
>> > 
>> > Thanks.
>> > -- 
>> > Best Regards
>> > 
>> > Jayan Vidanapathirana
>> > 
>> > <replicaCatalogAPIsDocumentation.html>
>> 
>> 
>> 
>> -- 
>> Best Regards
>> 
>> Jayan Vidanapathirana
>> 
>> 
>> 
>> -- 
>> Best Regards
>> 
>> Jayan Vidanapathirana
>> 
> 
> 
> 
> -- 
> Best Regards
> 
> Jayan Vidanapathirana
> 





-- 
Best Regards


Jayan Vidanapathirana 













Re: Cybershuttle Replica Catalog API

Posted by Jayan Vidanapathirana <jc...@gmail.com>.
Hi Marcus,

I sincerely appreciate you taking the time to review my changes. Actually,
I thought these two services needed to operate independently when I
started the API definition.
If we couple both services, then users can simply use the Data-Catalog Data
Product APIs and data models, and I can remove the data product APIs and
models from the replica catalog.

Is there an overall deployment plan for these two catalogs?

Thanks.

On Thu, Apr 6, 2023 at 7:44 PM Christie, Marcus Aaron <ma...@iu.edu>
wrote:

> Hi Jayan,
>
> Thanks for sharing. One question, the airavata-data-catalog already has a
> DATA_PRODUCT table and a way to store a data product's metadata. Could that
> be used instead of adding a new table?
>
> Or more generally my question is how does this replica catalog API relate
> to the data catalog API/data model?
>
> Thanks,
>
> Marcus
>
> > On Mar 31, 2023, at 4:11 PM, Jayan Vidanapathirana <
> jcvidanapathirana@gmail.com> wrote:
> >
> > Hi All,
> >
> > I have implemented a basic flow (simple create and retrieve) of the replica
> catalog and drafted a pull request[1] to the Airavata data catalog as a new
> module. Based on that implementation, I have arrived at the following
> database structure for the replica catalog, and I would greatly appreciate
> your thoughts and feedback on the designs[2]. At this stage, only the S3
> storage type was considered, as a sample.
> >
> > <Replica Catalog V2.drawio (1).png>
> >
> > Also please refer to the following google doc[3] to review the
> implemented APIs.
> >
> > [1] https://github.com/apache/airavata-data-catalog/pull/28
> > [2]
> https://drive.google.com/file/d/1KP-8IWdvpPvjSWUG2t41K7WQXW2f9_qN/view?usp=sharing
> > [3]
> https://docs.google.com/document/d/1U-ok1ICt_EmjjxR9UuACV6g6YYkgoADfm0ECJ9ZDI3k/edit?usp=sharing
> >
> > Thank you.
> >
> > On Mon, Mar 20, 2023 at 2:20 AM Suresh Marru <sm...@apache.org> wrote:
> > Hi Jayan,
> >
> > Can you contribute a PR to the data catalog repo so we can keep the
> feedback on that issue?
> >
> > Thanks for your contribution,
> > Suresh
> >
> >> On Mar 19, 2023, at 12:55 PM, Jayan Vidanapathirana <
> jcvidanapathirana@gmail.com> wrote:
> >>
> >> Hi All,
> >>
> >> I have updated the draft code base[1] with a simple workflow for adding
> data to the replica catalog. The services are not yet finalized and will be
> enhanced along with the workflow.
> >>
> >> [1] https://github.com/Jayancv/airavata-replica-catalog
> >>
> >> Thanks.
> >>
> >> On Sat, Feb 25, 2023 at 4:21 PM Jayan Vidanapathirana <
> jcvidanapathirana@gmail.com> wrote:
> >> Hi Dimuthu and Marcus,
> >>
> >> Thank you both for checking my PoC and providing valuable feedback.
> >>
> >> Dimuthu,
> >>      • I agree with you regarding replica location categories. It
> should be a data catalog level attribute.
> >>      • To manage replica data access permissions, don't we need user
> information at the replica catalog level? I'm a bit confused about the
> permission management side of this catalog.
> >>      • ReplicaListEntry - Added to expose the list of DataReplicaLocations
> with basic details in AllDataReplicaGetResponse, which provides all the
> replica items for the given product_id. However, I was not considering the
> hierarchical structure here. ReplicaGroupEntry is actually a single product
> replica that holds the file structure of the replica data. Following your
> suggestion, we can model AllDataReplicaGetResponse as follows:
> >> message AllDataReplicaGetResponse {
> >>   string data_product_id = 1;
> >>   repeated ReplicaGroupEntry replica_list = 2;
> >> }
> >>
> >> message ReplicaGroupEntry {
> >>   string replica_group_id = 1;
> >>   repeated ReplicaGroupEntry directories = 2;
> >>   repeated DataReplicaLocation files = 3;
> >> }
> >>
> >> Marcus,
> >>      • Yes, I will remove replica_id  from the data catalog diagram.
> >>      • I added parent_data_product_id to the replica data with the full
> context of the replica catalog and data catalog relationship in mind.
> Within the replica catalog context, however, there is no such parent product
> relationship, so we can rename it to data_product_id. Thanks for
> pointing this out.
> >>
> >> Thanks.
> >>
> >> On Thu, Feb 23, 2023 at 2:48 AM Christie, Marcus Aaron <ma...@iu.edu>
> wrote:
> >> Hi Jayan,
> >>
> >> I would like to echo Dimuthu and say that this looks great and I
> appreciate the effort in your pulling this all together.  I have some
> feedback to share.
> >>
> >> The high-level architecture diagram shows the replica id being stored
> in the data catalog. That was an initial idea that we had, but we decided
> that the replica catalog would store the data product id. That seems
> reflected in your API design so I think you already know this, but I wanted
> to point it out since the diagram might be a little confusing for others.
> >>
> >> In the ReplicaCatalogAPI.proto the name of the data product id field is
> "parent_data_product_id". I would suggest calling it "data_product_id"
> instead. "parent_data_product_id" means "the id of the parent data product
> of this data product" in the data catalog. It might be confusing to use the
> same name in ReplicaCatalogAPI.proto.
> >>
> >>
> >> Thanks,
> >>
> >> Marcus
> >>
> >> > On Feb 18, 2023, at 3:09 PM, Jayan Vidanapathirana <
> jcvidanapathirana@gmail.com> wrote:
> >> >
> >> > Hi All,
> >> >
> >> > As a new contributor to the Cybershuttle project, I have been
> actively involved in implementing the Data Replica Catalog. This new
> catalog is designed to interface with both the Apache Airavata Data Catalog
> [1] and Airavata MFT[2]. This replica catalog should be able to store each
> replica resource storage details and secret/credential details specific to
> the storage type. The proposed high-level architecture will be as follows:
> >> >
> >> >
> >> >
> >> > I will mainly work on the highlighted area (red box) and, as an
> initial step, have started defining the APIs that communicate with the
> Replica Catalog. These API calls will be gRPC APIs, and the following
> methods will be implemented:
> >> >
> >> > Replica Registration
> >> >
> >> >       • registerReplicaLocation(DataReplicaCreateRequest
> createRequest)
> >> >       • updateReplicaLocation(DataReplicaCreateRequest updateRequest)
> >> >       • DataReplicaLocationModel
> getReplicaLocation(DataReplicaGetRequest getReplicaRequest)
> >> >       • removeReplicaLocation(DataReplicaDeleteRequest
> deleteReplicaRequest)
> >> >       • getAllReplicaLocations(AllDataReplicaGetRequest
> allDataGetRequest)
> >> >       • removeAllReplicaLocations(AllDataReplicaDeleteRequest
> allDataDeleteRequest)
> >> >
> >> > Storage Registration
> >> >
> >> > registerSecretForStorage(SecretForStorage request)
> >> > deleteSecretsForStorage(SecretForStorageDeleteRequest request)
> >> > getSecretForStorage(SecretForStorageGetRequest request)
> >> > searchStorages(StorageSearchRequest request)
> >> > listStorages(StorageListRequest request)
> >> > resolveStorageType (StorageTypeResolveRequest request)
> >> >
> >> > Storage - Internal APIs
> >> >
> >> > S3StorageListResponse listS3Storage(S3StorageListRequest request)
> >> > Optional<S3Storage> getS3Storage(S3StorageGetRequest request)
> >> > S3Storage createS3Storage(S3StorageCreateRequest request)
> >> > boolean updateS3Storage(S3StorageUpdateRequest request)
> >> > boolean deleteS3Storage(S3StorageDeleteRequest request)
> >> >
> >> > AzureStorageListResponse listAzureStorage(AzureStorageListRequest
> request)
> >> > Optional<AzureStorage> getAzureStorage(AzureStorageGetRequest
> request)
> >> > AzureStorage createAzureStorage(AzureStorageCreateRequest request)
> >> > boolean updateAzureStorage(AzureStorageUpdateRequest request)
> >> > boolean deleteAzureStorage(AzureStorageDeleteRequest request)
> >> >
> >> > GCSStorageListResponse listGCSStorage(GCSStorageListRequest request)
> >> > Optional<GCSStorage> getGCSStorage(GCSStorageGetRequest request)
> >> > GCSStorage createGCSStorage(GCSStorageCreateRequest request)
> >> > boolean updateGCSStorage(GCSStorageUpdateRequest request)
> >> > boolean deleteGCSStorage(GCSStorageDeleteRequest request)
> >> >
> >> > Secret Registration
> >> >
> >> > registerSecret(SecretRegistrationRequest request)
> >> > deleteSecret(SecretDeleteRequest request)
> >> > resolveStorageType (StorageTypeResolveRequest request)
> >> >
> >> > Secret  - Internal APIs
> >> >
> >> > Optional<S3Secret> getS3Secret(S3SecretGetRequest request)
> >> > S3Secret createS3Secret(S3SecretCreateRequest request)
> >> > boolean updateS3Secret(S3SecretUpdateRequest request)
> >> > boolean deleteS3Secret(S3SecretDeleteRequest request)
> >> >
> >> > Optional<AzureSecret> getAzureSecret(AzureSecretGetRequest request)
> >> > AzureSecret createAzureSecret(AzureSecretCreateRequest request)
> >> > boolean updateAzureSecret(AzureSecretUpdateRequest request)
> >> > boolean deleteAzureSecret(AzureSecretDeleteRequest request)
> >> >
> >> > Optional<GCSSecret> getGCSSecret(GCSSecretGetRequest request)
> >> > GCSSecret createGCSSecret(GCSSecretCreateRequest request)
> >> > boolean updateGCSSecret(GCSSecretUpdateRequest request)
> >> > boolean deleteGCSSecret(GCSSecretDeleteRequest request)
> >> >
> >> >
> >> > Poc[3] : https://github.com/Jayancv/airavata-replica-catalog
> (Defining API calls)
> >> > Draft APIs: refer to the attachment
> replicaCatalogAPIsDocumentation.html[4], which was generated using the PoC [3]
> >> >
> >> > I greatly appreciate your thoughts and feedback on the designs[5], as
> they can help us improve and adopt a more generalized approach.
> Additionally, I would like to identify any other factors that we should
> take into account to minimize potential issues in the future. Are there any
> other considerations that we should keep in mind?
> >> >
> >> >
> >> > [1] - https://github.com/apache/airavata-data-catalog
> >> > [2] - https://github.com/apache/airavata-mft
> >> > [3] - https://github.com/Jayancv/airavata-replica-catalog
> >> > [4] -
> https://drive.google.com/file/d/1C4_H_Y5fZ4-5fmIHBNZyh3lXbV7vL5Ah/view?usp=sharing
> >> > [5] -
> https://docs.google.com/document/d/1dQUpHVkccx-O9mbYuAo-wtcLQWJ1LaKUzBpaBMCgSac/edit?usp=sharing
> >> >
> >> > Thanks.
> >> > --
> >> > Best Regards
> >> >
> >> > Jayan Vidanapathirana
> >> >
> >> > <replicaCatalogAPIsDocumentation.html>
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jayan Vidanapathirana
> >>
> >>
> >>
> >> --
> >> Best Regards
> >>
> >> Jayan Vidanapathirana
> >>
> >
> >
> >
> > --
> > Best Regards
> >
> > Jayan Vidanapathirana
> >
>
>

-- 
Best Regards

Jayan Vidanapathirana