Posted to dev@airavata.apache.org by "Pierce, Marlon" <ma...@iu.edu> on 2020/11/09 21:22:29 UTC

Re: [External] Re: Data Catalog into its own repo

+1 for this refactoring. 

On 11/9/20, 9:47 AM, "Pamidighantam, Sudhakar" <pa...@iu.edu> wrote:

    +2. 

    Thanks,
    Sudhakar.

    On 11/9/20, 9:37 AM, "Marru, Suresh" <sm...@iu.edu> wrote:

        Hi All,

        The Airavata experiment catalog has evolved over time, and although the replica catalog and data product models are standalone, they are buried too deeply to be used outside the experiment context. Any objections to refactoring the experiment catalog and making the data catalog a first-class repo, along the lines of Custos and MFT?

        Cheers,
        Suresh


Re: [External] Data Catalog into its own repo

Posted by "Pierce, Marlon" <ma...@iu.edu>.
Schema.org has a lot of uptake in the NSF EarthCube community (https://www.earthcube.org/p418) and the DataONE activity (https://ui.adsabs.harvard.edu/abs/2018AGUFMIN31B..29M/abstract).   

 

Marlon

 

 

From: Suresh Marru <sm...@apache.org>
Reply-To: dev <de...@airavata.apache.org>
Date: Tuesday, November 10, 2020 at 1:13 PM
To: dev <de...@airavata.apache.org>
Subject: Re: [External] Data Catalog into its own repo

 

Thank you all for weighing in. I bootstrapped the repo with some basic information; please help set the goals for this refactored subsystem: https://github.com/apache/airavata-data-lake

 

While surveying the literature and software for open source metadata and provenance systems we could integrate with, I found this survey paper useful: https://www.researchgate.net/profile/Carlos_Saenz-Adan/publication/323242431_A_systematic_review_of_provenance_systems/links/5b34ae1caca2720785effb1a/A-systematic-review-of-provenance-systems.pdf

 

It seems we can build fairly flexible yet sophisticated capabilities using the schema.org vocabulary serialized as JSON-LD (https://schema.org/). Thoughts?

 

Please contribute any other pointers we should brainstorm before proceeding. 

 

Cheers,

Suresh



On Nov 9, 2020, at 4:22 PM, Pierce, Marlon <ma...@iu.edu> wrote:

 

+1 for this refactoring. 

On 11/9/20, 9:47 AM, "Pamidighantam, Sudhakar" <pa...@iu.edu> wrote:

   +2. 

   Thanks,
   Sudhakar.

   On 11/9/20, 9:37 AM, "Marru, Suresh" <sm...@iu.edu> wrote:

       Hi All,

       The Airavata experiment catalog has evolved over time, and although the replica catalog and data product models are standalone, they are buried too deeply to be used outside the experiment context. Any objections to refactoring the experiment catalog and making the data catalog a first-class repo, along the lines of Custos and MFT?

       Cheers,
       Suresh

 


Re: [External] Data Catalog into its own repo

Posted by Suresh Marru <sm...@apache.org>.
Also, a pointer to a potential JSON-LD Java implementation to consider: https://github.com/jsonld-java/jsonld-java

Suresh

> On Nov 10, 2020, at 1:06 PM, Suresh Marru <sm...@apache.org> wrote:
> 
> Thank you all for weighing in. I bootstrapped the repo with some basic information; please help set the goals for this refactored subsystem: https://github.com/apache/airavata-data-lake
> 
> While surveying the literature and software for open source metadata and provenance systems we could integrate with, I found this survey paper useful: https://www.researchgate.net/profile/Carlos_Saenz-Adan/publication/323242431_A_systematic_review_of_provenance_systems/links/5b34ae1caca2720785effb1a/A-systematic-review-of-provenance-systems.pdf
> 
> It seems we can build fairly flexible yet sophisticated capabilities using the schema.org vocabulary serialized as JSON-LD (https://schema.org/). Thoughts?
> 
> Please contribute any other pointers we should brainstorm before proceeding. 
> 
> Cheers,
> Suresh
> 
>> On Nov 9, 2020, at 4:22 PM, Pierce, Marlon <ma...@iu.edu> wrote:
>> 
>> +1 for this refactoring. 
>> 
>> On 11/9/20, 9:47 AM, "Pamidighantam, Sudhakar" <pa...@iu.edu> wrote:
>> 
>>    +2. 
>> 
>>    Thanks,
>>    Sudhakar.
>> 
>>    On 11/9/20, 9:37 AM, "Marru, Suresh" <sm...@iu.edu> wrote:
>> 
>>        Hi All,
>> 
>>        The Airavata experiment catalog has evolved over time, and although the replica catalog and data product models are standalone, they are buried too deeply to be used outside the experiment context. Any objections to refactoring the experiment catalog and making the data catalog a first-class repo, along the lines of Custos and MFT?
>> 
>>        Cheers,
>>        Suresh
>> 
> 


Re: [External] Data Catalog into its own repo

Posted by Suresh Marru <sm...@apache.org>.
Thank you all for weighing in. I bootstrapped the repo with some basic information; please help set the goals for this refactored subsystem: https://github.com/apache/airavata-data-lake

While surveying the literature and software for open source metadata and provenance systems we could integrate with, I found this survey paper useful: https://www.researchgate.net/profile/Carlos_Saenz-Adan/publication/323242431_A_systematic_review_of_provenance_systems/links/5b34ae1caca2720785effb1a/A-systematic-review-of-provenance-systems.pdf

It seems we can build fairly flexible yet sophisticated capabilities using the schema.org vocabulary serialized as JSON-LD (https://schema.org/). Thoughts?
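
To make the idea concrete, here is a rough sketch of how a data product and its replicas might be described with schema.org terms in JSON-LD. This is only an illustration under assumptions: the helper build_dataset_record, the "airavata:" identifier scheme, the example URLs, and the choice to map each replica to a "distribution" entry are all hypothetical and not taken from the Airavata code base. Only the vocabulary terms (Dataset, DataDownload, identifier, name, distribution, contentUrl, encodingFormat) come from schema.org.

```python
import json


def build_dataset_record(product_id, name, replica_urls,
                         encoding_format="application/octet-stream"):
    """Build a schema.org Dataset as a JSON-LD-shaped dict.

    Hypothetical mapping: one data product -> one Dataset,
    one replica location -> one DataDownload under "distribution".
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "identifier": product_id,
        "name": name,
        "distribution": [
            {
                "@type": "DataDownload",
                "contentUrl": url,
                "encodingFormat": encoding_format,
            }
            for url in replica_urls
        ],
    }


# Example values are made up for illustration.
record = build_dataset_record(
    "airavata:data-product/example-001",
    "Example simulation output",
    [
        "https://storage-a.example.org/out/result.dat",
        "https://storage-b.example.org/out/result.dat",
    ],
)

# Serialize to JSON-LD text; any JSON-LD processor could consume this.
print(json.dumps(record, indent=2))
```

Treating replicas as distributions is just one possible mapping; whether it fits the existing replica catalog model is something we would need to check.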

Please contribute any other pointers we should brainstorm before proceeding. 

Cheers,
Suresh

> On Nov 9, 2020, at 4:22 PM, Pierce, Marlon <ma...@iu.edu> wrote:
> 
> +1 for this refactoring. 
> 
> On 11/9/20, 9:47 AM, "Pamidighantam, Sudhakar" <pa...@iu.edu> wrote:
> 
>    +2. 
> 
>    Thanks,
>    Sudhakar.
> 
>    On 11/9/20, 9:37 AM, "Marru, Suresh" <sm...@iu.edu> wrote:
> 
>        Hi All,
> 
>        The Airavata experiment catalog has evolved over time, and although the replica catalog and data product models are standalone, they are buried too deeply to be used outside the experiment context. Any objections to refactoring the experiment catalog and making the data catalog a first-class repo, along the lines of Custos and MFT?
> 
>        Cheers,
>        Suresh
>