Posted to dev@airavata.apache.org by "ptangcha@umail.iu.edu" <pt...@umail.iu.edu> on 2011/07/29 20:09:05 UTC

GFAC Type Architecture Design

Hi devs,

I want to discuss the type system in GFAC-Core.

Currently, the GFAC module reads and writes the information it needs based on
an XML schema (called GFAC-Schema). The GFAC-Schema library is generated
with XMLBeans (http://xmlbeans.apache.org/) and is referenced in the
project.

Examples of GFAC-Schema types are:
HostTypeDescription, which describes the environment of a host, such as Java
version, temp directory, GridFTP endpoint, etc.
ServiceTypeDescription, which describes a service, such as its parameters,
service name, etc.
GFAC-SimpleType, which defines the simple parameter types of a service, such
as Boolean, Double, Integer, etc.

This is roughly how the system works:
After deploying their software on a computing host, users register
their host, application, and service descriptions via XBaya-GUI (Java Swing).
This registration information is saved to XRegistry as an XML string
that conforms to the XML schema.
When users invoke a (Web) service, GFAC loads the necessary information
(host, application directory, parameters, etc.) and executes the deployed
software.
Then, GFAC parses the output from the software, wraps it, and sends it out in
the appropriate parameter type format.
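
To make the flow above concrete, here is a minimal Java sketch of it. It is only
an illustration under stated assumptions: InMemoryRegistry stands in for
XRegistry, the SimpleType enum and wrapOutput are hypothetical names, and the
execution of the deployed software is stubbed out with a fixed string. None of
this is the actual GFAC API.

```java
import java.util.HashMap;
import java.util.Map;

public class GfacFlowSketch {

    /** Hypothetical stand-in for XRegistry: stores each description as an XML string. */
    public static final class InMemoryRegistry {
        private final Map<String, String> documents = new HashMap<>();

        public void register(String name, String xml) {
            documents.put(name, xml);
        }

        public String load(String name) {
            String xml = documents.get(name);
            if (xml == null) {
                throw new IllegalArgumentException("No description registered for " + name);
            }
            return xml;
        }
    }

    /** Hypothetical subset of the simple parameter types (Boolean, Double, Integer, ...). */
    public enum SimpleType { BOOLEAN, DOUBLE, INTEGER, STRING }

    /** Wraps the raw text printed by the deployed software as a typed value. */
    public static Object wrapOutput(String raw, SimpleType type) {
        String text = raw.trim();
        switch (type) {
            case BOOLEAN: return Boolean.valueOf(text);
            case DOUBLE:  return Double.valueOf(text);
            case INTEGER: return Integer.valueOf(text);
            default:      return text;
        }
    }

    public static void main(String[] args) {
        InMemoryRegistry registry = new InMemoryRegistry();
        // Step 1: a client such as XBaya registers a description as an XML string.
        registry.register("echoService", "<service><name>echoService</name></service>");
        // Step 2: on invocation, the description is loaded back from the registry.
        String description = registry.load("echoService");
        // Step 3: running the deployed software is stubbed out with a fixed output.
        String rawOutput = " 42\n";
        // Step 4: the output is wrapped in the declared parameter type.
        Object result = wrapOutput(rawOutput, SimpleType.INTEGER);
        System.out.println(description + " -> " + result);
    }
}
```

The point of the sketch is that the registry only ever sees XML strings, so the
choice of type system mostly affects how those strings are produced, validated,
and turned back into typed values.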


So, the question is whether we want to continue using XML Schema.
If we agree to use XML Schema, we should import some initial schemas from
OGCE GFAC as a new module in Airavata. We also need to redesign some of the
schemas.
For instance, the current HostType schema requires a GridFTP Endpoint element,
which is unnecessary if a computing host doesn't have GridFTP.
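
As a rough sketch of what such a redesign could look like, the Java snippet
below validates host descriptions against a tiny schema in which the GridFTP
endpoint is optional (minOccurs='0'). The element names hostDescription,
hostName, and gridFTPEndPoint are assumptions made up for this example and are
not taken from the real GFAC-Schema; only the standard javax.xml.validation API
from the JDK is used.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class OptionalGridFtpSketch {

    // Hypothetical schema: hostName is required, gridFTPEndPoint is optional.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='hostDescription'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='hostName' type='xs:string'/>"
        + "<xs:element name='gridFTPEndPoint' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns true if the XML document conforms to the sketch schema above. */
    public static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String withoutGridFtp =
            "<hostDescription><hostName>example.org</hostName></hostDescription>";
        String withGridFtp =
            "<hostDescription><hostName>example.org</hostName>"
            + "<gridFTPEndPoint>gsiftp://example.org:2811</gridFTPEndPoint>"
            + "</hostDescription>";
        System.out.println("without GridFTP valid: " + isValid(withoutGridFtp));
        System.out.println("with GridFTP valid:    " + isValid(withGridFtp));
    }
}
```

With the element marked optional, a host without GridFTP can still be
registered, while a description that omits the required host name is still
rejected.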

Otherwise, what do you propose? POJO, JSON, etc.?

-- 
Best Regards,
Patanachai Tangchaisin

Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by Suresh Marru <sm...@cs.indiana.edu>.
Hi Chris, 

That will be very useful. As soon as we get through the first goal of integrating all the moving components into a cohesive release, we should start focusing on these important tasks. I will take you up on your offer and ask for your help when we get to these tasks.

Appreciate your insights and pointers,

Thanks,
Suresh

On Aug 9, 2011, at 10:31 PM, Mattmann, Chris A (388J) wrote:

> Hey Suresh,
> 
> Yep. Depending on Airavata's repository search needs, we can also pull in Apache Lucene, and Solr, as we need to. I'm very familiar 
> with those technologies and a former member of the Lucene PMC so I know those guys and their technology well.
> 
> Cheers,
> Chris
> 
> On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:
> 
>> 
>> On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
>> 
>>>> It indeed looks like a very active project and the reference implementation for JCR, thanks for the pointer. I was poking through the documentation, but have not yet gotten my hands dirty. It might be quickest to ask you: do you know how easy it will be to add custom schemas and make the content of the documents searchable? For example, can I add a WSDL or a BPEL document and find out, across the repository, which of the application service WSDLs wrap the Gaussian molecular chemistry model? This is just an illustrative example, but I am curious how the indexes will be built for content and how bad the performance will be if we make a lot of content searchable.
>>> 
>>> I definitely think you can do this, as you can define user-tags on the content items at each node in the repository and then search for those nodes 
>>> later on. It's probably best to sign up to user@jackrabbit.apache.org and ask there but that's based on my limited understanding of the system.
>> 
>> Thanks Chris for this additional information.
>> 
>> I will create some JIRA tasks so we can try out JCR and Jackrabbit for some simple repository tasks in gfac and xbaya. I think Airavata will have more complicated repository tasks, but to start with we can try simple examples. As a long-term task, I think it will be better if we consolidate all Airavata repository needs so we can create interfaces and try out different implementations before we agree upon one.
>> 
>> Suresh
>> 
>> 
>>> Thanks,
>>> Chris
>>> 
>>>> 
>>>> Thanks for your insights,
>>>> Suresh
>>>> 
>>>>> 
>>>>> Cheers,
>>>>> Chris
>>>>> 
>>>>> On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:
>>>>> 
>>>>>> Hi All,
>>>>>> 
>>>>>> We are stalled on this thread, so how about getting to a consensus? Since I did not see any further discussion on the use of schemas, should we assume we want to retain XML Schemas and add simplified beans that are easier to work with than the generated XMLBeans? The schemas for reference are at [1]. Also, as Patanachai explained in the original message below, there are three types of schema documents for GFAC, describing the computational host, the application deployment, and finally the service interface. Using these three descriptions, an application service WSDL is generated and GFAC manages the deployed application on various computational resources. There is a mapping between these deployment descriptions. I am reading the JCR API document [2] and am intrigued by its relevance. But my inference is from a theoretical standpoint, and I am wondering if anyone on the list has experience, good or bad, working against the JCR spec.
>>>>>> 
>>>>>> Suresh
>>>>>> 
>>>>>> [1] - https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
>>>>>> [2] - http://jcp.org/en/jsr/detail?id=283
>>>>>> 
>>>>>> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
>>>>>> 
>>>>>>> Hi Patanachai,
>>>>>>> 
>>>>>>> Thanks for explaining the issue in detail. In simple terms, we need multiple client components to register a description of an application and store it in a registry. GFac will need to pull the registered description document and execute and manage the compute job. Along with XBaya as the client which registers the document, there are other clients, including a gadget interface.
>>>>>>> 
>>>>>>> I agree that the current scheme has to be revisited (and minor issues fixed, like the one you mention about the GridFTP tags). But moving from XML Schema to a lightweight option is a bigger question. With a proper bean generation library and serializing/deserializing methods, I personally favor XML Schema, but I do not want to be biased either. I am -1 for POJO simply because it will limit non-Java-based clients like a simple PHP web form. JSON in general sounds like a good alternative, but I do not have experience with it in a validation and schema sense.
>>>>>>> 
>>>>>>> I will wait for others to chime in. If no better alternatives are suggested, I will import the missing GFac schema from the code donation into a commons area - https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Suresh
>>>>>>> 
>>>>>>> On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:
>>>>>>> 
>>>>>>>> Hi devs,
>>>>>>>> 
>>>>>>>> I want to discuss about the type system in GFAC-Core.
>>>>>>>> 
>>>>>>>> Currently, GFAC module read and write a necessary information based on XML
>>>>>>>> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
>>>>>>>> generated from XMLbeans (http://xmlbeans.apache.org/) and is referenced in
>>>>>>>> the project.
>>>>>>>> 
>>>>>>>> Examples of GFAC-Schema are:
>>>>>>>> HostTypeDescription, which describes an environment for a host such as Java
>>>>>>>> version, Temp directory, GridFTP endpoint etc.
>>>>>>>> ServiceTypeDescription, which describes a service such as parameters,
>>>>>>>> service name, etc.
>>>>>>>> GFAC-SimpleType, which defines a simple parameter type to the service such
>>>>>>>> as Boolean, Double, Integer, etc.
>>>>>>>> 
>>>>>>>> This is how system work roughly:
>>>>>>>> After deploying their software on a computing host, users will register
>>>>>>>> their host, application, service description via XBaya-GUI (Java Swing).
>>>>>>>> This registration information will be saved to XRegistry as XML string
>>>>>>>> according to XML schema.
>>>>>>>> When users invoke a (Web) service, GFAC will load the necessary information
>>>>>>>> (host, application directory, parameters, etc.) and execute the deployed
>>>>>>>> software .
>>>>>>>> Then, GFAC parses the output from the software, wraps it and send out as an
>>>>>>>> appropriate parameter type format.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> So, the question is do we want to continue using XML-Schema.
>>>>>>>> If, we agree to use XML-Schema, we should import some initial schema from
>>>>>>>> OGCE GFAC as a new module in Airavata. Also, we need to redesign some
>>>>>>>> schema.
>>>>>>>> For Instance, current HostType schema requires GridFTP Endpoint element
>>>>>>>> which is not necessary if a computing host doesn't have GridFTP.
>>>>>>>> 
>>>>>>>> Otherwise, what do you propose? POJO, JSON, etc.
>>>>>>>> 
>>>>>>>> -- 
>>>>>>>> Best Regards,
>>>>>>>> Patanachai Tangchaisin
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> Chris Mattmann, Ph.D.
>>>>> Senior Computer Scientist
>>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>>>> Office: 171-266B, Mailstop: 171-246
>>>>> Email: chris.a.mattmann@nasa.gov
>>>>> WWW:   http://sunset.usc.edu/~mattmann/
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> Adjunct Assistant Professor, Computer Science Department
>>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 


Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Lahiru,

These deps would be pulled in the same way as other deps -- specified via 
Maven and/or any other dependency framework and thus wouldn't really 
add any more complexity.

Cheers,
Chris

On Aug 10, 2011, at 5:56 AM, Lahiru Gunathilake wrote:

> I think these approaches make users' lives harder: we would need to tell users
> to download Jackrabbit/Solr or Lucene, set them up, and configure a
> Lucene or Solr server, since we are not going to ship these products
> with Airavata.
> 
> Or are we planning to ship Jackrabbit with Lucene or Solr integrated? I
> propose that we experiment with some other repositories based on our requirements
> and then proceed with a repository that addresses them.
> 
> Regards
> Lahiru
> 
> On Tue, Aug 9, 2011 at 10:31 PM, Mattmann, Chris A (388J) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
> 
>> Hey Suresh,
>> 
>> Yep. Depending on Airavata's repository search needs, we can also pull in
>> Apache Lucene, and Solr, as we need to. I'm very familiar
>> with those technologies and a former member of the Lucene PMC so I know
>> those guys and their technology well.
>> 
>> Cheers,
>> Chris
>> 
>> On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:
>> 
>>> 
>>> On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
>>> 
>>>>> It indeed looks like a very active project and the reference
>> implementation for JCR, thank for the pointer. I was poking through the
>> documentation, but did not get yet get my hands dirty. It might be quick to
>> ask you, do you know how easy will it be to add custom schemas and make the
>> content of the document searchable? For example, can I add a WSDL or a BPEL
>> document and find out across the repository which of the application
>> services wsdl's wrap Gaussian molecular chemistry model? This is a just an
>> illustrative example, but I am curious how the indexes will be built for
>> content and how bad the performance will be if we make lot of content
>> searchable.
>>>> 
>>>> I definitely think you can do this, as you can define user-tags on the
>> content items at each node in the repository and then search for those nodes
>>>> later on. It's probably best to sign up to user@jackrabbit.apache.org and ask there but that's based on my limited understanding of the system.
>>> 
>>> Thanks Chris for this additional information.
>>> 
>>> I will create some JIRA tasks so we can try out JCR and Jackrabbit for
>> some simple repository tasks in gfac and xbaya. I think Airavata will have
>> more complicated repository tasks, but to start with we can try simple
>> examples. As a long term task I think it will be better we consolidate all
>> Airavata repository needs so we can create interfaces and try out different
>> implementations before we agree upon one.
>>> 
>>> Suresh
>>> 
>>> 
>>>> Thanks,
>>>> Chris
>>>> 
>>>>> 
>>>>> Thanks for your insights,
>>>>> Suresh
>>>>> 
>>>>>> 
>>>>>> Cheers,
>>>>>> Chris
>>>>>> 
>>>>>> On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:
>>>>>> 
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> We are stalled on this thread, so how about getting to a consensus.
>> Since I did not see any further discussion on the use of schemas, should we
>> assume we want to retain XML Schemas and add simplified beans to easily work
>> with instead of generated xmlbeans? The schemas for reference are at [1].
>> Also, as Patanachai explained in the original message below, there are three
>> types of schema documents for GFAC to describe the computational host,
>> application deployment description and finally service interface. Using
>> these three descriptions, a application service wsdl is generated and GFAC
>> manages the deployed application on various computational resources. There
>> is a mapping between these deployment descriptions. I am reading the JCR API
>> document [2] and intrigued by the relevance. But my inference is from a
>> theoretical stand point and wondering if any one on the list has experience
>> good and bad on working against JCR spec.
>>>>>>> 
>>>>>>> Suresh
>>>>>>> 
>>>>>>> [1] -
>> https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
>>>>>>> [2] - http://jcp.org/en/jsr/detail?id=283
>>>>>>> 
>>>>>>> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
>>>>>>> 
>>>>>>>> Hi Patanachai,
>>>>>>>> 
>>>>>>>> Thanks for explaining the issue in detail. In simple terms, we need
>> multiple client components register a description about an application and
>> store it in a registry. GFac will need to pull the registered description
>> document and execute and manage the compute job. Along with XBaya as the
>> client which registers the document, there are other clients including a
>> gadget interface.
>>>>>>>> 
>>>>>>>> I agree that the current scheme has to revisited (and fix minor
>> issues like you mention about the gridftp tags). But  moving from xmlschema
>> to a light weight option is a bigger question. With a proper bean generation
>> library and serializing/deserializing methods I personally favor xml schema
>> but I do not want to be biased either. I am -1 for POJO simply because it
>> will limit non-java bases clients like a simple php web form. JSON in
>> general sounds like a good alternative, but I do not experience with it in a
>> validation and schema sense.
>>>>>>>> 
>>>>>>>> I will wait for others to chime in, if there are no better
>> alternatives suggestion, I will import the missing GFac schema from code
>> donation into a commons area -
>> https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Suresh
>>>>>>>> 
>>>>>>>> On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:
>>>>>>>> 
>>>>>>>>> Hi devs,
>>>>>>>>> 
>>>>>>>>> I want to discuss about the type system in GFAC-Core.
>>>>>>>>> 
>>>>>>>>> Currently, GFAC module read and write a necessary information based
>> on XML
>>>>>>>>> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
>>>>>>>>> generated from XMLbeans (http://xmlbeans.apache.org/) and is
>> referenced in
>>>>>>>>> the project.
>>>>>>>>> 
>>>>>>>>> Examples of GFAC-Schema are:
>>>>>>>>> HostTypeDescription, which describes an environment for a host such
>> as Java
>>>>>>>>> version, Temp directory, GridFTP endpoint etc.
>>>>>>>>> ServiceTypeDescription, which describes a service such as
>> parameters,
>>>>>>>>> service name, etc.
>>>>>>>>> GFAC-SimpleType, which defines a simple parameter type to the
>> service such
>>>>>>>>> as Boolean, Double, Integer, etc.
>>>>>>>>> 
>>>>>>>>> This is how system work roughly:
>>>>>>>>> After deploying their software on a computing host, users will
>> register
>>>>>>>>> their host, application, service description via XBaya-GUI (Java
>> Swing).
>>>>>>>>> This registration information will be saved to XRegistry as XML
>> string
>>>>>>>>> according to XML schema.
>>>>>>>>> When users invoke a (Web) service, GFAC will load the necessary
>> information
>>>>>>>>> (host, application directory, parameters, etc.) and execute the
>> deployed
>>>>>>>>> software .
>>>>>>>>> Then, GFAC parses the output from the software, wraps it and send
>> out as an
>>>>>>>>> appropriate parameter type format.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> So, the question is do we want to continue using XML-Schema.
>>>>>>>>> If, we agree to use XML-Schema, we should import some initial
>> schema from
>>>>>>>>> OGCE GFAC as a new module in Airavata. Also, we need to redesign
>> some
>>>>>>>>> schema.
>>>>>>>>> For Instance, current HostType schema requires GridFTP Endpoint
>> element
>>>>>>>>> which is not necessary if a computing host doesn't have GridFTP.
>>>>>>>>> 
>>>>>>>>> Otherwise, what do you propose? POJO, JSON, etc.
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Patanachai Tangchaisin
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 
> -- 
> System Analyst Programmer
> PTI Lab
> Indiana University


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by Lahiru Gunathilake <gl...@gmail.com>.
I think these approaches make users' lives harder: we would need to tell users
to download Jackrabbit/Solr or Lucene, set them up, and configure a
Lucene or Solr server, since we are not going to ship these products
with Airavata.

Or are we planning to ship Jackrabbit with Lucene or Solr integrated? I
propose that we experiment with some other repositories based on our requirements
and then proceed with a repository that addresses them.

Regards
Lahiru

On Tue, Aug 9, 2011 at 10:31 PM, Mattmann, Chris A (388J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hey Suresh,
>
> Yep. Depending on Airavata's repository search needs, we can also pull in
> Apache Lucene, and Solr, as we need to. I'm very familiar
> with those technologies and a former member of the Lucene PMC so I know
> those guys and their technology well.
>
> Cheers,
> Chris
>
> On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:
>
> >
> > On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
> >
> >>> It indeed looks like a very active project and the reference
> implementation for JCR, thank for the pointer. I was poking through the
> documentation, but did not get yet get my hands dirty. It might be quick to
> ask you, do you know how easy will it be to add custom schemas and make the
> content of the document searchable? For example, can I add a WSDL or a BPEL
> document and find out across the repository which of the application
> services wsdl's wrap Gaussian molecular chemistry model? This is a just an
> illustrative example, but I am curious how the indexes will be built for
> content and how bad the performance will be if we make lot of content
> searchable.
> >>
> >> I definitely think you can do this, as you can define user-tags on the
> content items at each node in the repository and then search for those nodes
> >> later on. It's probably best to sign up to user@jackrabbit.apache.org and ask there but that's based on my limited understanding of the system.
> >
> > Thanks Chris for this additional information.
> >
> > I will create some JIRA tasks so we can try out JCR and Jackrabbit for
> some simple repository tasks in gfac and xbaya. I think Airavata will have
> more complicated repository tasks, but to start with we can try simple
> examples. As a long term task I think it will be better we consolidate all
> Airavata repository needs so we can create interfaces and try out different
> implementations before we agree upon one.
> >
> > Suresh
> >
> >
> >> Thanks,
> >> Chris
> >>
> >>>
> >>> Thanks for your insights,
> >>> Suresh
> >>>
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:
> >>>>
> >>>>> Hi All,
> >>>>>
> >>>>> We are stalled on this thread, so how about getting to a consensus.
> Since I did not see any further discussion on the use of schemas, should we
> assume we want to retain XML Schemas and add simplified beans to easily work
> with instead of generated xmlbeans? The schemas for reference are at [1].
> Also, as Patanachai explained in the original message below, there are three
> types of schema documents for GFAC to describe the computational host,
> application deployment description and finally service interface. Using
> these three descriptions, a application service wsdl is generated and GFAC
> manages the deployed application on various computational resources. There
> is a mapping between these deployment descriptions. I am reading the JCR API
> document [2] and intrigued by the relevance. But my inference is from a
> theoretical stand point and wondering if any one on the list has experience
> good and bad on working against JCR spec.
> >>>>>
> >>>>> Suresh
> >>>>>
> >>>>> [1] -
> https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
> >>>>> [2] - http://jcp.org/en/jsr/detail?id=283
> >>>>>
> >>>>> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
> >>>>>
> >>>>>> Hi Patanachai,
> >>>>>>
> >>>>>> Thanks for explaining the issue in detail. In simple terms, we need
> multiple client components register a description about an application and
> store it in a registry. GFac will need to pull the registered description
> document and execute and manage the compute job. Along with XBaya as the
> client which registers the document, there are other clients including a
> gadget interface.
> >>>>>>
> >>>>>> I agree that the current scheme has to revisited (and fix minor
> issues like you mention about the gridftp tags). But  moving from xmlschema
> to a light weight option is a bigger question. With a proper bean generation
> library and serializing/deserializing methods I personally favor xml schema
> but I do not want to be biased either. I am -1 for POJO simply because it
> will limit non-java bases clients like a simple php web form. JSON in
> general sounds like a good alternative, but I do not experience with it in a
> validation and schema sense.
> >>>>>>
> >>>>>> I will wait for others to chime in, if there are no better
> alternatives suggestion, I will import the missing GFac schema from code
> donation into a commons area -
> https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Suresh
> >>>>>>
> >>>>>> On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:
> >>>>>>
> >>>>>>> Hi devs,
> >>>>>>>
> >>>>>>> I want to discuss about the type system in GFAC-Core.
> >>>>>>>
> >>>>>>> Currently, GFAC module read and write a necessary information based
> on XML
> >>>>>>> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
> >>>>>>> generated from XMLbeans (http://xmlbeans.apache.org/) and is
> referenced in
> >>>>>>> the project.
> >>>>>>>
> >>>>>>> Examples of GFAC-Schema are:
> >>>>>>> HostTypeDescription, which describes an environment for a host such
> as Java
> >>>>>>> version, Temp directory, GridFTP endpoint etc.
> >>>>>>> ServiceTypeDescription, which describes a service such as
> parameters,
> >>>>>>> service name, etc.
> >>>>>>> GFAC-SimpleType, which defines a simple parameter type to the
> service such
> >>>>>>> as Boolean, Double, Integer, etc.
> >>>>>>>
> >>>>>>> This is how system work roughly:
> >>>>>>> After deploying their software on a computing host, users will
> register
> >>>>>>> their host, application, service description via XBaya-GUI (Java
> Swing).
> >>>>>>> This registration information will be saved to XRegistry as XML
> string
> >>>>>>> according to XML schema.
> >>>>>>> When users invoke a (Web) service, GFAC will load the necessary
> information
> >>>>>>> (host, application directory, parameters, etc.) and execute the
> deployed
> >>>>>>> software .
> >>>>>>> Then, GFAC parses the output from the software, wraps it and send
> out as an
> >>>>>>> appropriate parameter type format.
> >>>>>>>
> >>>>>>>
> >>>>>>> So, the question is do we want to continue using XML-Schema.
> >>>>>>> If, we agree to use XML-Schema, we should import some initial
> schema from
> >>>>>>> OGCE GFAC as a new module in Airavata. Also, we need to redesign
> some
> >>>>>>> schema.
> >>>>>>> For Instance, current HostType schema requires GridFTP Endpoint
> element
> >>>>>>> which is not necessary if a computing host doesn't have GridFTP.
> >>>>>>>
> >>>>>>> Otherwise, what do you propose? POJO, JSON, etc.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards,
> >>>>>>> Patanachai Tangchaisin
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >
>
>
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University

Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Suresh,

Yep. Depending on Airavata's repository search needs, we can also pull in Apache Lucene, and Solr, as we need to. I'm very familiar 
with those technologies and a former member of the Lucene PMC so I know those guys and their technology well.

Cheers,
Chris

On Aug 9, 2011, at 7:28 PM, Suresh Marru wrote:

> 
> On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:
> 
>>> It indeed looks like a very active project and the reference implementation for JCR, thanks for the pointer. I was poking through the documentation, but have not yet gotten my hands dirty. It might be quickest to ask you: do you know how easy it will be to add custom schemas and make the content of the documents searchable? For example, can I add a WSDL or a BPEL document and find out, across the repository, which of the application service WSDLs wrap the Gaussian molecular chemistry model? This is just an illustrative example, but I am curious how the indexes will be built for content and how bad the performance will be if we make a lot of content searchable.
>> 
>> I definitely think you can do this, as you can define user tags on the content items at each node in the repository and then search for those nodes later on. That is based on my limited understanding of the system, though, so it's probably best to sign up to user@jackrabbit.apache.org and ask there.
> 
> Thanks, Chris, for this additional information.
> 
> I will create some JIRA tasks so we can try out JCR and Jackrabbit for some simple repository tasks in GFac and XBaya. I think Airavata will have more complicated repository tasks, but to start with we can try simple examples. As a long-term task, I think it would be better to consolidate all of Airavata's repository needs so we can create interfaces and try out different implementations before we agree upon one.
> 
> Suresh


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by Suresh Marru <sm...@cs.indiana.edu>.
On Aug 9, 2011, at 1:25 PM, Mattmann, Chris A (388J) wrote:

>> It indeed looks like a very active project and the reference implementation for JCR; thanks for the pointer. I was poking through the documentation but have not yet gotten my hands dirty. It might be quicker to ask you: do you know how easy it will be to add custom schemas and make the content of the documents searchable? For example, can I add a WSDL or a BPEL document and then find out, across the repository, which of the application service WSDLs wrap the Gaussian molecular chemistry model? This is just an illustrative example, but I am curious how the indexes will be built for content and how bad the performance will be if we make a lot of content searchable.
> 
> I definitely think you can do this, as you can define user tags on the content items at each node in the repository and then search for those nodes later on. That is based on my limited understanding of the system, though, so it's probably best to sign up to user@jackrabbit.apache.org and ask there.

Thanks, Chris, for this additional information.

I will create some JIRA tasks so we can try out JCR and Jackrabbit for some simple repository tasks in GFac and XBaya. I think Airavata will have more complicated repository tasks, but to start with we can try simple examples. As a long-term task, I think it would be better to consolidate all of Airavata's repository needs so we can create interfaces and try out different implementations before we agree upon one.
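
As a rough illustration of the "create interfaces and try out different implementations" idea, here is a minimal Java sketch. The names DescriptionRepository and InMemoryDescriptionRepository, the path strings, and the XML snippet are all hypothetical and not part of Airavata; the in-memory map only stands in for a real backend such as Jackrabbit or XRegistry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical storage contract for GFac description documents. */
interface DescriptionRepository {
    /** Stores (or overwrites) the XML document registered under the given path. */
    void save(String path, String xml);

    /** Returns the document at the given path, or empty if nothing is registered. */
    Optional<String> load(String path);
}

/** Map-backed stand-in; a JCR-backed class could implement the same interface. */
final class InMemoryDescriptionRepository implements DescriptionRepository {
    private final Map<String, String> documents = new LinkedHashMap<>();

    @Override
    public void save(String path, String xml) {
        documents.put(path, xml);
    }

    @Override
    public Optional<String> load(String path) {
        return Optional.ofNullable(documents.get(path));
    }
}

final class RepositoryDemo {
    public static void main(String[] args) {
        // Callers see only the interface, never the concrete backend.
        DescriptionRepository repo = new InMemoryDescriptionRepository();
        repo.save("/hosts/example-host", "<HostDescription/>");
        System.out.println(repo.load("/hosts/example-host").orElse("not found"));
    }
}
```

Client code such as GFac or XBaya would depend only on the interface, so a different implementation could be swapped in later without touching the callers.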

Suresh




Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Suresh,

On Aug 9, 2011, at 10:09 AM, Suresh Marru wrote:

> On Aug 9, 2011, at 1:00 PM, Mattmann, Chris A (388J) wrote:
> 
>> Hey Guys,
>> 
>> I would check out the Apache Jackrabbit project:
>> 
>> http://jackrabbit.apache.org/
>> 
>> It's a full implementation of the JCR spec and very active and healthy as Apache projects go.
> 
> Hi Chris,
> 
> It indeed looks like a very active project and the reference implementation for JCR; thanks for the pointer. I was poking through the documentation but have not yet gotten my hands dirty. It might be quicker to ask you: do you know how easy it will be to add custom schemas and make the content of the documents searchable? For example, can I add a WSDL or a BPEL document and then find out, across the repository, which of the application service WSDLs wrap the Gaussian molecular chemistry model? This is just an illustrative example, but I am curious how the indexes will be built for content and how bad the performance will be if we make a lot of content searchable.

I definitely think you can do this, as you can define user tags on the content items at each node in the repository and then search for those nodes later on. That is based on my limited understanding of the system, though, so it's probably best to sign up to user@jackrabbit.apache.org and ask there.
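
To make the tagging idea concrete, below is a small in-memory sketch in Java. It is not the JCR or Jackrabbit API; the class TaggedNodeIndex, its methods, and the node paths are assumptions made up for illustration. It only shows the general shape: attach tags to node paths, then look the nodes up by tag.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of tagged repository nodes; NOT the JCR API. */
final class TaggedNodeIndex {
    // Maps a normalized tag to the sorted set of node paths carrying it.
    private final Map<String, Set<String>> pathsByTag = new HashMap<>();

    /** Attaches a tag to the node at the given path. */
    void tag(String nodePath, String tag) {
        pathsByTag.computeIfAbsent(normalize(tag), key -> new TreeSet<>()).add(nodePath);
    }

    /** Returns the paths of all nodes carrying the tag, in sorted order. */
    List<String> find(String tag) {
        Set<String> paths = pathsByTag.getOrDefault(normalize(tag), Collections.<String>emptySet());
        return new ArrayList<>(paths);
    }

    // Tags are matched case-insensitively and ignoring surrounding whitespace.
    private static String normalize(String tag) {
        return tag.trim().toLowerCase(Locale.ROOT);
    }
}

final class TagDemo {
    public static void main(String[] args) {
        TaggedNodeIndex index = new TaggedNodeIndex();
        index.tag("/services/example-gaussian.wsdl", "Gaussian");
        index.tag("/services/example-other.wsdl", "other-model");
        System.out.println(index.find("gaussian"));
    }
}
```

A lookup here is a single map access, so its cost does not depend on document size; indexing the full content of every WSDL or BPEL document would be a different and heavier problem, which is why the Jackrabbit list is the right place to ask about it.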

Thanks,
Chris





Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by Suresh Marru <sm...@apache.org>.
On Aug 9, 2011, at 1:00 PM, Mattmann, Chris A (388J) wrote:

> Hey Guys,
> 
> I would check out the Apache Jackrabbit project:
> 
> http://jackrabbit.apache.org/
> 
> It's a full implementation of the JCR spec and very active and healthy as Apache projects go.

Hi Chris,

It indeed looks like a very active project and the reference implementation for JCR; thanks for the pointer. I was poking through the documentation but have not yet gotten my hands dirty. It might be quicker to ask you: do you know how easy it will be to add custom schemas and make the content of the documents searchable? For example, can I add a WSDL or a BPEL document and then find out, across the repository, which of the application service WSDLs wrap the Gaussian molecular chemistry model? This is just an illustrative example, but I am curious how the indexes will be built for content and how bad the performance will be if we make a lot of content searchable.

Thanks for your insights,
Suresh



Re: Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Guys,

I would check out the Apache Jackrabbit project:

http://jackrabbit.apache.org/

It's a full implementation of the JCR spec and very active and healthy as Apache projects go.

Cheers,
Chris

On Aug 9, 2011, at 9:55 AM, Suresh Marru wrote:

> Hi All,
> 
> We are stalled on this thread, so how about getting to a consensus? Since I did not see any further discussion on the use of schemas, should we assume we want to retain XML Schemas and add simplified beans that are easier to work with than the generated XMLBeans? The schemas for reference are at [1]. Also, as Patanachai explained in the original message below, there are three types of schema documents for GFAC, describing the computational host, the application deployment, and finally the service interface. Using these three descriptions, an application service WSDL is generated and GFAC manages the deployed application on various computational resources. There is a mapping between these deployment descriptions. I am reading the JCR API document [2] and am intrigued by its relevance. But my inference is from a theoretical standpoint, and I am wondering if anyone on the list has experience, good or bad, working against the JCR spec.
> 
> Suresh
> 
> [1] - https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
> [2] - http://jcp.org/en/jsr/detail?id=283
> 
> On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:
> 
>> Hi Patanachai,
>> 
>> Thanks for explaining the issue in detail. In simple terms, we need multiple client components to register a description of an application and store it in a registry. GFac will need to pull the registered description document and execute and manage the compute job. Along with XBaya as the client that registers the document, there are other clients, including a gadget interface.
>> 
>> I agree that the current schema has to be revisited (and minor issues fixed, like the one you mention about the GridFTP tags). But moving from XML Schema to a lightweight option is a bigger question. With a proper bean generation library and serializing/deserializing methods, I personally favor XML Schema, but I do not want to be biased either. I am -1 for POJO simply because it will limit non-Java-based clients like a simple PHP web form. JSON in general sounds like a good alternative, but I do not have experience with it in a validation and schema sense.
>> 
>> I will wait for others to chime in; if no better alternatives are suggested, I will import the missing GFac schema from the code donation into a commons area - https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
>> 
>> Cheers,
>> Suresh
>> 
>> On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:
>> 
>>> Hi devs,
>>> 
>>> I want to discuss about the type system in GFAC-Core.
>>> 
>>> Currently, GFAC module read and write a necessary information based on XML
>>> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
>>> generated from XMLbeans (http://xmlbeans.apache.org/) and is referenced in
>>> the project.
>>> 
>>> Examples of GFAC-Schema are:
>>> HostTypeDescription, which describes an environment for a host such as Java
>>> version, Temp directory, GridFTP endpoint etc.
>>> ServiceTypeDescription, which describes a service such as parameters,
>>> service name, etc.
>>> GFAC-SimpleType, which defines a simple parameter type to the service such
>>> as Boolean, Double, Integer, etc.
>>> 
>>> This is how system work roughly:
>>> After deploying their software on a computing host, users will register
>>> their host, application, service description via XBaya-GUI (Java Swing).
>>> This registration information will be saved to XRegistry as XML string
>>> according to XML schema.
>>> When users invoke a (Web) service, GFAC will load the necessary information
>>> (host, application directory, parameters, etc.) and execute the deployed
>>> software .
>>> Then, GFAC parses the output from the software, wraps it and send out as an
>>> appropriate parameter type format.
>>> 
>>> 
>>> So, the question is do we want to continue using XML-Schema.
>>> If, we agree to use XML-Schema, we should import some initial schema from
>>> OGCE GFAC as a new module in Airavata. Also, we need to redesign some
>>> schema.
>>> For Instance, current HostType schema requires GridFTP Endpoint element
>>> which is not necessary if a computing host doesn't have GridFTP.
>>> 
>>> Otherwise, what do you propose? POJO, JSON, etc.
>>> 
>>> -- 
>>> Best Regards,
>>> Patanachai Tangchaisin
>> 
> 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Usefulness of JCR for GFac Descriptions (was Re: GFAC Type Architecture Design)

Posted by Suresh Marru <sm...@apache.org>.
Hi All,

We are stalled on this thread, so how about getting to a consensus? Since I did not see any further discussion on the use of schemas, should we assume that we want to retain XML Schemas and add simplified beans that are easier to work with than the generated XMLBeans? The schemas are at [1] for reference.

Also, as Patanachai explained in the original message below, GFAC uses three types of schema documents, describing the computational host, the application deployment, and the service interface. From these three descriptions, an application service WSDL is generated, and GFAC manages the deployed application on various computational resources. There is a mapping between these deployment descriptions.

I am reading the JCR API document [2] and am intrigued by its relevance. But my inference is from a theoretical standpoint, and I am wondering if anyone on the list has experience, good or bad, working against the JCR spec.
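
To make the three descriptions and the mapping between them concrete, here is a minimal sketch in plain Java. It is not the GFAC-Schema itself: the class names, the fields, and the assumption that a service points to an application, which in turn is deployed on one or more hosts, are all hypothetical and only meant to illustrate the relationship.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class DescriptionMappingSketch {
    // Hypothetical: describes a computational host.
    static final class HostDescription {
        final String hostName;
        final String tempDirectory;
        HostDescription(String hostName, String tempDirectory) {
            this.hostName = hostName;
            this.tempDirectory = tempDirectory;
        }
    }

    // Hypothetical: describes how one application is deployed on one host.
    static final class ApplicationDeployment {
        final String applicationName;
        final String hostName;
        final String executablePath;
        ApplicationDeployment(String applicationName, String hostName, String executablePath) {
            this.applicationName = applicationName;
            this.hostName = hostName;
            this.executablePath = executablePath;
        }
    }

    // Hypothetical: describes the service interface exposed for an application.
    static final class ServiceDescription {
        final String serviceName;
        final String applicationName;
        final List<String> inputParameters;
        ServiceDescription(String serviceName, String applicationName, List<String> inputParameters) {
            this.serviceName = serviceName;
            this.applicationName = applicationName;
            this.inputParameters = inputParameters;
        }
    }

    // Follows the assumed mapping service -> application deployment -> host
    // and returns every known host on which the service's application is deployed.
    static List<HostDescription> hostsFor(ServiceDescription service,
                                          List<ApplicationDeployment> deployments,
                                          Map<String, HostDescription> hostsByName) {
        List<HostDescription> result = new ArrayList<>();
        for (ApplicationDeployment deployment : deployments) {
            if (deployment.applicationName.equals(service.applicationName)) {
                HostDescription host = hostsByName.get(deployment.hostName);
                if (host != null) {
                    result.add(host);
                }
            }
        }
        return result;
    }
}
```

Whatever storage we pick (XML strings, JCR nodes, or something else), it would need to preserve links of this kind between the three documents.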

Suresh

[1] - https://svn.apache.org/repos/asf/incubator/airavata/trunk/modules/commons/gfac-schema/schemas/
[2] - http://jcp.org/en/jsr/detail?id=283

On Aug 1, 2011, at 12:07 AM, Suresh Marru wrote:

> Hi Patanachai,
> 
> Thanks for explaining the issue in detail. In simple terms, we need multiple client components to register a description of an application and store it in a registry. GFac will need to pull the registered description document, then execute and manage the compute job. Besides XBaya, the client that registers the document, there are other clients, including a gadget interface.
> 
> I agree that the current schema has to be revisited (and minor issues fixed, like the GridFTP tags you mention). But moving from XML Schema to a lightweight option is a bigger question. With a proper bean generation library and serializing/deserializing methods, I personally favor XML Schema, but I do not want to be biased either. I am -1 for POJO simply because it will limit non-Java-based clients like a simple PHP web form. JSON in general sounds like a good alternative, but I do not have experience with it in a validation and schema sense.
> 
> I will wait for others to chime in. If no better alternatives are suggested, I will import the missing GFac schema from the code donation into a commons area - https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/
> 
> Cheers,
> Suresh
> 
> On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:
> 
>> Hi devs,
>> 
>> I want to discuss about the type system in GFAC-Core.
>> 
>> Currently, GFAC module read and write a necessary information based on XML
>> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
>> generated from XMLbeans (http://xmlbeans.apache.org/) and is referenced in
>> the project.
>> 
>> Examples of GFAC-Schema are:
>> HostTypeDescription, which describes an environment for a host such as Java
>> version, Temp directory, GridFTP endpoint etc.
>> ServiceTypeDescription, which describes a service such as parameters,
>> service name, etc.
>> GFAC-SimpleType, which defines a simple parameter type to the service such
>> as Boolean, Double, Integer, etc.
>> 
>> This is how system work roughly:
>> After deploying their software on a computing host, users will register
>> their host, application, service description via XBaya-GUI (Java Swing).
>> This registration information will be saved to XRegistry as XML string
>> according to XML schema.
>> When users invoke a (Web) service, GFAC will load the necessary information
>> (host, application directory, parameters, etc.) and execute the deployed
>> software .
>> Then, GFAC parses the output from the software, wraps it and send out as an
>> appropriate parameter type format.
>> 
>> 
>> So, the question is do we want to continue using XML-Schema.
>> If, we agree to use XML-Schema, we should import some initial schema from
>> OGCE GFAC as a new module in Airavata. Also, we need to redesign some
>> schema.
>> For Instance, current HostType schema requires GridFTP Endpoint element
>> which is not necessary if a computing host doesn't have GridFTP.
>> 
>> Otherwise, what do you propose? POJO, JSON, etc.
>> 
>> -- 
>> Best Regards,
>> Patanachai Tangchaisin
> 


Re: GFAC Type Architecture Design

Posted by Suresh Marru <sm...@apache.org>.
Hi Patanachai,

Thanks for explaining the issue in detail. In simple terms, we need multiple client components to register a description of an application and store it in a registry. GFac will need to pull the registered description document, then execute and manage the compute job. Besides XBaya, the client that registers the document, there are other clients, including a gadget interface.
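
As a rough illustration of that register-then-pull flow, the registry can be thought of as a store of description documents keyed by name. The sketch below is an in-memory stand-in, not the XRegistry API; the class and method names are hypothetical. Documents are kept as plain strings so that any client (XBaya, a gadget, a web form) could produce them.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a description registry.
class InMemoryDescriptionRegistry {
    private final Map<String, String> documents = new ConcurrentHashMap<>();

    // Called by a client to store (or replace) a description document.
    void register(String name, String descriptionDocument) {
        documents.put(name, descriptionDocument);
    }

    // Called by the executing side to pull a registered document, if present.
    Optional<String> lookup(String name) {
        return Optional.ofNullable(documents.get(name));
    }
}
```

The executing side would call lookup with the service name and parse the returned document before running the job.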

I agree that the current schema has to be revisited (and minor issues fixed, like the GridFTP tags you mention). But moving from XML Schema to a lightweight option is a bigger question. With a proper bean generation library and serializing/deserializing methods, I personally favor XML Schema, but I do not want to be biased either. I am -1 for POJO simply because it will limit non-Java-based clients like a simple PHP web form. JSON in general sounds like a good alternative, but I do not have experience with it in a validation and schema sense.
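
For what it is worth, the GridFTP issue looks like a small schema change rather than a reason to drop XML Schema. Below is a minimal sketch using the standard javax.xml.validation API. The schema is a simplified, hypothetical host description (the element names hostName and gridFTPEndpoint are assumptions, not the real GFAC-Schema); it marks the GridFTP element as optional with minOccurs='0' and validates documents against it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class HostSchemaCheck {
    // Hypothetical, simplified schema: hostName is required,
    // gridFTPEndpoint is optional because of minOccurs='0'.
    static final String HOST_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='host'><xs:complexType><xs:sequence>"
        + "<xs:element name='hostName' type='xs:string'/>"
        + "<xs:element name='gridFTPEndpoint' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the XML document validates against HOST_XSD.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(HOST_XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            // Validation errors surface as SAXException with the default error handler.
            return false;
        }
    }
}
```

With this schema, a host document without a GridFTP endpoint validates, while one missing the required host name does not. The same validation step is available to non-Java clients through their own XML Schema tooling.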

I will wait for others to chime in. If no better alternatives are suggested, I will import the missing GFac schema from the code donation into a commons area - https://svn.apache.org/repos/asf/incubator/airavata/donations/ogce-donation/modules/utils/schemas/gfac-schema-utils/

Cheers,
Suresh

On Jul 29, 2011, at 2:09 PM, ptangcha@umail.iu.edu wrote:

> Hi devs,
> 
> I want to discuss about the type system in GFAC-Core.
> 
> Currently, GFAC module read and write a necessary information based on XML
> schema (called GFAC-Schema) as a definition. GFAC-Schema library is
> generated from XMLbeans (http://xmlbeans.apache.org/) and is referenced in
> the project.
> 
> Examples of GFAC-Schema are:
> HostTypeDescription, which describes an environment for a host such as Java
> version, Temp directory, GridFTP endpoint etc.
> ServiceTypeDescription, which describes a service such as parameters,
> service name, etc.
> GFAC-SimpleType, which defines a simple parameter type to the service such
> as Boolean, Double, Integer, etc.
> 
> This is how system work roughly:
> After deploying their software on a computing host, users will register
> their host, application, service description via XBaya-GUI (Java Swing).
> This registration information will be saved to XRegistry as XML string
> according to XML schema.
> When users invoke a (Web) service, GFAC will load the necessary information
> (host, application directory, parameters, etc.) and execute the deployed
> software .
> Then, GFAC parses the output from the software, wraps it and send out as an
> appropriate parameter type format.
> 
> 
> So, the question is do we want to continue using XML-Schema.
> If, we agree to use XML-Schema, we should import some initial schema from
> OGCE GFAC as a new module in Airavata. Also, we need to redesign some
> schema.
> For Instance, current HostType schema requires GridFTP Endpoint element
> which is not necessary if a computing host doesn't have GridFTP.
> 
> Otherwise, what do you propose? POJO, JSON, etc.
> 
> -- 
> Best Regards,
> Patanachai Tangchaisin