Posted to dev@oodt.apache.org by "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov> on 2011/05/24 18:02:03 UTC

More on the Changes to Elements.xml file

All,

So upon testing my first pass at the implementation, I realized that not all of our elements should be of the new subtype, as not all of the elements are backed in the Database.

I see two solutions to this:

1. Change the name of the element tag to a new name, for any element belonging to the new subtype.
2. If the element tag has the new fields I am adding, it is treated as the new subtype, otherwise it is treated as the old element type.

I like solution 1, as it is a bit clearer from reading the xml file that there is a difference between types.
Solution 2 requires fewer changes to the elements.xml file.
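
To make the difference concrete, here is a rough Java sketch of both options. It is only an illustration, not the PCS code: the Element and ColumnBasedElement classes are simplified stand-ins, the ElementReader helper is hypothetical, and the tag names (columnBasedElement, dataType) are placeholders borrowed from the naming discussed later in this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Simplified stand-ins for illustration only; not the real OODT classes.
class Element {
    String name;
}

class ColumnBasedElement extends Element {
    String dataType;
}

// Hypothetical reader showing both ways of choosing the subtype.
class ElementReader {
    // Text of the first descendant tag with this name, or null if absent.
    static String childText(org.w3c.dom.Element node, String tag) {
        NodeList matches = node.getElementsByTagName(tag);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent().trim();
    }

    // Solution 1: a distinct tag name marks the new subtype.
    static Element readByTagName(org.w3c.dom.Element node) {
        return build(node, node.getTagName().equals("columnBasedElement"));
    }

    // Solution 2: the presence of a new child field marks the new subtype.
    static Element readByFields(org.w3c.dom.Element node) {
        return build(node, childText(node, "dataType") != null);
    }

    private static Element build(org.w3c.dom.Element node, boolean isNewType) {
        Element result;
        if (isNewType) {
            ColumnBasedElement c = new ColumnBasedElement();
            c.dataType = childText(node, "dataType");
            result = c;
        } else {
            result = new Element();
        }
        result.name = node.getAttribute("name");
        return result;
    }

    // Parses an XML string and returns the direct child tags of the root.
    static List<org.w3c.dom.Element> parse(String xml) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<org.w3c.dom.Element> out = new ArrayList<>();
            NodeList kids = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                if (kids.item(i).getNodeType() == Node.ELEMENT_NODE) {
                    out.add((org.w3c.dom.Element) kids.item(i));
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

In this sketch the two options only disagree on a tag that is still named "element" but carries a dataType child: readByTagName returns a plain Element for it, while readByFields returns a ColumnBasedElement.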

Do you all have a preferred solution?

-Michael 


On May 19, 2011, at 11:06 PM, Mattmann, Chris A (388J) wrote:

> Hi Michael,
> 
> I'd be happy to reply below, but yes I'd suggest initially CC'ing dev@oodt.apache.org. Giving this information is great, but I prefer the community to benefit from these types of discussions (and it means that I or you or Albert or Cecilia can point others to this question if it is asked again rather than typing the same answer again or fwd'ing something private). Also no worries about researching it -- you can ask questions on that list -- researched or not. We welcome feedback from the community. 
> 
> Answers inline below:
> 
> On May 19, 2011, at 2:44 PM, Starch, Michael D (388L) wrote:
> 
>> Chris,
>> 
>> In order to create tables in the new database architecture that Brian Foster set up for use, we need the ability to specify some information beyond that which is stored in elements.xml.  We wish to store this information as part of PCS, rather than continually haggling back and forth with our DBA in order to get this set in an ad-hoc manner.
>> 
>> The data we need to store is:
>> 
>> -Is it a vector type?
>> -Data-type
>> -Max data size
>> and potentially a few more items.
> 
> Gotcha.
> 
>> 
>> As it stands now, Brian has used the DCElement tag in the elements.xml file to store some of this information, but not all of it.  Thus it seems natural that the rest of this information be stored alongside it.
>> 
>> There seem to be three ways to accomplish this:
>> 
>> 1. Add more information to the DCElement tag, 
>> 2. Add additional tags to the element file containing this information
>> 3. Create a separate file to store this information (Not part of PCS)
> 
> I've got a 4th option.
> 
> 4. Write a new ValidationLayer implementation, that extends XMLValidationLayer, but adds your desired information to elements.xml. For that extra information, I'd favor your proposed option #2 -- I think it's cleaner.
> 
>> 
>> Since you are the head of OODT, we are looking for your input as to which option to choose. Or, if you would rather this question be brought to dev@oodt.apache, where can I research this information before asking the question on that list?
>> 
>> Below is a brief analysis of each option, from our perspective.
>> 
>> Thanks,
>> 
>> Michael Starch
>> 
>> 1. Add more information to the DCElement tag.
>> 	-I believe Brian said this tag was unused, so he "borrowed" it to suit this purpose.
>> 	-Adding more information to the tag would "hijack" this tag to a further extent.
>> 	-The changes could be made locally to our ColumnBasedDataSourceCatalog in the manner Brian already used.
>> 	-If the DCElement tag is used outside of the new use Brian invented, this change could affect the intended (original) use
> 
> I don't like this one (nor do I think others will) since it hijacks dcElement (which is supposed to be a mapping to a Dublin Core element name, rather than used for other information).
> 
>> 
>> 2. Add more tags to elements.xml
>> 	-Most elegant solution, as each data field gets a tag dedicated to holding exactly that field
>> 	-No need to use tags which were originally intended for some other purpose
>> 	-It is an architectural change, changing the format of the elements.xml file
>> 	-Must be done carefully to prevent us from deviating from apache-oodt when it is not necessary to do so.
> 
> +1 for this option.
> 
>> 	
>> 3. A separate file
>> 	-Creates data/configuration duplication, as some of these fields must stay (in some form) in elements.xml
>> 	-Separates similar pieces of data
>> 	-Doesn't alter PCS as it exists now
> 
> This is fine too, but I prefer #2.
> 
>> 
>> We favor number 1 or number 2, as both keep similar data together, and elements.xml feels like the most logical place to store this information.
>> 
>> I personally favor number 2, as it does not use tags for unintended purposes, but I do not wish to make a change that affects our compatibility with apache code.  Hence your input.
>> 
> 
> +1 to your preference. I think the way to accomplish it is to write a YourExtendedValidationLayer that extends XMLValidationLayer and provides:
>  1. YourElementClass extends Element 
>     - provides extra props (getters+setters) that you want to leverage
>  2. implementations of the SerDe for your specialized elements.xml that includes those extra tags
>  3. the ability to access this ValidationLayer information and Element information in ColumnBasedDataSourceCatalog.
> 
> BTW, I think it would be great to bake up patches for Apache OODT for what you are working on, including ColumnBasedDataSourceCatalog (already there but in a branch; it would be nice to forward port it to trunk) and your proposed ValidationLayer and Element extensions. Thoughts? Do you have the time? I think the community at Apache would sincerely appreciate the effort.
> 
> Cheers,
> Chris
> 
> P.S. Would you be OK with me forwarding this thread to dev@oodt.a.o? I think there's some good info here that I'd hate to be lost in the ether...
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> Phone: +1 (818) 354-8810
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 


Re: More on the Changes to Elements.xml file

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Michael,

> 
> I am more than happy to discuss my methods, naming conventions etc.  I am trying to get a working model in my environment by the end of the week, but I have until the end of June to come up with a polished, final solution that ACOS and other projects will use.  

Thanks!

> 
> Below is a description of my method so far:
> 
> I called the new element "columnBasedElement" stealing the naming convention that differentiates "ColumnBasedDataSourceCatalog" from "DataSourceCatalog".

+1, I like it b/c it tells a user, hey, this is going to be used along with the ColumnBasedDataSourceCatalog.

> 
> This new element contains 4 new child tags: dataType, dataSize, isVector (I want to change this to multiValued), and dataDigits.  These are the 4 essential values that I need to fully specify database tables.

How about something shorter like type, size, isMultiValued, and digits (since "data" is just repeated in the beginning)?

> 
> When I read in a "columnBasedElement" I create an instance of ColumnBasedElement, a subclass of Element.  The subclass contains getters and setters for these new child fields.  In this way these new elements are treated as elements by all code, until they find their way back to the code I implement.

+1

> 
> Once back, I use instanceof, and a cast to the subtype (messy but I felt it was less messy than duplicating many lines of code).

I hear ya -- instanceof is something I've resorted to plenty of times in the past -- no worries!

> 
> On top of this I have decided to move my utility that checks that the database is defined consistently with the elements.xml and product-types-to-element-map.xml inside PCS so that it takes advantage of that added functionality. 

Awesome. Do you mean inside of the pcs/core package? If so, +1, I'm thinking that folks that required the CAS services + PCS stuff on top should put their tools inside of that package. 

Also, if you get a chance, check out OODT-147 [1]. I think it will play synergistically with this and some of the functionality will be available as a Core File Manager structure at that point.

Cheers,
Chris


[1] https://issues.apache.org/jira/browse/OODT-147

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: More on the Changes to Elements.xml file

Posted by "Starch, Michael D (388L)" <Mi...@jpl.nasa.gov>.
Chris,

I am more than happy to discuss my methods, naming conventions etc.  I am trying to get a working model in my environment by the end of the week, but I have until the end of June to come up with a polished, final solution that ACOS and other projects will use.  

Below is a description of my method so far:

I called the new element "columnBasedElement" stealing the naming convention that differentiates "ColumnBasedDataSourceCatalog" from "DataSourceCatalog".

This new element contains 4 new child tags: dataType, dataSize, isVector (I want to change this to multiValued), and dataDigits.  These are the 4 essential values that I need to fully specify database tables.
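
As a rough illustration of why these four values are enough to describe a column, here is a small Java sketch. It is not the PCS code: the ColumnSpec class is hypothetical, the type names "varchar" and "numeric" are assumed example values, and I am assuming dataDigits means the number of digits after the decimal point.

```java
import java.util.Locale;

// Hypothetical holder for the four child tags; not part of OODT.
class ColumnSpec {
    final String dataType;   // e.g. "varchar" or "numeric" (assumed values)
    final int dataSize;      // maximum size, or precision for numbers
    final boolean isVector;  // true if the element can hold several values
    final int dataDigits;    // digits after the decimal point (assumed meaning)

    ColumnSpec(String dataType, int dataSize, boolean isVector, int dataDigits) {
        this.dataType = dataType;
        this.dataSize = dataSize;
        this.isVector = isVector;
        this.dataDigits = dataDigits;
    }

    // Builds a SQL-style column definition; the type mapping is an assumption.
    String toColumnDefinition(String columnName) {
        String type = dataType.toLowerCase(Locale.ROOT);
        if (type.equals("varchar")) {
            return columnName + " VARCHAR(" + dataSize + ")";
        }
        if (type.equals("numeric")) {
            return columnName + " NUMERIC(" + dataSize + ", " + dataDigits + ")";
        }
        // Any other type is passed through without size information.
        return columnName + " " + dataType.toUpperCase(Locale.ROOT);
    }
}
```

The isVector flag is carried along but not used in the column definition, since how multi-valued elements map onto tables is not covered in this message.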

When I read in a "columnBasedElement" I create an instance of ColumnBasedElement, a subclass of Element.  The subclass contains getters and setters for these new child fields.  In this way these new elements are treated as elements by all code, until they find their way back to the code I implement.

Once back, I use instanceof, and a cast to the subtype (messy but I felt it was less messy than duplicating many lines of code).
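
The pattern looks roughly like the following Java sketch. The Element class here is a minimal stand-in rather than the real File Manager class, only two of the new fields are shown, and CatalogSketch with its methods is a hypothetical name used for illustration.

```java
// Minimal stand-in for the File Manager Element class; not the real one.
class Element {
    private String elementName;

    public String getElementName() { return elementName; }
    public void setElementName(String elementName) { this.elementName = elementName; }
}

// Subclass carrying the extra fields (only two shown to keep this short).
class ColumnBasedElement extends Element {
    private String dataType;
    private int dataSize;

    public String getDataType() { return dataType; }
    public void setDataType(String dataType) { this.dataType = dataType; }
    public int getDataSize() { return dataSize; }
    public void setDataSize(int dataSize) { this.dataSize = dataSize; }
}

// Hypothetical catalog-side code.
class CatalogSketch {
    // Code that only needs the base type works with either class.
    static String label(Element e) {
        return "element " + e.getElementName();
    }

    // Catalog code narrows back to the subtype with instanceof and a cast.
    static String columnType(Element e) {
        if (e instanceof ColumnBasedElement) {
            ColumnBasedElement c = (ColumnBasedElement) e;
            return c.getDataType() + "(" + c.getDataSize() + ")";
        }
        return null; // a plain Element has no column information
    }
}
```

The cast is confined to one place, so the rest of the code can keep passing plain Element references around.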

On top of this I have decided to move my utility that checks that the database is defined consistently with the elements.xml and product-types-to-element-map.xml inside PCS so that it takes advantage of that added functionality. 
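
The core of such a check can be sketched in Java as a comparison of two sets of names. This is only a simplified illustration, not the actual utility: SchemaConsistencyCheck and its methods are hypothetical, the element names are assumed to have been read from elements.xml already, and the column names are assumed to have been fetched from the database (for example through JDBC metadata). The product-types-to-element-map.xml side is left out.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; compares declared elements with database columns.
class SchemaConsistencyCheck {
    // Elements declared in elements.xml that have no matching column.
    static Set<String> missingColumns(Set<String> elementNames, Set<String> columnNames) {
        Set<String> missing = new TreeSet<>(elementNames);
        missing.removeAll(columnNames);
        return missing;
    }

    // Columns in the database that no element in elements.xml declares.
    static Set<String> undeclaredColumns(Set<String> elementNames, Set<String> columnNames) {
        Set<String> extra = new TreeSet<>(columnNames);
        extra.removeAll(elementNames);
        return extra;
    }

    // The two sides are consistent when neither difference has entries.
    static boolean isConsistent(Set<String> elementNames, Set<String> columnNames) {
        return missingColumns(elementNames, columnNames).isEmpty()
            && undeclaredColumns(elementNames, columnNames).isEmpty();
    }
}
```

A fuller version would also compare the type, size, and digits of each column against the values in the element definition, not just the names.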

-Michael
 

On May 25, 2011, at 10:57 PM, Mattmann, Chris A (388J) wrote:

> Hi Michael,
> 
> Great questions. I think I'd prefer solution 1 too, if we talk through the naming conventions and schema for these new tag names. What would you call them? IOW, how are these elements differentiated in terms of properties or backing? Also what do you mean "backed in the Database"?
> 
> Thanks!
> 
> Cheers,
> Chris
> 


Re: More on the Changes to Elements.xml file

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Michael,

Great questions. I think I'd prefer solution 1 too, if we talk through the naming conventions and schema for these new tag names. What would you call them? IOW, how are these elements differentiated in terms of properties or backing? Also what do you mean "backed in the Database"?

Thanks!

Cheers,
Chris

