Posted to dev@oodt.apache.org by "Nguyen, Ricky" <rn...@chla.usc.edu> on 2012/01/18 19:27:39 UTC

storing a list of objects in metadata

Hi all,

Suppose I wanted to store a list of episodes in metadata. An episode has 2 properties: start time and end time. I can think of 2 ways to do this:

1) use sub-metadata groups
<key>Episode/1/start</key>
<val>2011-01-15</val>
<key>Episode/1/end</key>
<val>2011-01-16</val>
<key>Episode/2/start</key>
<val>2011-01-17</val>
<key>Episode/2/end</key>
<val>2011-01-18</val>

2) declare length, and follow key patterns EpStartN and EpEndN
<key>NumEpisodes</key>
<val>2</val>
<key>EpStart1</key>
<val>2011-01-15</val>
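
For comparison, a minimal Java sketch of how the two key layouts above could be generated from the same list of episodes. The class and method names (EpisodeKeys, groupedKeys, countedKeys) are hypothetical and not part of OODT; a plain Map stands in for the metadata container. In both layouts the set of keys grows with the number of episodes, which is what makes a fixed elements.xml awkward.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not an OODT class.
class EpisodeKeys {

    // Layout 1: sub-metadata groups, Episode/<n>/start and Episode/<n>/end.
    static Map<String, String> groupedKeys(List<String[]> episodes) {
        Map<String, String> met = new LinkedHashMap<>();
        for (int i = 0; i < episodes.size(); i++) {
            int n = i + 1; // episodes are numbered from 1, as in the message
            met.put("Episode/" + n + "/start", episodes.get(i)[0]);
            met.put("Episode/" + n + "/end", episodes.get(i)[1]);
        }
        return met;
    }

    // Layout 2: a declared length plus EpStartN / EpEndN keys.
    static Map<String, String> countedKeys(List<String[]> episodes) {
        Map<String, String> met = new LinkedHashMap<>();
        met.put("NumEpisodes", String.valueOf(episodes.size()));
        for (int i = 0; i < episodes.size(); i++) {
            int n = i + 1;
            met.put("EpStart" + n, episodes.get(i)[0]);
            met.put("EpEnd" + n, episodes.get(i)[1]);
        }
        return met;
    }

    public static void main(String[] args) {
        List<String[]> episodes = new ArrayList<>();
        episodes.add(new String[] {"2011-01-15", "2011-01-16"});
        episodes.add(new String[] {"2011-01-17", "2011-01-18"});
        System.out.println(groupedKeys(episodes));
        System.out.println(countedKeys(episodes));
    }
}
```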

So then the next question is, how do I write elements.xml (in FileMgr policy) to accommodate the variable number of keys?

And a follow-up question: how do I retrieve and use this information when creating PGE config queries? (SQL(Format='$whatgoeshere') SELECT whatgoeshere FROM …)

Or maybe I just shouldn't store objects in metadata...

Thanks,
Ricky

Re: storing a list of objects in metadata

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Cam,

Sent from my iPhone

On Jan 20, 2012, at 11:19 AM, "Cameron Goodale" <go...@apache.org> wrote:

> I think between Ricky, Rishi and Paul R a Java MongoDB catalog is on the horizon. I would imagine the setups would be similar to the JDBC connection settings (URI, port, username, etc.). The real oddity I see is the db will have a dynamic schema. For my use case I would use a (Python) driver to query the catalog instead of going through FileManager via XML-RPC.

FM already has use cases for this, since its Lucene Catalog is basically NoSQL like Mongo, so we can follow a similar model.

> The power of the schema-free collection means you could potentially throw nested metadata (documents) into mongo without ever defining policy, which could potentially mean no need to restart if you throw in a new doc/metadata schema since the underlying catalog doesn't care.

Policy is important in the OODT context (think met extractor definitions as an example), but yes, we can make it transparent to mongo and to the OODT user.

> There is still so much we need to consider, from how to declare indexes to replica sets, etc., but it seems like a worthwhile effort given how often people want to query the catalog to drive web UIs from graphs to maps.

Yep!

Cheers,
Chris (who now loves open source law)

> -Cam
>
> On Thu, Jan 19, 2012 at 7:05 PM, Mattmann, Chris A (388J) <ch...@jpl.nasa.gov> wrote:
>> Super +1!
>>
>> Sent from my iPhone
>>
>> On Jan 19, 2012, at 6:28 PM, "Nguyen, Ricky" <rn...@chla.usc.edu> wrote:
>>
>>> MongoDB? Haha
>>>
>>> Sent from my iPhone
>>>
>>> On Jan 18, 2012, at 6:54 PM, "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> wrote:
>>>
>>>> Right now all the keys must be defined in elements.xml (per the XMLValidationLayer). There have been discussions of any one
>>>> (or all) of the following:
>>>>
>>>> 1. config option in XMLValidationLayer to simply accept all provided metadata. Config option would appear in filemgr.properties
>>>> properly namespaced, and read in the XMLValidationLayerFactory.
>>>> 2. a new ValidationLayer would be created (AcceptAllMetValidationLayer) that did the same thing as #1, but without a config
>>>> option, and the need to officially "change" the selected ValidationLayer extension point.
>>>> 3. the addition of (similar to Apache Solr) the ability to read '*' fields, or regex fields, and to specify those in elements.xml, either
>>>> as (a) an addition to the XMLValidationLayer [with config options]; and/or (b) a new ValidationLayer with the ability to read its
>>>> own form of elements.xml with those types of Fields.
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> CONFIDENTIALITY NOTICE: This e-mail message, including any attachments,
>>> is for the sole use of the intended recipient(s) and may contain confidential
>>> or legally privileged information. Any unauthorized review, use, disclosure
>>> or distribution is prohibited. If you are not the intended recipient, please
>>> contact the sender by reply e-mail and destroy all copies of this original message.
>>>
>>> ---------------------------------------------------------------------
>>>


Re: storing a list of objects in metadata

Posted by Sheryl John <sh...@gmail.com>.
+1!
I agree with Cameron regarding the indexing that would help with querying
the metadata.
MongoDB's multi-key feature indexes arrays of object values. ( #mongoDBLA
:D )

So for Ricky's use-case with the list of Start and End datetimes, we could
just query for a value in the list to get the whole object.
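
To make the idea concrete, here is a plain-Java analogy of matching on one value inside a list of nested episode objects and getting the whole product back. It is only a sketch under stated assumptions: the Episode and Product classes and the findByEpisodeStart method are hypothetical, this is not MongoDB driver code, and the linear scan below is exactly the work an index over the array field would be meant to avoid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for a catalog of nested documents.
class EpisodeQuery {

    static final class Episode {
        final String start;
        final String end;

        Episode(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    static final class Product {
        final String name;
        final List<Episode> episodes;

        Product(String name, List<Episode> episodes) {
            this.name = name;
            this.episodes = episodes;
        }
    }

    // Return every product that has at least one episode starting on the given date.
    static List<Product> findByEpisodeStart(List<Product> catalog, String start) {
        List<Product> hits = new ArrayList<>();
        for (Product p : catalog) {
            for (Episode e : p.episodes) {
                if (e.start.equals(start)) {
                    hits.add(p); // the whole product comes back, not just the matching episode
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Product> catalog = Arrays.asList(
            new Product("A", Arrays.asList(
                new Episode("2011-01-15", "2011-01-16"),
                new Episode("2011-01-17", "2011-01-18"))),
            new Product("B", Arrays.asList(
                new Episode("2011-01-20", "2011-01-21"))));
        for (Product p : findByEpisodeStart(catalog, "2011-01-17")) {
            System.out.println(p.name + " has " + p.episodes.size() + " episodes");
        }
    }
}
```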

On Fri, Jan 20, 2012 at 7:18 AM, Cameron Goodale <go...@apache.org> wrote:

> I think between Ricky, Rishi and Paul R a Java MongoDB catalog is on the
> horizon.  I would imagine the setups would be similar to the JDBC
> connection settings (URI, port, username, etc.).  The real oddity I see
> is the db will have a dynamic schema.  For my use case I would use a
> (Python) driver to query the catalog instead of going through FileManager
> via XML-RPC.
>
> The power of the schema-free collection means you could potentially throw
> nested metadata (documents) into mongo without ever defining policy, which
> could potentially mean no need to restart if you throw in a new
> doc/metadata schema since the underlying catalog doesn't care.
>
> There is still so much we need to consider, from how to declare indexes to
> replica sets, etc., but it seems like a worthwhile effort given how often
> people want to query the catalog to drive web UIs from graphs to maps.
>
> -Cam
>
>
> On Thu, Jan 19, 2012 at 7:05 PM, Mattmann, Chris A (388J) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Super +1!
>>
>> Sent from my iPhone
>>
>> On Jan 19, 2012, at 6:28 PM, "Nguyen, Ricky" <rn...@chla.usc.edu>
>> wrote:
>>
>> > MongoDB? Haha
>> >
>> > Sent from my iPhone
>> >
>> > On Jan 18, 2012, at 6:54 PM, "Mattmann, Chris A (388J)" <
>> chris.a.mattmann@jpl.nasa.gov> wrote:
>> >
>> >> Right now all the keys must be defined in elements.xml (per the
>> >> XMLValidationLayer). There have been discussions of any one
>> >> (or all) of the following:
>> >>
>> >> 1. config option in XMLValidationLayer to simply accept all provided
>> metadata. Config option would appear in filemgr.properties
>> >> properly namespaced, and read in the XMLValidationLayerFactory.
>> >> 2. a new ValidationLayer would be created
>> (AcceptAllMetValidationLayer) that did the same thing as #1, but without a
>> config
>> >> option, and the need to officially "change" the selected
>> ValidationLayer extension point.
>> >> 3. the addition of (similar to Apache Solr) the ability to read '*'
>> fields, or regex fields, and to specify those in elements.xml, either
>> >> as (a) an addition to the XMLValidationLayer [with config options];
>> and/or (b) a new ValidationLayer with the ability to read its
>> >> own form of elements.xml with those types of Fields.
>> >
>> >
>> >
>> >
>>
>
>


-- 
-Sheryl

Re: storing a list of objects in metadata

Posted by Cameron Goodale <go...@apache.org>.
I think between Ricky, Rishi and Paul R a Java MongoDB catalog is on the
horizon.  I would imagine the setups would be similar to the JDBC
connection settings (URI, port, username, etc.).  The real oddity I see
is the db will have a dynamic schema.  For my use case I would use a
(Python) driver to query the catalog instead of going through FileManager
via XML-RPC.

The power of the schema-free collection means you could potentially throw
nested metadata (documents) into mongo without ever defining policy, which
could potentially mean no need to restart if you throw in a new
doc/metadata schema since the underlying catalog doesn't care.

There is still so much we need to consider, from how to declare indexes to
replica sets, etc., but it seems like a worthwhile effort given how often
people want to query the catalog to drive web UIs from graphs to maps.

-Cam

On Thu, Jan 19, 2012 at 7:05 PM, Mattmann, Chris A (388J) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Super +1!
>
> Sent from my iPhone
>
> On Jan 19, 2012, at 6:28 PM, "Nguyen, Ricky" <rn...@chla.usc.edu> wrote:
>
> > MongoDB? Haha
> >
> > Sent from my iPhone
> >
> > On Jan 18, 2012, at 6:54 PM, "Mattmann, Chris A (388J)" <
> chris.a.mattmann@jpl.nasa.gov> wrote:
> >
> >> Right now all the keys must be defined in elements.xml (per the
> >> XMLValidationLayer). There have been discussions of any one
> >> (or all) of the following:
> >>
> >> 1. config option in XMLValidationLayer to simply accept all provided
> metadata. Config option would appear in filemgr.properties
> >> properly namespaced, and read in the XMLValidationLayerFactory.
> >> 2. a new ValidationLayer would be created (AcceptAllMetValidationLayer)
> that did the same thing as #1, but without a config
> >> option, and the need to officially "change" the selected
> ValidationLayer extension point.
> >> 3. the addition of (similar to Apache Solr) the ability to read '*'
> fields, or regex fields, and to specify those in elements.xml, either
> >> as (a) an addition to the XMLValidationLayer [with config options];
> and/or (b) a new ValidationLayer with the ability to read its
> >> own form of elements.xml with those types of Fields.
> >
> >
> >
> >
>

Re: storing a list of objects in metadata

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Super +1!

Sent from my iPhone

On Jan 19, 2012, at 6:28 PM, "Nguyen, Ricky" <rn...@chla.usc.edu> wrote:

> MongoDB? Haha
> 
> Sent from my iPhone
> 
> On Jan 18, 2012, at 6:54 PM, "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> wrote:
> 
>> Right now all the keys must be defined in elements.xml (per the XMLValidationLayer). There have been discussions of any one
>> (or all) of the following:
>> 
>> 1. config option in XMLValidationLayer to simply accept all provided metadata. Config option would appear in filemgr.properties
>> properly namespaced, and read in the XMLValidationLayerFactory.
>> 2. a new ValidationLayer would be created (AcceptAllMetValidationLayer) that did the same thing as #1, but without a config
>> option, and the need to officially "change" the selected ValidationLayer extension point.
>> 3. the addition of (similar to Apache Solr) the ability to read '*' fields, or regex fields, and to specify those in elements.xml, either
>> as (a) an addition to the XMLValidationLayer [with config options]; and/or (b) a new ValidationLayer with the ability to read its
>> own form of elements.xml with those types of Fields.
> 
> 
> 
> 

Re: storing a list of objects in metadata

Posted by "Nguyen, Ricky" <rn...@chla.usc.edu>.
MongoDB? Haha

Sent from my iPhone

On Jan 18, 2012, at 6:54 PM, "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> wrote:

> Right now all the keys must be defined in elements.xml (per the XMLValidationLayer). There have been discussions of any one
> (or all) of the following:
> 
> 1. config option in XMLValidationLayer to simply accept all provided metadata. Config option would appear in filemgr.properties
> properly namespaced, and read in the XMLValidationLayerFactory.
> 2. a new ValidationLayer would be created (AcceptAllMetValidationLayer) that did the same thing as #1, but without a config
> option, and the need to officially "change" the selected ValidationLayer extension point.
> 3. the addition of (similar to Apache Solr) the ability to read '*' fields, or regex fields, and to specify those in elements.xml, either
> as (a) an addition to the XMLValidationLayer [with config options]; and/or (b) a new ValidationLayer with the ability to read its
> own form of elements.xml with those types of Fields.





Re: storing a list of objects in metadata

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Ricky,

On Jan 18, 2012, at 3:35 PM, Nguyen, Ricky wrote:

> Chris,
> 
> I like this idea. Sheryl came up with the same thing independently (great minds!) when we were talking offline.

+1, darn right! Good job guys.

>> 
>> Here's another way that I think will help with your intended use
>> of the metadata (to query for it in the PGE, for example).
>> 
>> <keyval>
>> <key>EpisodeStartDateTime</key>
>> <val>2011-01-17</val>
>> <val>2011-01-18</val>
>> </keyval>
>> 
>> <keyval>
>> <key>EpisodeEndDateTime</key>
>> <val>TBD</val>
>> <val>TBD</val>
>> </keyval>
>> 
>> [...snip...]
> 
> 
> <key>NumEpisodes</key>
> <val>2</val>
> 
> would help with this.

+1, agreed.

>> 
>> No need, it assumes that by default, since it's a NoSQL type store :)
> 
> 
> +1. With your proposed metadata, the _keys_ are still pre-defined in elements.xml as NumEpisodes, EpisodeStartDateTime, EpisodeEndDateTime. But now the _values_ are multi-valued lists which can grow as necessary.

+1.
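
A minimal sketch of reading the parallel multi-valued keys back into episode pairs, under stated assumptions: a plain Map of lists stands in for the metadata container, the EpisodeZip class and its pair method are hypothetical, and the NumEpisodes check is just one way the declared count could be used to catch lists that have drifted out of step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; a Map<String, List<String>> stands in for multi-valued metadata.
class EpisodeZip {

    // Pair the i-th start with the i-th end; each result is {start, end}.
    static List<String[]> pair(Map<String, List<String>> met) {
        List<String> starts = met.getOrDefault("EpisodeStartDateTime", Collections.emptyList());
        List<String> ends = met.getOrDefault("EpisodeEndDateTime", Collections.emptyList());
        if (starts.size() != ends.size()) {
            throw new IllegalStateException("start and end lists differ in length");
        }
        // If a count was declared, make sure it agrees with the lists.
        List<String> declared = met.get("NumEpisodes");
        if (declared != null && !declared.isEmpty()
                && Integer.parseInt(declared.get(0)) != starts.size()) {
            throw new IllegalStateException("NumEpisodes does not match the list length");
        }
        List<String[]> episodes = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            episodes.add(new String[] {starts.get(i), ends.get(i)});
        }
        return episodes;
    }

    public static void main(String[] args) {
        Map<String, List<String>> met = new HashMap<>();
        met.put("NumEpisodes", Arrays.asList("2"));
        met.put("EpisodeStartDateTime", Arrays.asList("2011-01-15", "2011-01-17"));
        met.put("EpisodeEndDateTime", Arrays.asList("2011-01-16", "2011-01-18"));
        for (String[] e : pair(met)) {
            System.out.println(e[0] + " to " + e[1]);
        }
    }
}
```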

> 
> A related question: is there a way to use dynamic metadata keys, or must all keys be predefined in elements.xml ?

Right now all the keys must be defined in elements.xml (per the XMLValidationLayer). There have been discussions of any one
(or all) of the following:

1. a config option in XMLValidationLayer to simply accept all provided metadata. The config option would appear in filemgr.properties,
properly namespaced, and be read in the XMLValidationLayerFactory.
2. a new ValidationLayer (AcceptAllMetValidationLayer) that does the same thing as #1, but without a config
option and with the need to officially "change" the selected ValidationLayer extension point.
3. the ability (similar to Apache Solr) to read '*' fields, or regex fields, and to specify those in elements.xml, either
as (a) an addition to the XMLValidationLayer [with config options]; and/or (b) a new ValidationLayer with the ability to read its
own form of elements.xml with those types of fields.
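
The three ideas above can be sketched side by side in plain Java. Everything here is an assumption for illustration: KeyValidator is a hypothetical one-method interface, not the real OODT ValidationLayer extension point, and the regex used in main is only an example of what a pattern field might look like.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch; not the OODT ValidationLayer API.
class KeyValidationSketch {

    interface KeyValidator {
        boolean accepts(String key);
    }

    // Behaviour described as current: only keys declared up front are accepted.
    static KeyValidator fixed(Set<String> known) {
        return known::contains;
    }

    // Ideas 1 and 2: accept whatever metadata is provided.
    static KeyValidator acceptAll() {
        return key -> true;
    }

    // Idea 3: declared keys plus wildcard or regex fields.
    static KeyValidator withPatterns(Set<String> known, List<Pattern> patterns) {
        return key -> {
            if (known.contains(key)) {
                return true;
            }
            for (Pattern p : patterns) {
                if (p.matcher(key).matches()) {
                    return true;
                }
            }
            return false;
        };
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("NumEpisodes"));
        List<Pattern> patterns = Arrays.asList(Pattern.compile("Episode/\\d+/(start|end)"));
        String key = "Episode/2/start";
        System.out.println("fixed:        " + fixed(known).accepts(key));
        System.out.println("acceptAll:    " + acceptAll().accepts(key));
        System.out.println("withPatterns: " + withPatterns(known, patterns).accepts(key));
    }
}
```

The trade-off is the usual one: accepting everything is the least work but gives up the check that policy provides, while pattern fields keep a declared shape for dynamic keys.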

HTH!

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: storing a list of objects in metadata

Posted by "Nguyen, Ricky" <rn...@chla.usc.edu>.
Chris,

I like this idea. Sheryl came up with the same thing independently (great minds!) when we were talking offline.

Comments inline...

On Jan 18, 2012, at 1:26 PM, Mattmann, Chris A (388J) wrote:

> Hey Ricky,
> 
> On Jan 18, 2012, at 10:27 AM, Nguyen, Ricky wrote:
> 
>> Hi all,
>> 
>> Suppose I wanted to store a list of episodes in metadata. An episode has 2 properties: start time and end time. I can think of 2 ways to do this:
>> 
>> 1) use sub-metadata groups
>> <key>Episode/1/start</key>
>> <val>2011-01-15</val>
>> <key>Episode/1/end</key>
>> <val>2011-01-16</val>
>> <key>Episode/2/start</key>
>> <val>2011-01-17</val>
>> <key>Episode/2/end</key>
>> <val>2011-01-18</val>
>> 
>> 2) declare length, and follow key patterns EpStartN and EpEndN
>> <key>NumEpisodes</key>
>> <val>2</val>
>> <key>EpStart1</key>
>> <val>2011-01-15</val>
> 
> Here's another way that I think will help with your intended use
> of the metadata (to query for it in the PGE, for example).
> 
> <keyval>
> <key>EpisodeStartDateTime</key>
> <val>2011-01-17</val>
> <val>2011-01-18</val>
> </keyval>
> 
> <keyval>
> <key>EpisodeEndDateTime</key>
> <val>TBD</val>
> <val>TBD</val>
> </keyval>
> 
> Then, you just make sure to encode the same number of values
> in each metadata field. You can then pick one of the
> fields to interrogate and do something like:
> 
> Metadata met; // assumed to be populated elsewhere
> List<String> startDateTimes = met.getAllMetadata("EpisodeStartDateTime") != null ?
>     met.getAllMetadata("EpisodeStartDateTime") : new Vector<String>();
> List<String> endDateTimes = met.getAllMetadata("EpisodeEndDateTime") != null ?
>     met.getAllMetadata("EpisodeEndDateTime") : new Vector<String>();
> 
> // the two lists are parallel: index i in each refers to the same episode
> for (int i = 0; i < startDateTimes.size(); i++) {
>     // i == the episodeNum?
>     // if it isn't, then just store an associated EpisodeNum field with the same
>     // size as the start/end date time
>     String startDateTime = startDateTimes.get(i);
>     String endDateTime = endDateTimes.get(i);
> }
> 
> Does that help?


<key>NumEpisodes</key>
<val>2</val>

would help with this.
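
As a rough sketch of how NumEpisodes could help: it gives you something to check the two value lists against. The class below is hypothetical and self-contained; a plain Map stands in for the OODT Metadata object, and the key names are the ones proposed in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical check (not OODT code): does NumEpisodes agree with both lists? */
final class EpisodeCountCheck {

    /** Returns true when NumEpisodes is a single integer equal to the size of both lists. */
    static boolean isConsistent(Map<String, List<String>> met) {
        List<String> num = met.get("NumEpisodes");
        List<String> starts = met.get("EpisodeStartDateTime");
        List<String> ends = met.get("EpisodeEndDateTime");
        if (num == null || num.size() != 1 || starts == null || ends == null) {
            return false;
        }
        int expected;
        try {
            expected = Integer.parseInt(num.get(0).trim());
        } catch (NumberFormatException e) {
            return false; // NumEpisodes is not a number
        }
        return starts.size() == expected && ends.size() == expected;
    }

    public static void main(String[] args) {
        Map<String, List<String>> met = new HashMap<String, List<String>>();
        met.put("NumEpisodes", Arrays.asList("2"));
        met.put("EpisodeStartDateTime", Arrays.asList("2011-01-15", "2011-01-17"));
        met.put("EpisodeEndDateTime", Arrays.asList("2011-01-16", "2011-01-18"));
        System.out.println(isConsistent(met)); // true
    }
}
```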


> 
>> 
>> So then the next question is, how do I write elements.xml (in FileMgr policy) to accommodate the variable number of keys?
> 
> No need; it assumes that by default, since it's a NoSQL-type store :)


+1. With your proposed metadata, the _keys_ are still pre-defined in elements.xml as NumEpisodes, EpisodeStartTime, EpisodeEndTime. But now the _values_ are multi-valued lists which can grow as necessary.

A related question: is there a way to use dynamic metadata keys, or must all keys be predefined in elements.xml ?


> 
>> 
>> And a follow up question is, how do I retrieve/use this information when creating PGE config queries? (SQL(Format='$whatgoeshere') SELECT whatgoeshere FROM …)
> 
> You can query that way and you'll just get back all the values, concatenated together and delimited by ",".


+1. Your proposed metadata is easily query-able.


> 
> Cheers,
> Chris



Re: storing a list of objects in metadata

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hey Ricky,

On Jan 18, 2012, at 10:27 AM, Nguyen, Ricky wrote:

> Hi all,
> 
> Suppose I wanted to store a list of episodes in metadata. An episode has 2 properties: start time and end time. I can think of 2 ways to do this:
> 
> 1) use sub-metadata groups
> <key>Episode/1/start</key>
> <val>2011-01-15</val>
> <key>Episode/1/end</key>
> <val>2011-01-16</val>
> <key>Episode/2/start</key>
> <val>2011-01-17</val>
> <key>Episode/2/end</key>
> <val>2011-01-18</val>
> 
> 2) declare length, and follow key patterns EpStartN and EpEndN
> <key>NumEpisodes</key>
> <val>2</val>
> <key>EpStart1</key>
> <val>2011-01-15</val>

Here's another way that I think will help with your intended use
of the metadata (to query for it in the PGE, for example).

<keyval>
<key>EpisodeStartDateTime</key>
<val>2011-01-17</val>
<val>2011-01-18</val>
</keyval>

<keyval>
<key>EpisodeEndDateTime</key>
<val>TBD</val>
<val>TBD</val>
</keyval>

Then, you just make sure to encode the same number of values
in each metadata field. You can then pick one of the
fields to interrogate and do something like:

Metadata met; // assumed to be populated elsewhere
List<String> startDateTimes = met.getAllMetadata("EpisodeStartDateTime") != null ?
    met.getAllMetadata("EpisodeStartDateTime") : new Vector<String>();
List<String> endDateTimes = met.getAllMetadata("EpisodeEndDateTime") != null ?
    met.getAllMetadata("EpisodeEndDateTime") : new Vector<String>();

// the two lists are parallel: index i in each refers to the same episode
for (int i = 0; i < startDateTimes.size(); i++) {
    // i == the episodeNum?
    // if it isn't, then just store an associated EpisodeNum field with the same
    // size as the start/end date time
    String startDateTime = startDateTimes.get(i);
    String endDateTime = endDateTimes.get(i);
}
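
If you want something that runs without a File Manager around, the same idea can be sketched in a self-contained way. This is only an illustration: a plain Map stands in for the Metadata object, the Episode class and the method names are made up, and the size check is an added assumption (the loop above trusts that both fields hold the same number of values).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not OODT code): zip two parallel value lists into episodes. */
final class EpisodePairing {

    /** Simple value holder for one episode. */
    static final class Episode {
        final String start;
        final String end;

        Episode(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + ".." + end;
        }
    }

    /** Treats a missing key as an empty list, like the null check in the snippet above. */
    static List<String> valuesOf(Map<String, List<String>> met, String key) {
        List<String> vals = met.get(key);
        return vals != null ? vals : Collections.<String>emptyList();
    }

    /** Pairs start/end values by index; rejects lists of different sizes. */
    static List<Episode> pair(Map<String, List<String>> met) {
        List<String> starts = valuesOf(met, "EpisodeStartDateTime");
        List<String> ends = valuesOf(met, "EpisodeEndDateTime");
        if (starts.size() != ends.size()) {
            throw new IllegalArgumentException(
                "start/end counts differ: " + starts.size() + " vs " + ends.size());
        }
        List<Episode> episodes = new ArrayList<Episode>();
        for (int i = 0; i < starts.size(); i++) {
            episodes.add(new Episode(starts.get(i), ends.get(i)));
        }
        return episodes;
    }

    public static void main(String[] args) {
        Map<String, List<String>> met = new HashMap<String, List<String>>();
        met.put("EpisodeStartDateTime", Arrays.asList("2011-01-15", "2011-01-17"));
        met.put("EpisodeEndDateTime", Arrays.asList("2011-01-16", "2011-01-18"));
        System.out.println(pair(met)); // [2011-01-15..2011-01-16, 2011-01-17..2011-01-18]
    }
}
```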

Does that help?

> 
> So then the next question is, how do I write elements.xml (in FileMgr policy) to accommodate the variable number of keys?

No need; it assumes that by default, since it's a NoSQL-type store :)

> 
> And a follow up question is, how do I retrieve/use this information when creating PGE config queries? (SQL(Format='$whatgoeshere') SELECT whatgoeshere FROM …)

You can query that way and you'll just get back all the values, concatenated together and delimited by ",".
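
On the consuming side, turning that delimited string back into a list could look roughly like the sketch below. The class and method names are hypothetical, and the exact shape of the query result is an assumption; the only thing taken from above is the "," delimiter. Note that a value which itself contains a comma could not be recovered this way.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not OODT code): split a ","-delimited value string. */
final class DelimitedValues {

    /** Splits on "," and trims each piece; null or blank input gives an empty list. */
    static List<String> split(String joined) {
        List<String> out = new ArrayList<String>();
        if (joined == null || joined.trim().isEmpty()) {
            return out;
        }
        for (String part : joined.split(",")) {
            out.add(part.trim());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(split("2011-01-17,2011-01-18")); // [2011-01-17, 2011-01-18]
    }
}
```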

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: storing a list of objects in metadata

Posted by Bruce Barkstrom <br...@gmail.com>.
If these are episodes that follow one another, as they
might in a production schedule, an interesting variant
of this list might be the predecessor and successor
links that you'd need to fill in a production
graph.  In other words, Episode 1 might have
Episode 1a and Episode 2 as successors.  You might
want to store the graph, which could be fed to a production
script or to a group of users to give them the expected
time of production.

Bruce b.

On Wed, Jan 18, 2012 at 1:27 PM, Nguyen, Ricky <rn...@chla.usc.edu> wrote:
> Hi all,
>
> Suppose I wanted to store a list of episodes in metadata. An episode has 2 properties: start time and end time. I can think of 2 ways to do this:
>
> 1) use sub-metadata groups
> <key>Episode/1/start</key>
> <val>2011-01-15</val>
> <key>Episode/1/end</key>
> <val>2011-01-16</val>
> <key>Episode/2/start</key>
> <val>2011-01-17</val>
> <key>Episode/2/end</key>
> <val>2011-01-18</val>
>
> 2) declare length, and follow key patterns EpStartN and EpEndN
> <key>NumEpisodes</key>
> <val>2</val>
> <key>EpStart1</key>
> <val>2011-01-15</val>
>
> So then the next question is, how do I write elements.xml (in FileMgr policy) to accommodate the variable number of keys?
>
> And a follow up question is, how do I retrieve/use this information when creating PGE config queries? (SQL(Format='$whatgoeshere') SELECT whatgoeshere FROM …)
>
> Or maybe I just shouldn't store objects in metadata...
>
> Thanks,
> Ricky
