Posted to xindice-dev@xml.apache.org by Dave Viner <dv...@yahoo-inc.com> on 2002/11/26 06:39:18 UTC

first impl of metadata storage for xindice

Hi,
	For the past few weeks, I've been working on developing the metadata
storage facility that we discussed long ago on the xindice-dev list.
(See http://marc.theaimsgroup.com/?t=102873960400001&r=1&w=2 for
discussion.)
Essentially, this capability is a major advantage to Xindice when it's being
used as something like a content management system.  All such systems provide
some mechanism similar to this.  Such functionality is not really available
within Xindice currently, and it takes a while to add on such features.  As
such, these features would definitely help Xindice attract CMS developers
who would like to use a native XML database.  Locally, we have been using
a modified version of Xindice with these features, and it is extremely
useful.

	Here are a series of new files and patches that will enable the
storage and retrieval of metadata within Xindice.  However, in the spirit of
not bogging all users down, an administrator must enable metadata storage
inside the server.xml by adding 'use-metadata="on"' to the root-collection
element.  So this functionality is completely configurable and can be added
to the core without any functionality change for users who don't need
metadata type functionality.
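
As a rough illustration of the switch described above, here is a minimal
Java sketch that checks whether a server.xml enables metadata.  Only the
root-collection element and the use-metadata="on" attribute come from this
message; the surrounding <xindice> wrapper, the name attribute, and the
MetadataSwitch class are assumptions made for the example and are not
taken from the patch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Xindice or of the attached patch.
final class MetadataSwitch {

    /** Returns true if the first root-collection element carries use-metadata="on". */
    static boolean isMetadataEnabled(String serverXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(serverXml.getBytes(StandardCharsets.UTF_8)));
            NodeList roots = doc.getElementsByTagName("root-collection");
            if (roots.getLength() == 0) {
                return false; // no root-collection element at all
            }
            Element root = (Element) roots.item(0);
            // getAttribute returns "" when the attribute is absent, so metadata stays off by default.
            return "on".equals(root.getAttribute("use-metadata"));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse server.xml content", e);
        }
    }

    public static void main(String[] args) {
        // The <xindice> wrapper and the name attribute are assumed for illustration.
        String enabled = "<xindice><root-collection name=\"db\" use-metadata=\"on\"/></xindice>";
        String plain = "<xindice><root-collection name=\"db\"/></xindice>";
        System.out.println(isMetadataEnabled(enabled));
        System.out.println(isMetadataEnabled(plain));
    }
}
```

Leaving the attribute out keeps metadata off, which matches the intent that
users who don't need the feature see no change.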
	Here's a short explanation of how it works.  When metadata is
enabled, the database will maintain a MetaSystemCollection.  This collection
is analogous to the SystemCollection in that no documents can be stored into
it directly by the user.  Each collection and document has a MetaData object
associated with it.  That object keeps track of several pieces of information
that can be represented in XML as:
    <meta>
        <system [type="doc|col"]>
            <attr name="created" value="1038266252196" />
            <attr name="modified" value="1038266252196" />
        </system>
        <attrs>
            <attr name="name" value="value"/>
            <attr name="name" value="value"/>
        </attrs>
        <custom>
            <any>custom xml you want</any>
        </custom>
    </meta>
The system will keep track of the created and last modified times for you.
The attrs are an arbitrary set of key/value pairs that a user can add to
or remove from.  The custom section is used for any well-formed XML that
you might want.  (The created/modified values are stored as milliseconds
since midnight UTC, Jan 1 1970, exactly as returned by
System.currentTimeMillis().)
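
To make the layout above concrete, here is a small Java sketch that reads
the attr entries out of such a <meta> document and turns the millisecond
timestamps into readable instants.  The element and attribute names follow
the example above; the MetaReader class, its method names, and the
"author" attribute are hypothetical and only serve the illustration.  This
is not the MetaData class from the patch.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the <meta> layout shown above.
final class MetaReader {

    /** Collects name/value pairs of the attr elements under the given section ("system" or "attrs"). */
    static Map<String, String> readAttrs(String metaXml, String section) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(metaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList sections = doc.getElementsByTagName(section);
            if (sections.getLength() == 0) {
                return result; // section missing: nothing to report
            }
            NodeList attrs = ((Element) sections.item(0)).getElementsByTagName("attr");
            for (int i = 0; i < attrs.getLength(); i++) {
                Element attr = (Element) attrs.item(i);
                result.put(attr.getAttribute("name"), attr.getAttribute("value"));
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse meta document", e);
        }
        return result;
    }

    /** Converts a stored millisecond value (as produced by System.currentTimeMillis()) to an Instant. */
    static Instant toInstant(String millis) {
        return Instant.ofEpochMilli(Long.parseLong(millis));
    }

    public static void main(String[] args) {
        String xml = "<meta><system type=\"doc\">"
                + "<attr name=\"created\" value=\"1038266252196\"/>"
                + "<attr name=\"modified\" value=\"1038266252196\"/>"
                + "</system><attrs><attr name=\"author\" value=\"dave\"/></attrs></meta>";
        Map<String, String> system = readAttrs(xml, "system");
        System.out.println(toInstant(system.get("created"))); // 2002-11-25T23:17:32.196Z
        System.out.println(readAttrs(xml, "attrs"));
    }
}
```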
	To access the metadata information, I've created a set of XML-RPC
'messages'.  They are GetCollectionMeta, GetDocumentMeta, SetCollectionMeta,
and SetDocumentMeta.  These allow complete access to the metadata information.

	Attached to this message are the patches for the core server.  In
addition there are a few new files that are required.  Both the zip file
and the patch file should be run from the xml-xindice dir.  As this is a
pretty significant addition to the core of Xindice, please take a minute to
look it over and let me know what you think either of the design or this
initial implementation.  It is certainly not perfect, but it is a good first
step towards an implementation of metadata storage in the core of Xindice.

Thanks
Dave Viner

Re: first impl of metadata storage for xindice

Posted by Gianugo Rabellino <gi...@apache.org>.
Dave Viner wrote:

> 	I haven't thought about adding namespaces, but it is probably a good idea.
> It's a small change that has very little impact, but as you mention would open
> up lots of possibilities.  I'd like to get this first cut into the mainline
> code base, then add features.  But your suggestion will be high on my list.

I'd rather decide on a namespace and go for it ASAP (we might well change
it in the future, but we might be unable to roll back from a
non-namespaced metadata system), don't you think so?

> 	Once you enable metadata, all new collections and documents will have a
> metadata object associated with them.  Collections and documents that
> already exist will have metadata objects created for them when you request
> it. 


Cool.

> 	I have not thought too much about the XML:DB API extensions.  I implemented
> a few simple XML-RPC methods that allow users access to the metadata.

Would you mind sending a code snippet in order to understand how you 
deal both with XML:DB API to get a document and plain RPC messages to 
get data? I'm walking through the code, this would save me some time. :-)

> My understanding of "Service" in the context of Xindice is pretty limited.  It
> might be worthwhile to create a MetaDataService.  However, I think simple
> XML-RPC methods are simpler for now.  Are there clear benefits to providing
> a MetaDataService that are not available by simply using the XML-RPC
> methods?


Well, for one it would be cleaner, not tied to XML-RPC and transparent
to users. It would be something like:

MetaDataQueryService service =
     (MetaDataQueryService) collection.getService("MetaDataQueryService", "1.0");

MetaData metas = service.getMetaData();

easy and clean, don't you think?

I can see another clean solution, more proprietary (at least until we 
manage to have an XML:DB extension about metadata) but even easier and 
close to JDBC:

MetaData metas = null;
if (collection instanceof XindiceCollection) {
	metas = ((XindiceCollection) collection).getMetaData();
}

> 	Thanks for taking the time to look at this feature!  I personally think
> it's pretty cool, and could help to differentiate Xindice from other XML
> dbs.

It is pretty cool, definitely. It will serve a noble purpose also: 
bridging the gap from other XML dbs that already support metadata.

From what I've seen so far, I must say that I really like your solution.
It might be worth discussing the MetaData interface (which seems
pretty complete to me though) in order to define exactly what we might
want from metadata: this would be crucial for the next logical step,
which would be going to xapi-dev and asking for an official addition to the
XML:DB API. How about it?

Ciao,

-- 
Gianugo Rabellino


RE: first impl of metadata storage for xindice

Posted by Dave Viner <dv...@yahoo-inc.com>.
Hi Gianugo,
	I haven't thought about adding namespaces, but it is probably a good idea.
It's a small change that has very little impact, but as you mention would open
up lots of possibilities.  I'd like to get this first cut into the mainline
code base, then add features.  But your suggestion will be high on my list.
	Once you enable metadata, all new collections and documents will have a
metadata object associated with them.  Collections and documents that
already exist will have metadata objects created for them when you request
it.  The current implementation stores the creation time and last modified
time for you.  We could add more, but I was thinking of a regular filesystem,
and these are two of the more beneficial features of filesystems.
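
The create-on-request behaviour for pre-existing collections and documents
can be pictured with the following Java sketch.  It is only a model of the
idea, not code from the patch: the LazyMetaStore class, its in-memory map,
and the choice to stamp a lazily created record with the time of the first
request are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of "metadata is created when you request it".
final class LazyMetaStore {

    /** Minimal stand-in for a metadata record holding the two system timestamps. */
    static final class Meta {
        final long created;
        long modified;

        Meta(long now) {
            this.created = now;
            this.modified = now;
        }
    }

    private final Map<String, Meta> byKey = new HashMap<>();

    /** Returns the record for the key, creating it on first request if none exists yet. */
    Meta get(String key, long now) {
        // Assumption: a record created lazily is stamped with the request time.
        return byKey.computeIfAbsent(key, k -> new Meta(now));
    }

    /** Updates the last-modified time, creating the record first if necessary. */
    void touch(String key, long now) {
        get(key, now).modified = now;
    }

    public static void main(String[] args) {
        LazyMetaStore store = new LazyMetaStore();
        // A document that existed before metadata was enabled has no record yet.
        Meta first = store.get("/db/old-doc", System.currentTimeMillis());
        Meta again = store.get("/db/old-doc", System.currentTimeMillis());
        System.out.println(first == again); // the same record is reused on later requests
    }
}
```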
	I have not thought too much about the XML:DB API extensions.  I implemented
a few simple XML-RPC methods that allow users access to the metadata.  My
understanding of "Service" in the context of Xindice is pretty limited.  It
might be worthwhile to create a MetaDataService.  However, I think simple
XML-RPC methods are simpler for now.  Are there clear benefits to providing
a MetaDataService that are not available by simply using the XML-RPC
methods?

	Thanks for taking the time to look at this feature!  I personally think
it's pretty cool, and could help to differentiate Xindice from other XML
dbs.

Thanks
Dave Viner


-----Original Message-----
From: Gianugo Rabellino [mailto:gianugo@apache.org]
Sent: Tuesday, November 26, 2002 1:13 AM
To: xindice-dev@xml.apache.org
Subject: Re: first impl of metadata storage for xindice


Dave Viner wrote:

> Hi,
> 	For the past few weeks, I've been working on developing the meta
> data storage facility that we discussed long ago on the xindice-dev list.
> (See http://marc.theaimsgroup.com/?t=102873960400001&r=1&w=2 for
> discussion.)


Dave,

this is really cool stuff, thanks a lot for providing it.

I have to go through it a bit more to understand the implementation, but
as of now I have already some questions/suggestions floating around:

1. Have you thought about properly namespacing the metadata XML? Even a
default namespace might be safer (think about "including" the
metadata in the data stream). I think this is an easy yet powerful
addition, and it might allow for cool stuff like embedding (namespaced)
metadata in a Collection or Resource result;

2. Do we have to explicitly set the metadata for each resource or, by
just enabling collection-level metadata, will each resource get a
minimum set such as creation/modification time? I understand that those
data are available in the Xindice core, so it should be fairly easy to
use them;

3. I'm thinking about how metadata might be exposed by a simple
extension of the XML:DB API. I think that the "Service" extension point
might be a good place to start (something like doing a
MetadataQueryService). Have you thought about it? I think that, while we
will have to go our way sooner or later, we should try to stay as close
as possible to the API.

Let me know what you think. Meanwhile, I'm eager to play with your
implementation! :-)

Ciao,

--
Gianugo Rabellino



Re: first impl of metadata storage for xindice

Posted by Gianugo Rabellino <gi...@apache.org>.
Dave Viner wrote:

> Hi,
> 	For the past few weeks, I've been working on developing the meta
> data storage facility that we discussed long ago on the xindice-dev list.
> (See http://marc.theaimsgroup.com/?t=102873960400001&r=1&w=2 for
> discussion.)


Dave,

this is really cool stuff, thanks a lot for providing it.

I have to go through it a bit more to understand the implementation, but 
as of now I have already some questions/suggestions floating around:

1. Have you thought about properly namespacing the metadata XML? Even a
default namespace might be safer (think about "including" the
metadata in the data stream). I think this is an easy yet powerful
addition, and it might allow for cool stuff like embedding (namespaced)
metadata in a Collection or Resource result;

2. Do we have to explicitly set the metadata for each resource or, by
just enabling collection-level metadata, will each resource get a
minimum set such as creation/modification time? I understand that those
data are available in the Xindice core, so it should be fairly easy to
use them;

3. I'm thinking about how metadata might be exposed by a simple 
extension of the XML:DB API. I think that the "Service" extension point 
might be a good place to start (something like doing a 
MetadataQueryService). Have you thought about it? I think that, while we 
will have to go our way sooner or later, we should try to stay as close 
as possible to the API.

Let me know what you think. Meanwhile, I'm eager to play with your 
implementation! :-)

Ciao,

-- 
Gianugo Rabellino


RE: first impl of metadata storage for xindice

Posted by Dave Viner <dv...@yahoo-inc.com>.
Damn.  I knew I'd forget something.  Here's a revised zip file with the
MetaSystemCollection.java included.

dave


-----Original Message-----
From: Gianugo Rabellino [mailto:gianugo@apache.org]
Sent: Tuesday, November 26, 2002 1:51 AM
To: xindice-dev@xml.apache.org
Subject: Re: first impl of metadata storage for xindice


Dave Viner wrote:

>
> 	Here's a short explanation of how it works.  When metadata is
> enabled, the database will maintain a MetaSystemCollection.


Which is exactly the class I'm missing. :-) Where am I supposed to find
it? Did you forget to include it?

LMK,

--
Gianugo Rabellino


Re: first impl of metadata storage for xindice

Posted by Gianugo Rabellino <gi...@apache.org>.
Dave Viner wrote:

> 	
> 	Here's a short explanation of how it works.  When metadata is
> enabled, the database will maintain a MetaSystemCollection. 


Which is exactly the class I'm missing. :-) Where am I supposed to find 
it? Did you forget to include it?

LMK,

-- 
Gianugo Rabellino