Posted to xindice-dev@xml.apache.org by Gianugo Rabellino <gi...@apache.org> on 2002/02/18 11:00:43 UTC

Metadata (was: Planning the future)

Tom Bradford wrote:
> On Saturday, February 16, 2002, at 10:28 PM, Kimbro Staken wrote:
> 
>> Sorry, that isn't what I meant, I was asking about meta-data. What 
>> meta-data are we tracking about documents?
> 
> 
> At the moment, I think just creation time and last modified time.

This would be a good start, but I'm thinking of something more, like 
having a free-form metadata structure so that the metadata themselves 
are extensible. IMO there should be three metadata categories:

1. XML:DB metadata. This should be a joint effort with the XML:DB 
community, to come up with a common set of sensible metadata. There 
should also be an API to access them (maybe a service?). I tend to think 
that these metadata should include at least all the "filesystem like" 
metadata ("man stat") that make sense in an XML:DB database;

2. DB specific metadata. Any DB vendor might want to come up with a 
metadata set which is peculiar to its implementation. As an example 
Xindice might come up with filer, versioning or linking information;

3. User provided metadata. Any user should be allowed to store its 
application-specific metadata. This, IMHO, would be killer. And yes, 
Tom, this would allow me to have a CMS up in minutes... :)

This flexible metadata structure IMO asks for an XML metadata format. 
RDF/DC seems to be the most obvious choice, but I'm afraid that it might 
be overkill. Also, while there should be an XML format for the metadata 
themselves, there should also be some kind of API to quickly access the 
data without XML programming.
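
A rough illustration of the idea above, not a proposed design: one metadata record split into the three categories, with an XML form plus a quick accessor that needs no XML programming. The class name, the category keys (`xmldb`, `vendor`, `user`) and the sample values are all assumptions made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical category keys; the post only names the three groups.
CATEGORIES = ("xmldb", "vendor", "user")


class Metadata:
    """Free-form metadata split into the three proposed categories."""

    def __init__(self):
        self._data = {category: {} for category in CATEGORIES}

    def set(self, category, name, value):
        if category not in self._data:
            raise KeyError(f"unknown metadata category: {category}")
        self._data[category][name] = str(value)

    def get(self, category, name, default=None):
        # Quick access path: plain lookups, no XML handling required.
        return self._data[category].get(name, default)

    def to_xml(self):
        # XML form of the same data, one child element per category.
        root = ET.Element("metadata")
        for category in CATEGORIES:
            section = ET.SubElement(root, category)
            for name, value in sorted(self._data[category].items()):
                ET.SubElement(section, name).text = value
        return ET.tostring(root, encoding="unicode")


meta = Metadata()
meta.set("xmldb", "last-modified", "2002-02-18T11:00:43Z")  # sample value
meta.set("vendor", "filer", "example-filer")                # sample value
meta.set("user", "status", "approved")                      # sample value
print(meta.get("user", "status"))
print(meta.to_xml())
```

Whether the XML form should be RDF/DC or something lighter is left open here; the sketch only shows that both access styles can sit on the same data.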

Comments?

-- 
Gianugo Rabellino

Re: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 07:16 AM, Gianugo Rabellino wrote:

> Kimbro Staken wrote:
>> On Monday, February 18, 2002, at 06:05 AM, Dawid Weiss wrote:
>>>
>>> KS> What if it was something that was stupid simple like automatically
>>> KS> creating two collections, one for documents and one for metadata?
>
> OK, this is the cleanest hack I can think of, but still it's a hack. :)
>

Never claimed otherwise, "It's kind of hacky". :-)

> It seems to me that we are playing at different levels, so let's try to 
> make some points clear: you're talking about practical solutions to 
> achieve the result of having metadata, while I'm stuck on generic metadata 
> support, i.e. whether metadata should be offered to Joe User. This is 
> *vital*, and we all agree on this, at least for basic "stat-like" 
> metadata. Now:
>

You're right, this is a completely different issue. The problem with the 
XML:DB API is that it's going to be colored by what other databases 
support. I don't know exactly what they support though, so I have to do 
research in order to figure it out. From what I remember most don't have 
very rich support, if any at all.

> * Supporting this means IMHO extending the XML:DB APIs somehow (I'm not 
> in favor of proprietary extensions and the like).
>

You're probably right, I just don't want to tackle that yet. The XML:DB 
API is probably going to change quite a bit before too much longer. I want 
to focus on Xindice right now. Let's figure out what we want to support 
and how we want to expose it through our network API. Then we can look at 
seeing how the XML:DB API shapes up. I'd also prefer to have a little more 
real world experience with this before tackling the real API changes. My 
opinion is that we just add a proprietary service until then.

In addition, I'm kind of driving both projects, among others, and right 
now I want to make sure we get Xindice moving nicely first. We've been 
kind of stalled as far as development goes. Once we're rolling along 
nicely, then I'll focus a little more on the XML:DB API project which is 
also kind of stalled. I also need to factor in SiXDML which is becoming an 
XML:DB project too.

The major problem with XML:DB is that the XML database industry overall is 
really struggling. We need to get SiXDML going as there is a lot more 
vendor interest in that than the raw API. The good thing is it builds on 
the raw API so it will drive that too. It's just going to take some time 
and I don't want it to affect this project too much. It's more important 
that we get solid support in Xindice first. After all if the XML database 
industry doesn't pick up soon, Xindice may be the only one left and then 
we won't have to worry about anything else. :-(

> * If we are to undertake an effort to change the XML:DB APIs IMHO it's 
> worth doing some investigation to understand what kind of metadata we 
> are to offer. I'm all for having a set of "system level" metadata plus a 
> hook for user ones, but I might be stuck by FS at its best, so please 
> don't hesitate to try my asbestos underwear on this issue. :)
>
> * Given this, it doesn't really matter, at this point of the discussion, 
> whether metadata should be stored inside the DOM, in a parallel document, 
> in a parallel Collection or in my closet: this is just an implementation 
> detail.
>
> I hope I made myself clear now :) Yet the more I think about it the more 
> I'm convinced that this discussion really belongs to xapi-dev...
>

Yes it does.

> Ciao,
>
> -- Gianugo Rabellino
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/

Re: Metadata (was: Planning the future)

Posted by Gianugo Rabellino <gi...@apache.org>.

Kimbro Staken wrote:
> 
> On Monday, February 18, 2002, at 06:05 AM, Dawid Weiss wrote:
> 
>>
>> KS> What if it was something that was stupid simple like automatically
>> KS> creating two collections, one for documents and one for metadata? 

OK, this is the cleanest hack I can think of, but still it's a hack. :)

It seems to me that we are playing at different levels, so let's try to 
make some points clear: you're talking about practical solutions to 
achieve the result of having metadata, while I'm stuck on generic metadata 
support, i.e. whether metadata should be offered to Joe User. This is 
*vital*, and we all agree on this, at least for basic "stat-like" 
metadata. Now:

* Supporting this means IMHO extending the XML:DB APIs somehow (I'm not 
in favor of proprietary extensions and the like).

* If we are to undertake an effort to change the XML:DB APIs IMHO it's 
worth doing some investigation to understand what kind of metadata we 
are to offer. I'm all for having a set of "system level" metadata plus a 
hook for user ones, but I might be stuck by FS at its best, so please 
don't hesitate to try my asbestos underwear on this issue. :)

* Given this, it doesn't really matter, at this point of the discussion, 
whether metadata should be stored inside the DOM, in a parallel 
document, in a parallel Collection or in my closet: this is just an 
implementation detail.

I hope I made myself clear now :) Yet the more I think about it the more 
I'm convinced that this discussion really belongs to xapi-dev...

Ciao,

-- 
Gianugo Rabellino

Re[6]: Metadata (was: Planning the future)

Posted by Dawid Weiss <Da...@cs.put.poznan.pl>.

KS> Are you still planning to contribute this code? It sounds like it would be
KS> a really good starting place.

I think it would be no problem. We have JUST finished the prototype, though,
and JUnit shows it works OK. My best guess would be to hold on a month or
two: we'll gain experience with the interface, its pros and cons, and
contribute a more reliable, stable version. The other side of the coin is
that we have implemented it in a very structural approach, i.e. there has
been no class-responsibility analysis etc. The reason is that we
intentionally wanted the interface to be simple.

We may change it in the future - time will tell.

KS> That's what we're trying to figure out. I don't actually use Xindice for
KS> anything, so I rely on the real users to tell us what we need.

I'm glad to hear it.
Dawid

Re: Re[4]: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 06:21 AM, Dawid Weiss wrote:

>
> KS> Yeah, sorry I just realized Krzysztof posted a description of that very
> KS> thing right before I sent that mail. Heh, heh, see what I get for being
> KS> too busy writing and not reading. So Kudos to you guys for not only coming
> KS> up with it, but also implementing it.
>
> That's all right, it was just funny to see you came up with the same idea
> :) We actually spent some time investigating various concepts and the one
> we chose seemed to have the fewest drawbacks: the database remains "clean" -
> xpath, document retrieval etc - these are all transparent, which was our
> goal.
>

Are you still planning to contribute this code? It sounds like it would be 
a really good starting place.

> Nonetheless, I would love to see a standard API for versioning and
> metadata, and it seems many other people crave it too, not only me.
> It may be quite easy to work around the problem, but since the project is
> open source, it should perhaps take into account what the actual users
> need, not only the 'ideal path' which Xindice should follow ;)
> Just my remark, nothing personal.
>

That's what we're trying to figure out. I don't actually use Xindice for 
anything, so I rely on the real users to tell us what we need. Now, I'm 
definitely hearing that metadata is an important issue, even more than I 
already thought. This is what I need to hear, and I'm listening.


> Dawid
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/

Re[4]: Metadata (was: Planning the future)

Posted by Dawid Weiss <Da...@cs.put.poznan.pl>.

KS> Yeah, sorry I just realized Krzysztof posted a description of that very 
KS> thing right before I sent that mail. Heh, heh, see what I get for being 
KS> too busy writing and not reading. So Kudos to you guys for not only coming 
KS> up with it, but also implementing it.

That's all right, it was just funny to see you came up with the same idea
:) We actually spent some time investigating various concepts and the one
we chose seemed to have the fewest drawbacks: the database remains "clean" -
xpath, document retrieval etc - these are all transparent, which was our
goal.

Nonetheless, I would love to see a standard API for versioning and
metadata, and it seems many other people crave it too, not only me.
It may be quite easy to work around the problem, but since the project is
open source, it should perhaps take into account what the actual users
need, not only the 'ideal path' which Xindice should follow ;)
Just my remark, nothing personal.

Dawid

Re: Re[2]: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 06:05 AM, Dawid Weiss wrote:

>
> KS> What if it was something that was stupid simple like automatically
> KS> creating two collections, one for documents and one for metadata? Each
>
> We have adopted such a 'stupid simple' approach. It works well. We'd rather
> think of the approach as simple than stupid of course... Whatever term is
> used, the separation of metadata from the contents based on a prefixed
> subcollection (prefix reserved) has several advantages over storing them
> inside documents.
>

Yeah, sorry I just realized Krzysztof posted a description of that very 
thing right before I sent that mail. Heh, heh, see what I get for being 
too busy writing and not reading. So Kudos to you guys for not only coming 
up with it, but also implementing it.

> KS> The metadata doc
> KS> would have database managed portions and user managed portions,
>
> Amazing coincidence in thinking. It only proves the design is simple and
> intuitive...
>
> KS> The meta-data collection name would just have a reserved prefix/suffix and
> KS> could be ignored in collection listings.
>
> Exactly :)
>
> KS> Maybe It could be an option so that if you don't want the overhead you
>
> Exactly :)
>
> KS> access the data like any other collection. It wouldn't even require any
> KS> changes to the XML:DB API if you layered on top.
>
> Exactly. Amazing, I could have been an author of your e-mail...
>
> KS> It's kind of hacky, but it's simple and in reality it might work decently
> KS> well.
>
> It does. We have such an API.
>
> KS> I don't know, just a thought.
>
> A very good one. Thanks, it to some extent proves we chose the right
> approach to the problem we faced (i.e. lack of versioning/metadata).

>
> Dawid
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/

Re[2]: Metadata (was: Planning the future)

Posted by Dawid Weiss <Da...@cs.put.poznan.pl>.

KS> What if it was something that was stupid simple like automatically
KS> creating two collections, one for documents and one for metadata? Each

We have adopted such a 'stupid simple' approach. It works well. We'd rather
think of the approach as simple than stupid of course... Whatever term is
used, the separation of metadata from the contents based on a prefixed
subcollection (prefix reserved) has several advantages over storing them
inside documents.

KS> The metadata doc
KS> would have database managed portions and user managed portions,

Amazing coincidence in thinking. It only proves the design is simple and
intuitive...

KS> The meta-data collection name would just have a reserved prefix/suffix and
KS> could be ignored in collection listings.

Exactly :)

KS> Maybe It could be an option so that if you don't want the overhead you

Exactly :)

KS> access the data like any other collection. It wouldn't even require any
KS> changes to the XML:DB API if you layered on top.

Exactly. Amazing, I could have been an author of your e-mail...

KS> It's kind of hacky, but it's simple and in reality it might work decently 
KS> well.

It does. We have such an API.

KS> I don't know, just a thought.

A very good one. Thanks, it to some extent proves we chose the right
approach to the problem we faced (i.e. lack of versioning/metadata).

Dawid

Re: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 04:49 AM, Gianugo Rabellino wrote:

> Yes and no. Think about RDBMS, where you have a poor set of system-level 
> metadata (basically all you get is the column type). The solution for 
> developers has always been to add fields (growing data complexity) or 
> tables (growing logic complexity with JOINs) just for metadata. Yet on a 
> RDBMS that was and still is the way to go, since it's trivial to extend 
> the system in this way.
>
> Now what can a developer do on Xindice without generic metadata support? 
> I see just two solutions:
>

What if it was something that was stupid simple like automatically 
creating two collections, one for documents and one for metadata? Each 
document would have a corresponding metadata document. The metadata doc 
would have database managed portions and user managed portions, or maybe 
two separate docs, or even two separate collections for user and system. 
All updates would be within a transaction across both docs so they can't 
get out of sync.

That really isn't much different than the data dictionary in an RDBMS, 
except it's semi-structured and user extensible. I guess you could view it 
as a data fork, but I'd really prefer not to (Resource forks on the Mac 
are annoying to me, so I'm hiding from the word. :-) ).

The meta-data collection name would just have a reserved prefix/suffix and 
could be ignored in collection listings.

Maybe it could be an option so that if you don't want the overhead you 
could create a collection without metadata, though that's probably not 
really necessary.

On top of this you can build whatever type of API you want, or simply 
access the data like any other collection. It wouldn't even require any 
changes to the XML:DB API if you layered on top.

It's kind of hacky, but it's simple and in reality it might work decently 
well. Performance might not be stellar, but there would be room to 
optimize once the mechanism is worked out. You also leverage all the 
existing indexing mechanics to enable meta-data queries.

I don't know, just a thought.
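
To make the paired-collection idea concrete, here is a toy in-memory sketch. It is not Xindice code: the `Database` class, the `_meta_` prefix and the revision counter are assumptions for illustration, and the "transaction" is only simulated by computing both records before writing either.

```python
META_PREFIX = "_meta_"  # hypothetical reserved prefix


class Database:
    """Toy in-memory store: collection name -> {document key: content}."""

    def __init__(self):
        self._collections = {}

    def create_collection(self, name, with_metadata=True):
        if name.startswith(META_PREFIX):
            raise ValueError("collection names may not use the reserved prefix")
        self._collections[name] = {}
        if with_metadata:
            # The companion collection is created automatically.
            self._collections[META_PREFIX + name] = {}

    def list_collections(self):
        # Metadata collections are ignored in ordinary listings.
        return sorted(n for n in self._collections if not n.startswith(META_PREFIX))

    def store(self, collection, key, content, user_meta=None):
        docs = self._collections[collection]
        meta = self._collections.get(META_PREFIX + collection)
        if meta is None:  # collection was created without metadata
            docs[key] = content
            return
        previous = meta.get(key, {"system": {"revision": 0}, "user": {}})
        record = {
            # Database-managed portion.
            "system": {"revision": previous["system"]["revision"] + 1},
            # User-managed portion, merged with earlier values.
            "user": dict(previous["user"], **(user_meta or {})),
        }
        # Stand-in for a transaction: both writes happen together, after
        # everything that could fail has already been computed.
        docs[key] = content
        meta[key] = record

    def get_metadata(self, collection, key):
        meta = self._collections.get(META_PREFIX + collection)
        return None if meta is None else meta.get(key)


db = Database()
db.create_collection("articles")
db.store("articles", "doc1", "<article/>", {"author": "joe"})
print(db.list_collections())
print(db.get_metadata("articles", "doc1"))
```

Because the metadata lives in an ordinary collection, existing query and indexing machinery would apply to it unchanged, which is the main attraction of the approach.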

> 1. Create a parallel document with a sensible name (if you have mydoc.xml 
> then you might want to create .mydoc.xml or mydoc.mxml or whatever you 
> want). This is a trick and a sordid hack :) subject to name collision to 
> say the least;
>
> 2. add metadata inside the document, as you suggested. Yet this would 
> increase complexity *a lot* from an application point of view.
>
> I'm not saying that metadata shouldn't belong to the document, actually I 
> was preparing a first draft where documents had a "xindice" namespace 
> added automatically and an <xindice:metadata/> section when stored on the 
> filer. The difference is that I'd rather have a generic API to work 
> specifically with metadata and be neutral about physical metadata 
> location. This means that Xindice might:
>
> 1. when storing a document (Collection.storeResource()): add the metadata 
> section, if not provided in the document, or update/fill it if otherwise.
>
> 2. when retrieving a document (Resource.getContent()): send back the 
> document alone, stripping the metadata section;
>
> 2,1/2. have the possibility to ask the full document, including metadata 
> (ugly: Resource.getFullContent());
>
> 3. when asked for metadata have two ways of accessing them: as the full 
> set (say Resource.getMetadata()) or as an XPath result 
> (Resource.getMetadata("/last-modified")). Hmmm... to make it even easier 
> I wonder if it might be the case to steal some ideas from Avalon's 
> Configuration framework to give some kind of "direct access" to metadata 
> elements.
>
> In this scenario it's just an implementation specific decision where the 
> metadata should be physically placed in the storage. Yet there are at 
> least two possible problems with storing them inside the document:
>
> 1. Collection metadata (I think we badly need them too).

User defined?

>
> 2. Binary resources. (currently unsupported). True, you can wrap any 
> binary content in an XML structure, but I don't really know if this is 
> the best solution.

Binary isn't even on the roadmap so I really don't think we should worry 
about this.

>
> This makes me wonder if the real place for document metadata can be the 
> collection itself (well, this is how the Unix filesystem behaves after 
> all). How about it?
>
> Ciao,
>
> -- Gianugo Rabellino
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/

Re: Metadata (was: Planning the future)

Posted by Gianugo Rabellino <gi...@apache.org>.

Kimbro Staken wrote:
> To me, it seems that right now it's really a more specialized 
> application level concept. I think the database should track metadata 
> associated with the system, but application metadata should be layered 
> on top. After all we're talking about an XML database here, metadata is 
> supposed to be inherent in the format. 

Yes and no. Think about an RDBMS, where you have a poor set of system-level 
metadata (basically all you get is the column type). The solution for 
developers has always been to add fields (growing data complexity) or 
tables (growing logic complexity with JOINs) just for metadata. Yet on an 
RDBMS that was and still is the way to go, since it's trivial to extend 
the system in this way.

Now what can a developer do on Xindice without generic metadata 
support? I see just two solutions:

1. Create a parallel document with a sensible name (if you have 
mydoc.xml then you might want to create .mydoc.xml or mydoc.mxml or 
whatever you want). This is a trick and a sordid hack :) subject to name 
collision to say the least;

2. add metadata inside the document, as you suggested. Yet this would 
increase complexity *a lot* from an application point of view.

I'm not saying that metadata shouldn't belong to the document, actually 
I was preparing a first draft where documents had a "xindice" namespace 
added automatically and an <xindice:metadata/> section when stored on 
the filer. The difference is that I'd rather have a generic API to work 
specifically with metadata and be neutral about physical metadata 
location. This means that Xindice might:

1. when storing a document (Collection.storeResource()): add the 
metadata section, if not provided in the document, or update/fill it if 
otherwise.

2. when retrieving a document (Resource.getContent()): send back the 
document alone, stripping the metadata section;

2,1/2. have the possibility to ask the full document, including metadata 
(ugly: Resource.getFullContent());

3. when asked for metadata have two ways of accessing them: as the full 
set (say Resource.getMetadata()) or as an XPath result 
(Resource.getMetadata("/last-modified")). Hmmm... to make it even easier 
I wonder if it might be the case to steal some ideas from Avalon's 
Configuration framework to give some kind of "direct access" to metadata 
elements.
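
The behaviour listed above can be sketched roughly as follows. This is only a stand-alone model in Python using the standard library's ElementTree; the function names mirror the methods floated in the post (none of which exist in the XML:DB API as far as this thread says), the namespace URI is a placeholder, and only a simple relative path is supported instead of full XPath.

```python
import xml.etree.ElementTree as ET

XINDICE_NS = "http://example.org/xindice-metadata"  # placeholder namespace URI
META_TAG = f"{{{XINDICE_NS}}}metadata"
ET.register_namespace("xindice", XINDICE_NS)


def store_resource(xml_text, last_modified):
    """Return the stored form: add the metadata section, or update it if present."""
    root = ET.fromstring(xml_text)
    section = root.find(META_TAG)
    if section is None:
        section = ET.SubElement(root, META_TAG)
    stamp = section.find("last-modified")
    if stamp is None:
        stamp = ET.SubElement(section, "last-modified")
    stamp.text = last_modified
    return ET.tostring(root, encoding="unicode")


def get_content(stored):
    """Send back the document alone, stripping the metadata section."""
    root = ET.fromstring(stored)
    for section in root.findall(META_TAG):
        root.remove(section)
    return ET.tostring(root, encoding="unicode")


def get_full_content(stored):
    """Send back the full document, including metadata."""
    return stored


def get_metadata(stored, path=None):
    """Return the whole metadata element, or the text found at a simple path."""
    section = ET.fromstring(stored).find(META_TAG)
    if section is None or path is None:
        return section
    node = section.find(path.lstrip("/"))
    return None if node is None else node.text


stored = store_resource("<doc><title>Hello</title></doc>", "2002-02-18")
print(get_content(stored))
print(get_metadata(stored, "/last-modified"))
```

The callers never learn where the metadata physically lives, so the same four functions could sit on top of a parallel document or a parallel collection instead.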

In this scenario it's just an implementation specific decision where the 
metadata should be physically placed in the storage. Yet there are at 
least two possible problems with storing them inside the document:

1. Collection metadata (I think we badly need them too).

2. Binary resources. (currently unsupported). True, you can wrap any 
binary content in an XML structure, but I don't really know if this is 
the best solution.

This makes me wonder if the real place for document metadata can be the 
collection itself (well, this is how the Unix filesystem behaves after 
all). How about it?

Ciao,

-- 
Gianugo Rabellino

Re: Metadata (was: Planning the future)

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.

On Monday 18 February 2002 12:14, Kimbro Staken wrote:
>. . .
> After all we're talking about an XML database here,
> metadata is supposed to be inherent in the format. 
>. . .

Yes, but if you're considering implementing versioning in the database 
(I don't know if it's the case), it would be a big advantage in many 
cases to be able to find out which "fork" (data, metadata, etc.) of an 
element was modified. For example, knowing that content wasn't modified 
but simply promoted to "status=approved" can be very important.

Actually I didn't think of that before, but versioning info/history is 
definitely another type of "fork" on the data. 

These last two points make me think that the decision to implement user 
metadata as part of the database is strongly related to the decision 
about versioning.
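
As a rough sketch of why forks and versioning are linked: if every revision records which forks actually changed, "content untouched, status promoted" becomes a simple lookup. The `ForkedResource` class and the fork names below are hypothetical, not part of Xindice.

```python
class ForkedResource:
    """A resource split into named forks, with a per-revision change log."""

    def __init__(self):
        self.forks = {}
        self.history = []  # one entry per revision: the forks that changed

    def update(self, **changes):
        # Only forks whose value really differs count as modified.
        changed = {
            name for name, value in changes.items()
            if self.forks.get(name) != value
        }
        if changed:
            self.forks.update(changes)
            self.history.append(frozenset(changed))
        return changed


resource = ForkedResource()
resource.update(data="<article/>", meta={"status": "draft"})
changed = resource.update(data="<article/>", meta={"status": "approved"})
print(sorted(changed))  # only the metadata fork was modified
```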

> Hmm, that actually brings up an interesting point for system level
> metadata too. Seems it should be queryable via XPath like the rest of
> the document content. I guess you can present the data as virtual
> namespaced attributes on the document. Maybe with an option on
> retrieval to get the meta data physically added to the document.

Sounds interesting, goes along with the idea of considering metadata 
like another "facet" of the data.

-- 
 -- Bertrand Delacrétaz, www.codeconsult.ch
 -- web technologies consultant - OO, Java, XML, C++

Re: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 03:56 AM, Bertrand Delacretaz wrote:

> On Monday 18 February 2002 11:38, Dare Obasanjo wrote:
>> . . .
>> I'd like to see Xindice mature as a DBMS before adding
>> features that are specific to a certain usage model
>> especially since we are not sure what would be the primary
>> usage model for Xindice.
>> . . .
>
> I understand your point - that's why I tried to suggest a more general
> concept ("data/resource forks") instead of "just" metadata.
>
> IMHO metadata *is* a fork or alternate view on the data that is stored.
> My point is that if the designers decide on implementing such a
> fork it might not be more complicated to implement a general fork
> mechanism.
>
> But of course either one of them (metadata or forks) could be an
> additional layer on top of XIndice. Not being an active developer here
> I cannot decide which way is best, it's just that as a user I'd find
> the metadata/fork concept very useful.

To me, it seems that right now it's really a more specialized application 
level concept. I think the database should track metadata associated with 
the system, but application metadata should be layered on top. After all 
we're talking about an XML database here, metadata is supposed to be 
inherent in the format. For instance, you could put the document inside a 
wrapper element that contains your metadata. If you just need the document 
itself then you extract it via an XPath. In a lot of ways this is better 
anyway because you can query your metadata like any other data in the 
document. Putting it into a fork of some kind would require special 
attention to be paid to the query mechanism.

Hmm, that actually brings up an interesting point for system level 
metadata too. Seems it should be queryable via XPath like the rest of the 
document content. I guess you can present the data as virtual namespaced 
attributes on the document. Maybe with an option on retrieval to get the 
metadata physically added to the document.
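
A small sketch of both ideas, under assumptions: the `<record>/<meta>/<body>` wrapper layout, the `sys` prefix and its namespace URI are invented for illustration, and ElementTree's limited path syntax stands in for real XPath.

```python
import copy
import xml.etree.ElementTree as ET

SYS_NS = "http://example.org/system-metadata"  # placeholder namespace URI
ET.register_namespace("sys", SYS_NS)

# Hypothetical wrapper layout: application metadata next to the document.
WRAPPED = (
    "<record>"
    "<meta><author>joe</author><status>approved</status></meta>"
    "<body><article><title>Metadata</title></article></body>"
    "</record>"
)


def unwrap(record):
    """Extract the bare document from the wrapper with a path expression."""
    return record.find("body/*")


def with_system_attributes(document, system_meta):
    """Return a copy of the document with system metadata as namespaced attributes."""
    decorated = copy.deepcopy(document)
    for name, value in system_meta.items():
        decorated.set(f"{{{SYS_NS}}}{name}", value)
    return decorated


record = ET.fromstring(WRAPPED)
document = unwrap(record)

# Application metadata is queried like any other content.
print(record.findtext("meta/status"))

# Optional on retrieval: system metadata physically added to the document.
decorated = with_system_attributes(document, {"created": "2002-02-18"})
print(ET.tostring(decorated, encoding="unicode"))
```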

>
> --
>  -- Bertrand Delacrétaz, www.codeconsult.ch
>  -- web technologies consultant - OO, Java, XML, C++
>
>
>
>
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/

Re: Metadata (was: Planning the future)

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.

On Monday 18 February 2002 11:38, Dare Obasanjo wrote:
> . . .
> I'd like to see Xindice mature as a DBMS before adding
> features that are specific to a certain usage model
> especially since we are not sure what would be the primary
> usage model for Xindice.
>. . .

I understand your point - that's why I tried to suggest a more general 
concept ("data/resource forks") instead of "just" metadata. 

IMHO metadata *is* a fork or alternate view on the data that is stored.
My point is that if the designers decide on implementing such a 
fork it might not be more complicated to implement a general fork 
mechanism.

But of course either one of them (metadata or forks) could be an 
additional layer on top of XIndice. Not being an active developer here 
I cannot decide which way is best, it's just that as a user I'd find 
the metadata/fork concept very useful.

-- 
 -- Bertrand Delacrétaz, www.codeconsult.ch
 -- web technologies consultant - OO, Java, XML, C++

Re: Metadata (was: Planning the future)

Posted by Dare Obasanjo <kp...@yahoo.com>.

I would rather that effort be expended on adding
database management features to Xindice than on features
specific to content management systems. 

I'd like to see Xindice mature as a DBMS before adding
features that are specific to a certain usage model
especially since we are not sure what would be the primary
usage model for Xindice.

I'd hate for Xindice to become a Jack of all trades
and master of none. 

--- Bertrand Delacretaz <bd...@codeconsult.ch>
wrote:
> On Monday 18 February 2002 11:00, Gianugo Rabellino
> wrote:
> >. . .
> > 1. XML:DB metadata. . . .
> > 2. DB specific metadata.. .
> > 3. User provided metadata. . .
> >. . .
> 
> I'm jumping in without much knowledge of XIndice
> internals, but I've 
> been thinking about this lately in the context of
> CMS/information 
> management. 
> 
> Maybe these thoughts help:
> 
> Speaking of metadata, the "resource fork/data fork"
> concept of the 
> Mac filesystem comes to mind here:
> 
> -Staging/release forks, the same data at several
> stages of completion 
> ("release" is published, "staging" for internal use
> only, later 
> promoted to "release"). 
> 
> -Various metadata forks as you mentioned, different
> facets of the same 
> data
> 
> Maybe implementing the concept of "forks" internally
> in an open way 
> wouldn't be too hard? I think it would be more
> useful than specifying 
> a fixed number of metadata types.
> 
> >. . .
> > This flexible metadata structure IMO asks for an
> XML metadata format.
> > RDF/DC seems to be the most obvious choice. . .
> >...
> 
> IMHO the user metadata format should be left open
> for maximum 
> flexibility. Or, if this "forks" concept makes
> sense, schemas could be 
> assigned to forks based on fork names (i.e.
> "dc-meta-fork" maps to DC 
> schema).
> 
> -- 
>  -- Bertrand Delacrétaz, www.codeconsult.ch
>  -- web technologies consultant - OO, Java, XML, C++
> 
> 
> 
> 
> 

=====
THINGS TO DO IF I BECOME AN EVIL OVERLORD #68
I will spare someone who saved my life sometime in the past. This is only reasonable as it encourages others to do so. However, the offer is good one time only. If they want me to spare them again, they'd better save my life again.


Re: Metadata (was: Planning the future)

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.

On Monday 18 February 2002 11:00, Gianugo Rabellino wrote:
>. . .
> 1. XML:DB metadata. . . .
> 2. DB specific metadata.. .
> 3. User provided metadata. . .
>. . .

I'm jumping in without much knowledge of XIndice internals, but I've 
been thinking about this lately in the context of CMS/information 
management. 

Maybe these thoughts help:

Speaking of metadata, the "resource fork/data fork" concept of the 
Mac filesystem comes to mind here:

-Staging/release forks, the same data at several stages of completion 
("release" is published, "staging" for internal use only, later 
promoted to "release"). 

-Various metadata forks as you mentioned, different facets of the same 
data

Maybe implementing the concept of "forks" internally in an open way 
wouldn't be too hard? I think it would be more useful than specifying 
a fixed number of metadata types.

>. . .
> This flexible metadata structure IMO asks for an XML metadata format.
> RDF/DC seems to be the most obvious choice. . .
>...

IMHO the user metadata format should be left open for maximum 
flexibility. Or, if this "forks" concept makes sense, schemas could be 
assigned to forks based on fork names (e.g. "dc-meta-fork" maps to DC 
schema).
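
To make the idea concrete, here is a rough sketch of what open, named forks with an optional name-to-schema mapping might look like. Everything in it (the ForkedDocument class, its method names, the FORK_SCHEMAS registry) is hypothetical and not part of Xindice; only the "dc-meta-fork" to DC pairing and the staging/release promotion come from the text above. The Dublin Core element namespace stands in for "DC schema".

```python
from dataclasses import dataclass, field

# Hypothetical registry: fork name -> schema identifier.
# Forks that are not listed here are left open (no schema enforced).
FORK_SCHEMAS = {
    "dc-meta-fork": "http://purl.org/dc/elements/1.1/",
}


@dataclass
class ForkedDocument:
    """One document key with any number of named forks (illustrative only)."""

    key: str
    forks: dict = field(default_factory=dict)  # fork name -> XML string

    def put_fork(self, name, xml):
        """Store or overwrite the content of a named fork."""
        self.forks[name] = xml

    def promote(self, source, target):
        """Copy one fork over another, e.g. "staging" -> "release"."""
        self.forks[target] = self.forks[source]

    def schema_for(self, name):
        """Return the schema bound to a fork name, or None if left open."""
        return FORK_SCHEMAS.get(name)


doc = ForkedDocument("article-42")
doc.put_fork("staging", "<article>draft</article>")
doc.promote("staging", "release")
```

Because forks are just names, adding a new metadata facet would not need a new metadata type, only a new fork name (and, optionally, a registry entry).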

-- 
 -- Bertrand Delacrétaz, www.codeconsult.ch
 -- web technologies consultant - OO, Java, XML, C++

Re: Metadata (was: Planning the future)

Posted by Krzysztof Kowalczykiewicz <kr...@cs.put.poznan.pl>.

Hi!

I hope it's not rude of me to join your discussion about metadata. As I mentioned
before, we are working on a generic XML repository (a CORBA service) that wraps
Xindice and adds versioning and locking services.

What we needed was more detailed and customizable document metadata. Each
document has a few system attributes: version, date, key, collection,
versionable and deleted flags, etc. We've added custom client attributes as
well. A client can add its own higher-level attributes (such as author, title, etc.).
Each operation that stores a document/version in the repository takes a parameter
specifying its metadata record (as an XML document). Metadata documents are
retrieved together with their documents, they can be listed, etc.

The metadata document is stored in parallel within the _meta subcollection, with the
same access key as its document. System attributes are stored as attributes of the
<meta> root tag, and custom user markup is stored within this element:
    <meta key="..." version="..." collection="...">
        <author>KK</author>
        <title>Some document</title>
    </meta>
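
As an illustration only (this is not the repository's actual code), a record of that shape could be assembled as follows. The build_meta helper and the attribute values are made up for the example; only the <meta> layout, with system values as attributes and user markup as child elements, follows the description above.

```python
import xml.etree.ElementTree as ET


def build_meta(system_attrs, custom_fields):
    """Build a <meta> record: system values as attributes, user markup as children."""
    meta = ET.Element("meta", system_attrs)
    for tag, text in custom_fields.items():
        ET.SubElement(meta, tag).text = text
    return meta


meta = build_meta(
    {"key": "doc1", "version": "3", "collection": "/db/articles"},  # made-up values
    {"author": "KK", "title": "Some document"},
)
print(ET.tostring(meta, encoding="unicode"))
```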

In our versioning solution, historical versions of documents are stored
within another system subcollection, _attic, with access keys that are the
document's key with the version id appended.
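
A small sketch of how such keys might be derived, under stated assumptions: the function names are hypothetical, the subcollections are assumed to live directly under the document's collection, and the "." separator between key and version id is a guess, since the message does not say how the version id is appended.

```python
def meta_location(collection, key):
    """Metadata lives in the _meta subcollection under the same access key."""
    return f"{collection}/_meta", key


def attic_location(collection, key, version, sep="."):
    """Old versions live in _attic, keyed by document key plus version id.

    The separator is an assumption made for this example.
    """
    return f"{collection}/_attic", f"{key}{sep}{version}"
```

For example, meta_location("/db/articles", "doc1") gives the pair ("/db/articles/_meta", "doc1").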

This is just a short presentation of the solution we have taken to store metadata.

best regards,

Krzysztof Kowalczykiewicz

Re: Metadata (was: Planning the future)

Posted by Kimbro Staken <ks...@xmldatabases.org>.

On Monday, February 18, 2002, at 03:00 AM, Gianugo Rabellino wrote:

> Tom Bradford wrote:
>> On Saturday, February 16, 2002, at 10:28 PM, Kimbro Staken wrote:
>>> Sorry, that isn't what I meant, I was asking about meta-data. What 
>>> meta-data are we tracking about documents?
>> At the moment, I think just creation time and last modified time.
>
> This would be a good start, but I'm thinking of something more, like 
> having a free-form metadata structure so that the metadata themselves are 
> extensible. IMO there should be three metadata categories:
>
> 1. XML:DB metadata. This should be a joint effort with the XML:DB 
> community, to come up with a common set of sensible metadata. There 
> should also be an API to access them (maybe a service?). I tend to think 
> that these metadata should include at least all the "filesystem like" 
> metadata ("man stat") that make sense in an XML:DB database;
>
> 2. DB specific metadata. Any DB vendor might want to come up with a 
> metadata set which is peculiar to its implementation. As an example 
> Xindice might come up with filer, versioning or linking information;
>
> 3. User provided metadata. Any user should be allowed to store its 
> application-specific metadata. This, IMHO, would be killer. And yes, Tom, 
> this would allow me to have a CMS up in minutes... :)

Does this really need to be part of the database? It seems like the 
flexibility required would lead to a more application-specific solution.

>
> This flexible metadata structure IMO asks for an XML metadata format. 
> RDF/DC seems to be the most obvious choice, but I'm afraid that it might be 
> overkill. Also, while there should be an XML format for the metadata 
> themselves, there should also be some kind of API to quickly access the 
> data without XML programming.
>

Ugg, I can't stand RDF. :-)

> Comments?
>
> -- Gianugo Rabellino
>
>
>
Kimbro Staken
XML Database Software, Consulting and Writing
http://www.xmldatabases.org/