Posted to general@xml.apache.org by Francois Wirth <fr...@is.co.za> on 2000/07/07 08:43:53 UTC

XML Database

Hi,

Just want to know if anyone has considered developing an XML-driven database
like Tamino. It would be nice to work on an open source database that uses
Xerces, Xalan etc. for the XML processing. I know this is a huge project,
but I think there could be a lot of uses for this database and it would be a
challenging project. It could be based purely on XML technology: XQL, SOAP,
XML Schemas etc.

What do you think? 
What is the possibility of this happening?

Thanks

Re: XML Database

Posted by Falko Braeutigam <fa...@softwarebuero.de>.
On Fri, 07 Jul 2000, Stefano Mazzocchi wrote:

[snip]
 
> I have been talking to Falko of the Infozone-group
> (www.infozone-group.org) about joining efforts on this side. They have
> created a project called "Prowler" which is a content management system
> based on XML and on their OODBMS system called "Ozone".
> 
> While they have a GPL-like license, they have already agreed to change it
> to the Apache license and move their community to this project. At this
> point, it's just a matter of deciding what to do.
> 
> I want to be entirely honest with you as I was with the Infozone people: I
> don't picture Prowler as the end, but as the beginning. A way to
> "catalyze" a community around the problem of XML content management
> (with versioning, authorization workflow and all that).
I agree completely. Prowler is by no means an end. We started Infozone/Prowler
because we needed an open source CMS and related technologies. Now there is
an (early) architecture and a (very early) codebase that all interested people
can discuss: a starting point, "a way to 'catalyze' a community...".
Nothing is set in stone yet. The current architecture reflects our current needs
and requirements. Everybody is invited to join the discussion, share their
ideas about an open source CMS, and in general help us with the development.

Regarding the license of Infozone/Prowler: as Stefano already mentioned, we
have agreed to donate Prowler to Apache XML, and thus to change our license, if
you are interested in such a project.


Falko
-- 
______________________________________________________________________
Falko Braeutigam                         mailto:falko@softwarebuero.de
SMB GmbH                                   http://www.softwarebuero.de


Re: XML Database

Posted by Stefano Mazzocchi <st...@apache.org>.
Francois Wirth wrote:
> 
> Hi,
> 
> Just want to know if anyone has considered developing an XML-driven database
> like Tamino. It would be nice to work on an open source database that uses
> Xerces, Xalan etc. for the XML processing. I know this is a huge project,
> but I think there could be a lot of uses for this database and it would be a
> challenging project. It could be based purely on XML technology: XQL, SOAP,
> XML Schemas etc.
> 
> What do you think?

Almost everybody in the Java/XML world, sooner or later, comes to
think: placing objects or trees in a relational database is a pain in
the ass. It's like mixing apples with oranges.

DBMS research created OODBMS along with things like ODMG, object
oriented query languages and such. Another derivation is EJB.

Now we have tree-structured documents.

Suppose you have a million XML pages: these are your data, your content.

The nice thing about trees is that you can add nodes at will; the nice
thing about namespaces is that you can have multiple dimensions without
worrying about name collisions, with XML Schema still able to validate
them.

So, you have n documents and you do

 <xdb:database xmlns:xdb="http://xml.apache.org/xdb">
  <xdb:section xdb:title="documents">
   <xdb:tree xdb:uri="...." xmlns="...">
    <page>
     <title>this is one article</title>
     ...
    </page>
   </xdb:tree>
   ...
  </xdb:section>
  <xdb:section xdb:title="news">
   <xdb:tree xdb:uri="...." xmlns="...">
    <news title="ASF starts an XML database">
     blah blah
    </news>
    ...
   </xdb:tree>
  </xdb:section>
 </xdb:database>

This is the XML "dump" of your database, while internally it should be
able to do special indexing to optimize queries and all that, just like
any DBMS does.

What do you use as a query language?

Possible usages are:

 - xpath
 - xpointer
 - xql

XPointer extends XPath with ranges (which might be very useful in this
case), but it only lets you "pop" data; it has nothing to "push" data.

XQL will surely add the notions of joins, inserts and all that, but I
don't know its current status.
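
To make the query idea concrete, here is a minimal sketch in Python that runs XPath-style lookups over a dump shaped like the one above. It uses only the limited XPath subset of the standard library's `xml.etree.ElementTree`, with no indexing at all. The `urn:example:...` content namespaces, the `xdb:uri` values and the helper `trees_in_section` are placeholders invented for this illustration, since the original example elides them.

```python
import xml.etree.ElementTree as ET

XDB = "http://xml.apache.org/xdb"

# Hypothetical dump modelled on the example above. The xdb:uri values and
# the "urn:example:..." namespaces are placeholders, not real identifiers.
DUMP = """\
<xdb:database xmlns:xdb="http://xml.apache.org/xdb">
  <xdb:section xdb:title="documents">
    <xdb:tree xdb:uri="doc/1" xmlns="urn:example:page">
      <page>
        <title>this is one article</title>
      </page>
    </xdb:tree>
  </xdb:section>
  <xdb:section xdb:title="news">
    <xdb:tree xdb:uri="news/1" xmlns="urn:example:news">
      <news title="ASF starts an XML database">blah blah</news>
    </xdb:tree>
  </xdb:section>
</xdb:database>
"""

# Prefixes used in the path expressions below.
NS = {"xdb": XDB, "p": "urn:example:page", "n": "urn:example:news"}


def trees_in_section(root, title):
    """Yield every xdb:tree inside the section with the given xdb:title."""
    for section in root.findall("xdb:section", NS):
        # Namespaced attributes are keyed as "{namespace-uri}localname".
        if section.get(f"{{{XDB}}}title") == title:
            yield from section.findall("xdb:tree", NS)


root = ET.fromstring(DUMP)

# Titles of all pages stored under "documents".
page_titles = [
    title.text
    for tree in trees_in_section(root, "documents")
    for title in tree.findall("p:page/p:title", NS)
]

# The title attribute of every item stored under "news".
news_titles = [
    news.get("title")
    for tree in trees_in_section(root, "news")
    for news in tree.findall("n:news", NS)
]

print(page_titles)
print(news_titles)
```

Every lookup here walks the whole tree, which is exactly the part a real XML DBMS would replace with indexes.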

Anyway, yes, something like this is _incredibly_ important indeed.

> What is the possibility of this happening?

I have been talking to Falko of the Infozone-group
(www.infozone-group.org) about joining efforts on this side. They have
created a project called "Prowler" which is a content management system
based on XML and on their OODBMS system called "Ozone".

While they have a GPL-like license, they have already agreed to change it
to the Apache license and move their community to this project. At this
point, it's just a matter of deciding what to do.

I want to be entirely honest with you as I was with the Infozone people: I
don't picture Prowler as the end, but as the beginning. A way to
"catalyze" a community around the problem of XML content management
(with versioning, authorization workflow and all that).

Of course, I see lots of uses for Prowler, both from the Cocoon project
and independently as it stands now, but I also see a bright future for
something like this, since it covers a particular aspect of XML that we
do not yet cover and that I believe is very important.

I'm happy you started this discussion so that now I can see your
comments about this.

Again, this is open development: the fact that we start with some code
is _NOT_ to "stamp" a project with the Apache quality label, but to fuel
innovation, increase visibility and accelerate development.

And since I (rather egoistically, I admit :) need such technology, I'd
rather see it happen here, in the Apache spirit, than somewhere else.

What do you think?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------



Re: XML Database

Posted by "Edward Q. Bridges" <ed...@buzznik.com>.
i'd be curious to find out more about this (and possibly help
out too, in whatever small way i may be able to), since we're
using postgres and xml.
it's been a point of frustration for me that there's very little
awareness of xml in the postgres project, and it could probably benefit
from whatever could be done.

--e--



On Mon, 10 Jul 2000 21:47:55 -0400, Joseph Shraibman wrote:

> I have actually thought about extending postgres since it is easy to
> extend.  We could just add a field of type xml and associate a schema
> with that field.
> 
> Francois Wirth wrote:
> > 
> > Hi,
> > 
> > Just want to know if anyone has considered developing an XML-driven database
> > like Tamino. It would be nice to work on an open source database that uses
> > Xerces, Xalan etc. for the XML processing. I know this is a huge project,
> > but I think there could be a lot of uses for this database and it would be a
> > challenging project. It could be based purely on XML technology: XQL, SOAP,
> > XML Schemas etc.
> > 
> > What do you think?
> > What is the possibility of this happening?
> > 
> > Thanks
> > 
> > ---------------------------------------------------------------------
> > In case of troubles, e-mail:     webmaster@xml.apache.org
> > To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
> > For additional commands, e-mail: general-help@xml.apache.org




Re: XML Database

Posted by Joseph Shraibman <jk...@selectacast.net>.
I have actually thought about extending postgres since it is easy to
extend.  We could just add a field of type xml and associate a schema
with that field.
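
As a rough illustration of what a field holding XML could look like from the application side, here is a sketch in Python. It uses the standard library's SQLite bindings as a stand-in, not postgres, and it does not validate against a schema (the standard library has no XML Schema validator). The table `articles`, its `body` column and the SQL function `xml_text` are all hypothetical names made up for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def xml_text(doc, path):
    """SQL-callable helper: text of the first element matching `path`.

    Returns None (SQL NULL) when the stored value is not well-formed XML
    or when nothing matches.
    """
    try:
        return ET.fromstring(doc).findtext(path)
    except ET.ParseError:
        return None


conn = sqlite3.connect(":memory:")
# Register the Python function so it can be called from SQL.
conn.create_function("xml_text", 2, xml_text)

# The XML lives in an ordinary text column; SQLite has no native xml type.
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO articles (body) VALUES (?)",
    [
        ("<page><title>first</title></page>",),
        ("<page><title>second</title></page>",),
        ("<page><title>broken",),  # not well-formed, yields NULL below
    ],
)

titles = [
    row[0]
    for row in conn.execute(
        "SELECT xml_text(body, 'title') FROM articles ORDER BY id"
    )
]
print(titles)
```

Each call re-parses the stored document, so this shows the interface only; a real xml field type would want the database to check and index the content when it is stored.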

Francois Wirth wrote:
> 
> Hi,
> 
> Just want to know if anyone has considered developing an XML-driven database
> like Tamino. It would be nice to work on an open source database that uses
> Xerces, Xalan etc. for the XML processing. I know this is a huge project,
> but I think there could be a lot of uses for this database and it would be a
> challenging project. It could be based purely on XML technology: XQL, SOAP,
> XML Schemas etc.
> 
> What do you think?
> What is the possibility of this happening?
> 
> Thanks

Re: XML Database

Posted by Stefano Mazzocchi <st...@apache.org>.
Tim Bray wrote:
> 
> At 08:43 AM 07/07/00 +0200, Francois Wirth wrote:
> >Just want to know if anyone has considered developing an XML-driven database
> >like Tamino. It would be nice to work on an open source database that uses
> >Xerces, Xalan etc. for the XML processing. I know this is a huge project,
> >but I think there could be a lot of uses for this database and it would be a
> >challenging project. It could be based purely on XML technology: XQL, SOAP,
> >XML Schemas etc.
> 
> Three comments.
> 
> 1. It's going to be really hard.  Core to XML are the notions of nested
>    structures, and of sequence being significant.  In particular XML's notion
>    of "mixed content" is devilishly hard to implement efficiently.
> 
> 2. The "market" (probably not the right word for free software) may not be
>    that big... XML is superoptimized for interchange, and a lot of people
>    who are using it are just using it to pump data back and forth between
>    one boring old database and another.
> 
> 3. In a lot of cases, you can get perfectly good results by storing
>    small chunks of XML in a boring old relational database.
> 
> So to be honest, I'm not sure what the applications for a native XML
> database are.

If you are talking about stuff like

 <orders>
  <order id="384947988">
   <client>blah</client>
   ...
  </order>
  ...
 </orders>

then I'm totally with you. RDBMS will always kick ass on this no matter
what.
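
A small sketch of why record-like XML maps so easily onto a relational table, written in Python with the standard library's SQLite bindings. The second order and the table layout are invented for the example; the original fragment elides everything beyond the first `<client>`.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Record-like XML: every <order> has the same flat shape.
# The second order is a made-up sample row.
ORDERS = """\
<orders>
  <order id="384947988"><client>blah</client></order>
  <order id="384947989"><client>acme</client></order>
</orders>
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, client TEXT NOT NULL)"
)

# One element maps to one row, one attribute or child to one column.
rows = [
    (int(order.get("id")), order.findtext("client"))
    for order in ET.fromstring(ORDERS).findall("order")
]
conn.executemany("INSERT INTO orders (id, client) VALUES (?, ?)", rows)

clients = conn.execute(
    "SELECT id, client FROM orders ORDER BY id"
).fetchall()
print(clients)
```

Nothing is lost in the translation because the structure is regular and element order inside an order carries no meaning.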

But if you have something like
 
 <article>
  <header>
   <abstract>
    <para><em>This</em> document is <keyword>XML specific</keyword></para>
    ...
   </abstract>
  </header>
  <body>
   ...
  </body>
 </article>

No matter how you "chunk" your document into pieces, you are not able to
say "give me the abstract of every document that Stefano wrote", unless
the number of relational tables goes sky high and the SQL complexity gets
incredibly big!

So, result:

 - small structure complexity XML -> Relational DBMS
 - big structure complexity XML -> XML DBMS
 
> Having said all that, people like Software AG are betting big bucks that
> there's a lot of people who want to do this (disclosure: I did a bunch
> of consulting for them back in '97-98) and they may be right.  And if
> someone around here wants to dive into this big ugly problem, that would
> be really cool.  After all, web servers aren't easy either, and back in
> '92 it wasn't obvious that very many people would need them. -Tim

I picture an XML DBMS more as a piece of an XML content management system
than as a "data" base. It's more of a "contentbase" than a
"database", if you forgive me the neologism.

The closest thing we have to a "contentbase" is CVS, but it is not
powerful enough to let you say "give me the abstracts of all the
articles that Stefano wrote, chronologically ordered, the first 10".
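
To show what such a request amounts to, here is a brute-force sketch in Python over a handful of in-memory documents. The `<author>` and `<date>` header elements, the sample articles and the function names are assumptions made for this illustration; the article fragment in this thread does not say where authorship or dates would live.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical articles. <author> and <date> are invented header fields.
ARTICLES = [
    "<article><header><author>Stefano</author><date>2000-07-10</date>"
    "<abstract><para>Second abstract</para></abstract></header>"
    "<body/></article>",
    "<article><header><author>Falko</author><date>2000-07-08</date>"
    "<abstract><para>Other abstract</para></abstract></header>"
    "<body/></article>",
    "<article><header><author>Stefano</author><date>2000-07-07</date>"
    "<abstract><para><em>This</em> document is "
    "<keyword>XML specific</keyword></para></abstract></header>"
    "<body/></article>",
]


def abstract_text(article):
    """Plain text of the abstract, with inline markup flattened away."""
    abstract = article.find("header/abstract")
    return " ".join("".join(abstract.itertext()).split())


def abstracts_by(author, docs, limit=10):
    """Abstracts of `author`'s articles, oldest first, at most `limit`."""
    matches = []
    for doc in docs:
        article = ET.fromstring(doc)
        if article.findtext("header/author") == author:
            written = date.fromisoformat(article.findtext("header/date"))
            matches.append((written, abstract_text(article)))
    matches.sort(key=lambda match: match[0])
    return [text for _, text in matches[:limit]]


result = abstracts_by("Stefano", ARTICLES)
print(result)
```

This parses and scans every document on every call. The point of a "contentbase" would be to answer the same request from indexes over author, date and structure instead.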

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche



Re: XML Database

Posted by Tim Bray <tb...@textuality.com>.
At 08:43 AM 07/07/00 +0200, Francois Wirth wrote:
>Just want to know if anyone has considered developing an XML-driven database
>like Tamino. It would be nice to work on an open source database that uses
>Xerces, Xalan etc. for the XML processing. I know this is a huge project,
>but I think there could be a lot of uses for this database and it would be a
>challenging project. It could be based purely on XML technology: XQL, SOAP,
>XML Schemas etc.

Three comments.

1. It's going to be really hard.  Core to XML are the notions of nested
   structures, and of sequence being significant.  In particular XML's notion
   of "mixed content" is devilishly hard to implement efficiently.

2. The "market" (probably not the right word for free software) may not be
   that big... XML is superoptimized for interchange, and a lot of people
   who are using it are just using it to pump data back and forth between
   one boring old database and another.

3. In a lot of cases, you can get perfectly good results by storing   
   small chunks of XML in a boring old relational database.
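
To illustrate the first point, here is a small Python sketch of what "mixed content" looks like once parsed. The fragment and the `flatten` helper are made up for the example. In the standard library's `xml.etree.ElementTree`, text that follows a child element is stored on that child as its `tail`, so the order of text and elements has to be reconstructed by walking both.

```python
import xml.etree.ElementTree as ET

# A mixed-content paragraph: text and child elements are interleaved,
# and their relative order is significant.
para = ET.fromstring(
    "<para><em>This</em> document is <keyword>XML specific</keyword></para>"
)


def flatten(elem):
    """Yield (kind, value) pairs for the content of `elem` in document order."""
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield ("start", child.tag)
        yield from flatten(child)
        yield ("end", child.tag)
        # Text after a child belongs to the parent but is stored on the child.
        if child.tail:
            yield ("text", child.tail)


events = list(flatten(para))
for event in events:
    print(event)
```

A store that keeps only "element, attributes, one text value" per row would lose the position of " document is " between the two inline elements, which is one reason mixed content is awkward to store efficiently.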

So to be honest, I'm not sure what the applications for a native XML 
database are.

Having said all that, people like Software AG are betting big bucks that
there's a lot of people who want to do this (disclosure: I did a bunch
of consulting for them back in '97-98) and they may be right.  And if
someone around here wants to dive into this big ugly problem, that would
be really cool.  After all, web servers aren't easy either, and back in 
'92 it wasn't obvious that very many people would need them. -Tim