Posted to dev@jackrabbit.apache.org by "Roy T. Fielding" <fi...@gbiv.com> on 2004/10/15 20:13:09 UTC

xml:attribute names

> if it were possible to use e.g. the prefix 'xml' in a jcr name 
> (i.e. name of node/property/node type), it wouldn't be possible to 
> export that content in xml format as the resulting xml would be 
> illegal.

Out of curiosity, how would a JCR repository map the elements and
attributes of a typical XML file if it were configured to break
the content into a tree structure?  I would think that we would
have a content tree that matched the document tree, such as the
following example

=====
<svg xmlns="http://www.w3.org/2000/svg"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xml:lang="en"
      viewBox="0 0 640 480" preserveAspectRatio="xMidYMid meet">
<style type="text/css">...
=====
    svg
     |
     +--- xmlns                  "http://www.w3.org/2000/svg"
     +--- xmlns:xlink            "http://www.w3.org/1999/xlink"
     +--- xml:lang               "en"
     +--- viewBox                "0 0 640 480"
     +--- preserveAspectRatio    "xMidYMid meet"
     +--- nt:content
            |
            +---  style
                    |
                    +--- type    "text/css"
                    +--- nt:content
                           ...
=====
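
For anyone who wants to experiment with the mapping drawn above, here is a minimal sketch in Python using only the standard library, not any JCR API. It assumes each element becomes a node, each attribute (including the xmlns declarations) becomes a string property, and child elements are collected under nt:content as in the diagram. The function name element_to_node and the dict layout are made up for illustration.

```python
from xml.dom import minidom

SVG = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink" '
    'xml:lang="en" '
    'viewBox="0 0 640 480" preserveAspectRatio="xMidYMid meet">'
    '<style type="text/css"/>'
    '</svg>'
)

def element_to_node(elem):
    """Map one DOM element to a dict: attributes become properties,
    child elements are collected under an 'nt:content' list."""
    node = {"name": elem.tagName, "properties": {}, "nt:content": []}
    attrs = elem.attributes
    for i in range(attrs.length):
        attr = attrs.item(i)
        # minidom keeps xmlns, xmlns:xlink and xml:lang as ordinary attributes
        node["properties"][attr.name] = attr.value
    for child in elem.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            node["nt:content"].append(element_to_node(child))
    return node

tree = element_to_node(minidom.parseString(SVG).documentElement)
```

Note that property names such as xml:lang and xmlns:xlink end up as item names in this sketch, which is exactly the situation the quoted concern at the top is about.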

But that raises a number of questions:

   1) does jcr have a notion of an inherited namespace like xmlns,
      or does it require explicit typing of each node?

   2) do we need to hardcode namespace values for xml and xmlns?

   3) is the above a reasonable way to map XML elements and attributes?

An alternate mapping would be something like

    x:document
       x:element
           x:name       "svg"
           x:attrib
               x:name   "xmlns"
               x:value  "http://www.w3.org/2000/svg"
           x:attrib
               x:name   "xmlns:xlink"
               x:value  "http://www.w3.org/1999/xlink"
           x:attrib
               x:name   "xml:lang"
               x:value  "en"
           x:attrib
               x:name   "viewBox"
               x:value  "0 0 640 480"
           x:attrib
               x:name   "preserveAspectRatio"
               x:value  "xMidYMid meet"
           x:content
              x:element
                  x:name   "style"
                  x:attrib
                      x:name   "type"
                      x:value  "text/css"
                  x:content
                     ...
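
The alternate mapping can be sketched the same way. This is only an illustration in standard-library Python; the x:* keys mirror the outline above, and generic_element is a hypothetical helper, not part of any repository API. The point it shows is that XML names are stored as values, so they never become item names.

```python
from xml.dom import minidom

def generic_element(elem):
    """Return an x:element record whose XML names are stored as values."""
    attribs = [
        {"x:name": elem.attributes.item(i).name,
         "x:value": elem.attributes.item(i).value}
        for i in range(elem.attributes.length)
    ]
    content = [generic_element(c) for c in elem.childNodes
               if c.nodeType == c.ELEMENT_NODE]
    return {"x:element": {"x:name": elem.tagName,
                          "x:attrib": attribs,
                          "x:content": content}}

doc = minidom.parseString(
    '<svg xmlns="http://www.w3.org/2000/svg" xml:lang="en">'
    '<style type="text/css"/></svg>'
)
generic = {"x:document": generic_element(doc.documentElement)}
```

The price is that a structured query has to match on x:name values instead of walking named children.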

That places the namespace names in our content, which is much
less appealing for folks who want to use jcr as a front-end for
XML object stores.  News organizations like to store their content
as XML that is amenable to structured queries.

Is there "one true way" to do this, or does jackrabbit need to support
both ways (and possibly more)?

....Roy

Re: xml:attribute names

Posted by David Nuescheler <da...@gmail.com>.

hi roy,

interesting example...

> =====
> <svg xmlns="http://www.w3.org/2000/svg"
>      xmlns:xlink="http://www.w3.org/1999/xlink"
>      xml:lang="en"
>      viewBox="0 0 640 480" preserveAspectRatio="xMidYMid meet">
> <style type="text/css">...
> =====
>    svg
>     |
>     +--- xmlns                  "http://www.w3.org/2000/svg"
>     +--- xmlns:xlink            "http://www.w3.org/1999/xlink"
>     +--- xml:lang               "en"
>     +--- viewBox                "0 0 640 480"
>     +--- preserveAspectRatio    "xMidYMid meet"
>     +--- nt:content
>            |
>            +---  style
>                    |
>                    +--- type    "text/css"
>                    +--- nt:content
>                           ...
> =====

assuming a jcr-documentview import your above example should 
be turned into (anybody please correct me if i am wrong):
---
+--- myfilename.svg [if you had one]
      |
      +--- jcr:created "2004-09-15T15:31:10.125+01:00"
      +--- jcr:uuid "1234-1234-12341234..."
      +--- jcr:content
            |
            +--- svg
                  |
                  +--- svg:viewBox                      "0 0 640 480"
                  +--- svg:preserveAspectRatio    "xMidYMid meet"
                  +--- svg:style
                        |
                        +--- svg:type                     "text/css"
                        .....

while all the xml* prefixes are considered reserved and dealt 
with on import, the svg namespace would be registered automatically 
in the repository's namespace registry (in case it is not registered
already the prefix would end up being machine generated and would
in your example have to be remapped later to something as pretty 
as "svg")
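
a rough sketch of the registration step described above, in plain python rather than the jcr api. the class NamespaceRegistry, the "ns1" style of generated prefix and the helper qualified_name are all assumptions made for illustration, and only element names are handled, not attributes:

```python
import xml.etree.ElementTree as ET

class NamespaceRegistry:
    """Toy repository-wide URI -> prefix table (hypothetical, not the JCR API)."""

    def __init__(self):
        self._prefix_by_uri = {}
        self._counter = 0

    def prefix_for(self, uri):
        # Unknown URIs get a machine-generated prefix such as "ns1".
        if uri not in self._prefix_by_uri:
            self._counter += 1
            self._prefix_by_uri[uri] = f"ns{self._counter}"
        return self._prefix_by_uri[uri]

    def remap(self, uri, new_prefix):
        # Replace the generated prefix with a friendlier one.
        if new_prefix in self._prefix_by_uri.values():
            raise ValueError(f"prefix {new_prefix!r} already in use")
        self._prefix_by_uri[uri] = new_prefix

def qualified_name(tag, registry):
    """Turn ElementTree's '{uri}local' form into 'prefix:local'."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return f"{registry.prefix_for(uri)}:{local}"
    return tag

registry = NamespaceRegistry()
root = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg"><style type="text/css"/></svg>'
)
before = [qualified_name(e.tag, registry) for e in root.iter()]
registry.remap("http://www.w3.org/2000/svg", "svg")
after = [qualified_name(e.tag, registry) for e in root.iter()]
```

ElementTree consumes the xmlns declarations while parsing, so they never show up as attributes here, which is roughly the "dealt with on import" behaviour.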

>   1) does jcr have a notion of an inherited namespace like xmlns,
>      or does it require explicit typing of each node?
hmmm.... there is no node-based inheritance. namespace-uri 
/ prefix mappings are defined on a per-repository basis but 
can be remapped dynamically on the session. 
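
to make the repository/session split concrete, here is a small sketch in plain python. ToySession and its methods are hypothetical names, not the jcr api; the only idea shown is a per-session override layered on the shared mappings:

```python
class ToySession:
    """Toy session that layers local prefix overrides on shared mappings."""

    def __init__(self, repository_prefixes):
        self._shared = repository_prefixes   # uri -> prefix, repository-wide
        self._local = {}                     # uri -> prefix, this session only

    def set_prefix(self, uri, prefix):
        # Affects only this session; the shared table is left untouched.
        self._local[uri] = prefix

    def prefix_for(self, uri):
        return self._local.get(uri, self._shared[uri])

SVG_URI = "http://www.w3.org/2000/svg"
repository_prefixes = {SVG_URI: "ns1"}

first = ToySession(repository_prefixes)
second = ToySession(repository_prefixes)
first.set_prefix(SVG_URI, "svg")
```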

>   2) do we need to hardcode namespace values for xml and xmlns?
xml* prefixes should be reserved and treated specially.

>   3) is the above a reasonable way to map XML elements and attributes?
we thought so ;)

> An alternate mapping would be something like
> 
>    x:document
>       x:element
>           x:name       "svg"
>           x:attrib
>               x:name   "xmlns"
>               x:value  "http://www.w3.org/2000/svg"
>           x:attrib
>               x:name   "xmlns:xlink"
>               x:value  "http://www.w3.org/1999/xlink"
>           x:attrib
>               x:name   "xml:lang"
>               x:value  "en"
>           x:attrib
>               x:name   "viewBox"
>               x:value  "0 0 640 480"
>           x:attrib
>               x:name   "preserveAspectRatio"
>               x:value  "xMidYMid meet"
>           x:content
>              x:element
>                  x:name   "style"
>                  x:attrib
>                      x:name   "type"
>                      x:value  "text/css"
>                  x:content
>                     ...
> 
> That places the namespace names in our content, which is much
> less appealing for folks who want to use jcr as a front-end for
> XML object stores.  News organizations like to store their content
> as XML that is amenable to structured queries.
exactly.

> Is there "one true way" to do this, or does jackrabbit need to support
> both ways (and possibly more)?
with the document view and the system view there are two
specified ways to transform xml into content
and vice versa. those two are mandated by the spec for 
every repository to support.

the document view and the system view have totally separate
characteristics:
a) system view:
- can expose all content
- operates on a given structure
- useful to back up, migrate, syndicate or replicate content

b) document view:
- any existing xml-document can be mapped to content
- cannot expose all the information stored in a repository (possibly lossy)
- useful for xml based content editing or publishing
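
a small sketch of the contrast between a) and b), in standard-library python. the sv:node / sv:property shape and the namespace uri are assumptions modelled on the system view of the jsr 170 spec, and the content tree is invented for the example. it serializes the same tree both ways; the property types survive only in the system view:

```python
import xml.etree.ElementTree as ET

SV = "http://www.jcp.org/jcr/sv/1.0"  # assumed system view namespace URI
ET.register_namespace("sv", SV)

# (name, {property: (type, value)}, [children]): an illustrative content tree
TREE = ("article", {"title": ("String", "hello"), "hits": ("Long", "42")},
        [("para", {"lang": ("String", "en")}, [])])

def document_view(node):
    """Nodes become elements, properties become attributes; types are dropped."""
    name, props, children = node
    elem = ET.Element(name, {k: v for k, (_t, v) in props.items()})
    elem.extend([document_view(c) for c in children])
    return elem

def system_view(node):
    """Every node is an sv:node, every property an sv:property with its type."""
    name, props, children = node
    elem = ET.Element(f"{{{SV}}}node", {f"{{{SV}}}name": name})
    for prop_name, (prop_type, value) in props.items():
        prop = ET.SubElement(
            elem, f"{{{SV}}}property",
            {f"{{{SV}}}name": prop_name, f"{{{SV}}}type": prop_type})
        ET.SubElement(prop, f"{{{SV}}}value").text = value
    elem.extend([system_view(c) for c in children])
    return elem

doc_xml = ET.tostring(document_view(TREE), encoding="unicode")
sys_xml = ET.tostring(system_view(TREE), encoding="unicode")
```

in the document view output "hits" is just the string "42"; that it was a Long is recoverable only from the system view.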

the document view should allow people to take all their 
xml-documents, put them into the content repository and
therefore benefit from all the repository services (observation, 
versioning, searching, ...) and still be able to use their
existing xml applications to work with the content the
same way as before, but probably with a substantial 
performance benefit since the repository will not have to
parse the xml to generate sax events.

personally i believe that there are many other ways to 
serialize from and to xml that are very attractive for specific 
applications. so applications may come up with their own 
mappings from/to xml.

additionally, i also think that xml is not the only 
interesting serialization. when we were talking about the 
relational view that we use for the sql query language 
we ended up with a data model that would lend 
itself to be displayed as tab-separated or csv tables that 
can be edited in excel ;)
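
as a toy illustration of such a tabular serialization (plain python; the rows, the column names including jcr:path, and the helper to_tsv are invented for the example and are not an actual query result):

```python
import csv
import io

# Illustrative nodes: one dict per node, one key per property.
nodes = [
    {"jcr:path": "/news/a", "title": "Budget vote", "lang": "en"},
    {"jcr:path": "/news/b", "title": "Wahlabend", "lang": "de"},
]

def to_tsv(rows, columns):
    """Render one row per node and one column per property, tab-separated."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                            lineterminator="\n", restval="")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

table = to_tsv(nodes, ["jcr:path", "title", "lang"])
```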

just my $0.02.
in case i completely misread your post, i apologize.

regards,
david

ps: the current docview, for example, can only do limited
round tripping, in the sense that the order of the attributes,
comments, whitespace between the attributes, etc. are 
not preserved. so an application that would like to do that
might have to extend the current serialization.
and i am sure that it is important to some apps to get 
exactly the same xml out of the repository again.
----------------------------------------------------------------------
standardize your content-repository !
                               http://www.jcp.org/en/jsr/detail?id=170
---------------------------------------< david.nuescheler@day.com >---


mailto:david.nuescheler@day.com
http://www.day.com

David Nuescheler
Chief Technology Officer
Day Software AG
Barfuesserplatz 6 / Postfach
4001 Basel
Switzerland

T  41 61 226 98 98
F  41 61 226 98 97