Posted to dev@jspwiki.apache.org by Janne Jalkanen <Ja...@ecyrd.com> on 2008/02/03 23:05:06 UTC

JSPWiki 3 design notes

Hi folks!

I added my own design notes to the following page, detailing some of  
the changes that should go into JSPWiki version 3 (so that we can  
plan 2.8 with all this stuff in mind).

http://www.jspwiki.org/wiki/JSPWiki3Design

Please discuss this on the mailing list instead of the wiki page, as  
mailing lists are inherently better for discussion than wikipages.   
Let's keep that page in DocumentMode, ok?

/Janne

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

To get back to this...

> Oh, I agree -- I think all of the fields in the abstract or API schema
> would be wiki:, such that if a given implementor (such as me) wants to
> use DC, any reference to say, dc:creator is automatically mapped to
> wiki:author (or whatever it is) in the backend. I'm not suggesting that
> a transformation happen on the fly, as if one could change the schema
> of an existing installation. This would be a configuration issue, and
> I've already got a pretty good idea how to do it.

Ah, ok, now I see what you're after.  I agree, this might probably be  
the best thingy overall.  If we have a very tight specification for  
our own metadata items, they will be unambiguous to us, and it will  
allow 3rd party provider developers to store them in whatever format  
they want.  We would probably ship with a simple mapping to a file  
system/db, but you would be free to store the wiki:author under  
dc:author or whatever.

This sounds fair to me.

> No, it's not. That's what you use an application profile to define.
> DC is designed for broad interoperability across thousands of systems,
> but if you want to constrain things or use particular encodings,
> you do that in an application profile. The profile has to be in accord
> with DC, IOW it is stricter than DC. DC has to be lenient by design.

I still think this creates IOP problems, but that's now outside this  
discussion.  I personally don't like specs which leave too many
things for "application profiles".  But then again, I work for a
telecommunications device manufacturer, who tend to be very precise  
when writing a spec ;-)

> To reiterate, what I'd suggest is that we define an abstract metadata
> schema in our own namespace, with as tight a set of constraints and
> definitions as we need to function. That's what the system uses. I
> can *still* write that up as an application profile for DC, using DC
> terms where they make sense and putting anything else in as extensions.

Yup, I think we're in agreement here.

Would you like to write up the discussion to the JSPWiki3Design page,  
and refresh the metadata items to what we discussed, and have a first  
stab at the definitions?

/Janne

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
>> I don't think it will. There's a core set of fields but their names
>> should probably be abstractions. I'm trying to think through how this
>> might work without loads of problems. There's so many applications
>> for JSPWiki (in terms of how it might fit into other applications)
>> that we'll need to fit into others' metadata schemes. What I'm
>> talking about are really surface names for things.
> 
> Yes, it will.  If the provider has to figure out mapping between 
> different concepts in the database, it'll create problems.

Not if the provider is using the same identifiers as everything else,
with all this determined prior to anything firing up. Basically the
idea is that there's an abstract metadata schema, with a reference
implementation. People could add additional fields, but not remove the core.
There'd be no confusion for a given installation.

> This is exactly why namespaces were invented, and this is also why it 
> would probably be a better idea NOT to reuse Dublin Core, but to stick 
> to our own schema.

Oh, I agree -- I think all of the fields in the abstract or API schema
would be wiki:, such that if a given implementor (such as me) wants to
use DC, any reference to say, dc:creator is automatically mapped to
wiki:author (or whatever it is) in the backend. I'm not suggesting that
a transformation happen on the fly, as if one could change the schema
of an existing installation. This would be a configuration issue, and
I've already got a pretty good idea how to do it.

>> Well, yes, but also having the field names match a given schema. Maybe
>> some kind of transformation feature, dunno.
> 
> I think namespaces are quite enough for us.  I don't really want to code 
> for the case in case someone wants to use "wiki:author" for some other 
> purpose.

They wouldn't be able to since the backend uses wiki:author. But if they
provided a mapping so that dc:creator was considered the same as wiki:author,
they could use (potentially) either.

> If people want, they *can* rewrite their own backend in such a way that 
> it converts everything into paper notes stuck onto a donkey glued to a
> wall somewhere in Pakistan with the word "CUCKOO" written on the 
> backside - but after the JCR interface, I don't really care what 
> transformations you do.

Again, the transformations aren't dynamic, they're part of config.

>> Well, I also mentioned that I really doubt that I'd be using dc:identifier
>> for those purposes within the JSPWiki metadata profile. I can also see
>> creating a suitable ID within our own namespace, but I really think
>> dc:identifier would suit fine. We'd not be abusing it at all.
> 
> Ah yes, now I found it.  From RFC 5013:
> 
> <snip>
> [...]
> </snip>
> 
> I like atom:id much more than the dc:identifier, because
> a) [...]
> Since atom:id is a machine-processable entity, having clear, 
> machine-understandable rules as to what it really is, is very, very 
> important.  For dc:identifier, it's pretty much handwaving.

No, it's not. That's what you use an application profile to define.
DC is designed for broad interoperability across thousands of systems,
but if you want to constrain things or use particular encodings,
you do that in an application profile. The profile has to be in accord
with DC, IOW it is stricter than DC. DC has to be lenient by design.

>> Not that I'm aware of. DC doesn't get into that kind of thing much
>> except when you get to things like dates.
> 
> I would actually like to use the atom:person construct here, since it 
> has better semantics (it adds an IRI to a name, which can be useful in 
> figuring out across wikis who actually authored what).  But it might be 
> easier just to store a local identifier, in which case dc is as good
> as any.
> 
>> It certainly suits the role of both dc:creator, editor, translator,
>> etc. (i.e., very general purpose), anyone who contributes to the
>> resource.
> 
> But again, the definition is a bit handwavy.

Again, this is the place for an application profile.

>>>> Recommendation: Use DCTERMS.format. This is the term used to contain
>>>> a format identifier.  While I recognise that these discussions tend to
>>> I would need to check if it's okay.
>>
>> That one is pretty common.
> 
> Unfortunately, it just says that the "best practice" is to use something 
> like MIME.  Now the problem is that in order to consider e.g. data 
> portability, there's no way to say that "this dcterms:format" means a 
> MIME type.  So again, a system processing the information needs to 
> resort to context-sensitive processing (e.g. "ok, so this comes from 
> jspwiki, so it's always a MIME type").    Which isn't really very good.  
> This is why I would like to have an unambiguous "wiki:contentType"
> definition, which can also be reflected in a non-modifiable 
> pseudoproperty "dcterms:format".
> 
> E.g. "wiki:contentType contains a STRING, which denotes the MIME content 
> type of the content as defined in RFC XXXX [MIME]."
> 
> For example, if it's just defined as a String, how do you define 
> equivalence rules?  Is it okay to put in IMAGE/JPG, or ImAgE/jpG, or 
> image/jpg? If you do not know that these are MIME types, and RFC XXXX 
> defines MIME comparison as case-insensitive, then your application might 
> be functioning wrong.
> 
> This is really my gripe with Dublin Core - it leaves too much up for 
> interpretation.  Which makes it really good for people, but cumbersome 
> for computers.

Again, application profile.

>> It's a Big Deal for a lot of people, I probably don't care much either.
>> I use 'text/wiki' for general purpose wiki text and the application
>> one above to specifically tag JSPWiki wiki text.
> 
> I don't think you can use text/wiki - it's missing the "x-" ;-)

Oops. My fault. There should have been an "x-" in there.

To reiterate, what I'd suggest is that we define an abstract metadata
schema in our own namespace, with as tight a set of constraints and
definitions as we need to function. That's what the system uses. I
can *still* write that up as an application profile for DC, using DC
terms where they make sense and putting anything else in as extensions.
The actual namespace of the definitions would be wiki:, but it would
be dc: compatible under the covers. Then, the reference implementation
would be wiki: but permit mapping to whatever a user required for their
own systems. In my case I'd use the reference implementation for most
of my projects, but in cases where I've got to work within an existing
(e.g., library) CMS or other system, I'd configure that wiki instance
to use a dc: set of terms instead of wiki:. I'm pretty sure this is
workable.
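
As an illustration of the configuration-time mapping described above, here
is a minimal Java sketch. The class MetadataNameMapper and its methods are
hypothetical and not part of JSPWiki; the sketch only assumes that aliases
such as dc:creator are registered once at startup and always resolve to a
fixed wiki: name, with no on-the-fly schema changes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: resolves configured aliases (e.g. dc:creator) to canonical wiki: names. */
public class MetadataNameMapper {
    private final Map<String, String> aliasToCanonical = new HashMap<>();

    /** Registers an alias at configuration time; the core wiki: names cannot be redefined. */
    public void addAlias(String alias, String canonical) {
        if (alias.startsWith("wiki:")) {
            throw new IllegalArgumentException("wiki: names are reserved: " + alias);
        }
        if (!canonical.startsWith("wiki:")) {
            throw new IllegalArgumentException("target must be a wiki: name: " + canonical);
        }
        aliasToCanonical.put(alias, canonical);
    }

    /** Returns the canonical wiki: name for an alias, or the name itself if it is unmapped. */
    public String resolve(String name) {
        return aliasToCanonical.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        MetadataNameMapper mapper = new MetadataNameMapper();
        mapper.addAlias("dc:creator", "wiki:author");
        System.out.println(mapper.resolve("dc:creator"));  // prints wiki:author
        System.out.println(mapper.resolve("wiki:author")); // prints wiki:author
    }
}
```

With something like this in place, the backend only ever sees wiki:author,
while a given installation may refer to the same field as dc:creator.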

Murray

...........................................................................
Murray Altheim <murray07 at altheim.com>                           ===  = =
http://www.altheim.com/murray/                                     = =  ===
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk               = =  = =

       Boundless wind and moon - the eye within eyes,
       Inexhaustible heaven and earth - the light beyond light,
       The willow dark, the flower bright - ten thousand houses,
       Knock at any door - there's one who will respond.
                                       -- The Blue Cliff Record

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> I don't think it will. There's a core set of fields but their names
> should probably be abstractions. I'm trying to think through how this
> might work without loads of problems. There's so many applications
> for JSPWiki (in terms of how it might fit into other applications)
> that we'll need to fit into others' metadata schemes. What I'm
> talking about are really surface names for things.

Yes, it will.  If the provider has to figure out mapping between  
different concepts in the database, it'll create problems.

This is exactly why namespaces were invented, and this is also why it  
would probably be a better idea NOT to reuse Dublin Core, but to  
stick to our own schema.

> Well, yes, but also having the field names match a given schema. Maybe
> some kind of transformation feature, dunno.

I think namespaces are quite enough for us.  I don't really want to  
code for the case in case someone wants to use "wiki:author" for some  
other purpose.

If people want, they *can* rewrite their own backend in such a way  
that it converts everything into paper notes stuck onto a donkey
glued to a wall somewhere in Pakistan with the word "CUCKOO" written  
on the backside - but after the JCR interface, I don't really care  
what transformations you do.

> Well, I also mentioned that I really doubt that I'd be using dc:identifier
> for those purposes within the JSPWiki metadata profile. I can also see
> creating a suitable ID within our own namespace, but I really think
> dc:identifier would suit fine. We'd not be abusing it at all.

Ah yes, now I found it.  From RFC 5013:

<snip>
"Element Name:   identifier

    Label:       Identifier
    Definition:  An unambiguous reference to the resource within a given
                 context.
    Comment:     Recommended best practice is to identify the
                 resource by means of a string conforming
                 to a formal identification system."
</snip>

Whereas from RFC 4287 (Atom)

<snip>
"Its content MUST be an IRI, as defined by [RFC3987].  Note that the
    definition of "IRI" excludes relative references.  Though the IRI
    might use a dereferencable scheme, Atom Processors MUST NOT assume it
    can be dereferenced.

    When an Atom Document is relocated, migrated, syndicated,
    republished, exported, or imported, the content of its atom:id
    element MUST NOT change.  Put another way, an atom:id element
    pertains to all instantiations of a particular Atom entry or feed;
    revisions retain the same content in their atom:id elements.  It is
    suggested that the atom:id element be stored along with the
    associated resource.

    The content of an atom:id element MUST be created in a way that
    assures uniqueness.

    Because of the risk of confusion between IRIs that would be
    equivalent if they were mapped to URIs and dereferenced, the
    following normalization strategy SHOULD be applied when generating
    atom:id elements:

    o  Provide the scheme in lowercase characters.
    o  Provide the host, if any, in lowercase characters.
    o  Only perform percent-encoding where it is essential.
    o  Use uppercase A through F characters when percent-encoding.
    o  Prevent dot-segments from appearing in paths.
    o  For schemes that define a default authority, use an empty
       authority if the default is desired.
    o  For schemes that define an empty path to be equivalent to a path
       of "/", use "/".
    o  For schemes that define a port, use an empty port if the default
       is desired.
    o  Preserve empty fragment identifiers and queries.
    o  Ensure that all components of the IRI are appropriately character
       normalized, e.g., by using NFC or NFKC.

4.2.6.1.  Comparing atom:id

    Instances of atom:id elements can be compared to determine whether an
    entry or feed is the same as one seen before.  Processors MUST
    compare atom:id elements on a character-by-character basis (in a
    case-sensitive fashion).  Comparison operations MUST be based solely
    on the IRI character strings and MUST NOT rely on dereferencing the
    IRIs or URIs mapped from them.

    As a result, two IRIs that resolve to the same resource but are not
    character-for-character identical will be considered different for
    the purposes of identifier comparison.

    For example, these are four distinct identifiers, despite the fact
    that they differ only in case:

       http://www.example.org/thing
       http://www.example.org/Thing
       http://www.EXAMPLE.org/thing
       HTTP://www.example.org/thing

    Likewise, these are three distinct identifiers, because IRI
    %-escaping is significant for the purposes of comparison:

       http://www.example.com/~bob
       http://www.example.com/%7ebob
       http://www.example.com/%7Ebob"

</snip>

I like atom:id much more than the dc:identifier, because
a) atom:id conforms to very precise semantics, including comparison  
rules (which dc:identifier does not give)
b) atom:id is defined as globally unique and non-dereferenceable  
(which helps a *lot* when you don't get people assuming that there's  
something at the end of your IRI)
c) atom:id is defined as an IRI instead of a URI (small difference,
but might be important)
d) atom:id is defined as unique across the entire lifespan of the  
entity, which dc:identifier is not.
e) Atom feeds make a lot of sense to use, even in a wiki context (and  
you need the atom:id anyway)

Since atom:id is a machine-processable entity, having clear,
machine-understandable rules as to what it really is, is very, very
important.  For dc:identifier, it's pretty much handwaving.
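
To make the comparison rule quoted above concrete, here is a small Java
sketch. The class and method names are made up for illustration; the only
thing taken from RFC 4287 is the rule itself (character-by-character,
case-sensitive comparison, no dereferencing) and the example IRIs.

```java
/** Illustrative sketch of the atom:id comparison rule from RFC 4287, section 4.2.6.1. */
public class AtomIdComparison {

    /** Two atom:id values are the same only if their character strings are identical. */
    public static boolean sameAtomId(String a, String b) {
        // String.equals is case-sensitive and never normalizes or dereferences.
        return a.equals(b);
    }

    public static void main(String[] args) {
        String[] ids = {
            "http://www.example.org/thing",
            "http://www.example.org/Thing",
            "http://www.EXAMPLE.org/thing",
            "HTTP://www.example.org/thing"
        };
        // Only the first line reports true: all four IRIs are distinct identifiers.
        for (String id : ids) {
            System.out.println(id + " same as first? " + sameAtomId(ids[0], id));
        }
    }
}
```

The point of the sketch is that a processor needs no context at all to
decide whether two identifiers are equal, which is exactly what the
dc:identifier definition leaves open.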

> Not that I'm aware of. DC doesn't get into that kind of thing much
> except when you get to things like dates.

I would actually like to use the atom:person construct here, since it  
has better semantics (it adds an IRI to a name, which can be useful  
in figuring out across wikis who actually authored what).  But it  
might be easier just to store a local identifier, in which case dc
is as good as any.

> It certainly suits the role of both dc:creator, editor, translator,
> etc. (i.e., very general purpose), anyone who contributes to the
> resource.

But again, the definition is a bit handwavy.

>>> Recommendation: Use DCTERMS.format. This is the term used to contain
>>> a format identifier.  While I recognise that these discussions tend to
>> I would need to check if it's okay.
>
> That one is pretty common.

Unfortunately, it just says that the "best practice" is to use  
something like MIME.  Now the problem is that in order to consider  
e.g. data portability, there's no way to say that "this  
dcterms:format" means a MIME type.  So again, a system processing the  
information needs to resort to context-sensitive processing (e.g.  
"ok, so this comes from jspwiki, so it's always a MIME type").     
Which isn't really very good.  This is why I would like to have an  
unambiguous "wiki:contentType" definition, which can also be reflected
in a non-modifiable pseudoproperty "dcterms:format".

E.g. "wiki:contentType contains a STRING, which denotes the MIME  
content type of the content as defined in RFC XXXX [MIME]."

For example, if it's just defined as a String, how do you define  
equivalence rules?  Is it okay to put in IMAGE/JPG, or ImAgE/jpG, or  
image/jpg? If you do not know that these are MIME types, and RFC XXXX  
defines MIME comparison as case-insensitive, then your application  
might be functioning wrong.
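
The equivalence problem can be sketched in a few lines of Java. This
assumes, as the paragraph above does, that MIME type/subtype comparison is
case-insensitive; the class and method names are hypothetical, and MIME
parameters (e.g. "; charset=...") are deliberately not handled.

```java
import java.util.Locale;

/** Illustrative sketch: comparing content types once they are known to be MIME types. */
public class ContentTypeComparison {

    /** Lowercases the type/subtype so that equal MIME types get equal strings. */
    public static String normalize(String mimeType) {
        // Locale.ROOT avoids locale-specific case mappings (e.g. the Turkish dotless i).
        return mimeType.trim().toLowerCase(Locale.ROOT);
    }

    /** Case-insensitive equivalence, valid only because the value is defined to be a MIME type. */
    public static boolean sameContentType(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        System.out.println(sameContentType("IMAGE/JPG", "image/jpg")); // prints true
        System.out.println(sameContentType("ImAgE/jpG", "image/jpg")); // prints true
        System.out.println("IMAGE/JPG".equals("image/jpg"));           // prints false
    }
}
```

A consumer that only knows the value is "a String" has to fall back to the
last comparison and gets the wrong answer.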

This is really my gripe with Dublin Core - it leaves too much up for  
interpretation.  Which makes it really good for people, but  
cumbersome for computers.

> It's a Big Deal for a lot of people, I probably don't care much either.
> I use 'text/wiki' for general purpose wiki text and the application
> one above to specifically tag JSPWiki wiki text.

I don't think you can use text/wiki - it's missing the "x-" ;-)

It might be interesting to just adopt the practice other wikiengines  
are using.

/Janne

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
>> Now, before getting into this too deeply it occurs to me that we might
>> consider a pluggable meta API rather than single metadata schema. There
> 
> Um.  Pluggable?  No.  That'll create loads of problems.

I don't think it will. There's a core set of fields but their names
should probably be abstractions. I'm trying to think through how this
might work without loads of problems. There's so many applications
for JSPWiki (in terms of how it might fit into other applications)
that we'll need to fit into others' metadata schemes. What I'm
talking about are really surface names for things.

> User-access to the metadata?  Absolutely.   And I think that is what
> you really mean - ability to add your own arbitrary metadata for any
> Node.

Well, yes, but also having the field names match a given schema. Maybe
some kind of transformation feature, dunno.

>> WorldCat), Dublin Core is used in almost the entirety of the world's
>> libraries for lightweight interchangeable metadata and is compatible
>> with and/or the basis of the designs used by the W3C and its "semantic
>> web".
> 
> Semantic web is actually a load of bollocks.  But it has some nice
> ideas.

Oh, you don't need to convince me of that. I'm on public record with
quite a number of people at the W3C for stating pretty much the same
thing, with little extra diplomacy.

>> I note that many of the proposed field names come from Atom. While this
>> is perhaps an appropriate usage, Atom is a syndication schema, not a
>> content repository schema. There's not a huge difference and Atom is in
>> large parts (semantically) compatible with and influenced by Dublin Core
>> (e.g., choice of atom:creator). For documents stored in a repository I
>> believe Dublin Core is likely more appropriate.
> 
> There are some reasons why I chose Atom identifiers; I was involved in
> its definition (somewhat), and therefore I know some of the reasons
> why Atom does not use Dublin Core.  Partly because some of the
> definitions were a bit complicated.

Hmm. I find DC pretty simple in general, at least for the majority of
terms we'd use.

>> Historically, there are two Dublin Core schemas, DC.* and DCTERMS.*.
>> The original core set (about a dozen) of Dublin Core Metadata Elements
>> (DC.*) have been grandfathered into the set of DC Terms (DCTERMS, see
>> footnote). For our purposes below, we can consider DC.* and DCTERMS.*
>> as identical namespaces (they by definition now are).
> 
> I wasn't aware of dcterms.  Interesting.

Mostly DCTERMS moves the qualifiers down into a flattened namespace,
which is simpler, certainly.

>>  * atom:updated As in RFC 4287. This is a DATE.
>>
>> Recommendation: Use DCTERMS.modified. [DC.date or DC.date.modified]
>> DATE.
> 
> The semantics of atom:updated and dcterms.modified differ - and I seem
> to recall that that difference is minuscule, but actually very
> important.  Can't dig up the reference now, will do later.

Would be interested, thanks.

>>  * atom:published As in RFC 4287. As JSPWiki does not yet support
>>   "draft" -pages, this is essentially a creation date. NB: This cannot
>>    be checked from page version #1, because that might be deleted.
>>    This is a DATE.
>>
>> Recommendation: Use DCTERMS.created. Agreed: this must be carried
>> through all revisions since it provides a canonical container for the
>> origin date of the document. [DC.date or DC.date.created]
>> DATE.
> 
> Probably better.
> 
>>  * atom:id As in RFC 4287. This has some advantages, and can easily be
>>    tied to the JCR jcr:uuid. This is a STRING.
>>
>> Recommendation: Use DCTERMS.identifier. [DC.identifier]
>>    STRING (URI?)
> 
> Nope.  Atom:id is a very, very useful construct.  As you mentioned in
> the last email, you probably want to use dc:identifier for your own
> purposes. 

Well, I also mentioned that I really doubt that I'd be using dc:identifier
for those purposes within the JSPWiki metadata profile. I can also see
creating a suitable ID within our own namespace, but I really think
dc:identifier would suit fine. We'd not be abusing it at all.

>> Recommendation: Use DCTERMS.creator. The Atom specification seems to borrow
>> extensively from DC, with atom:author identical with the concept of
>> DC.creator (they apparently just didn't like the term 'creator' and
>> changed it to 'author'), but do use 'contributor' in the same manner
>> (again, paraphrasing the terminology from DC). This will need to occur
>> in all revisions since we need to maintain the original author ID
>> regardless of the existence of a given revision. [DC.creator]
>> STRING.
> 
> I seem to recall that dc requires a specific notation for the user
> data - which might be incompatible with what we have (essentially the
> uid).  It might be useful to provide a pseudo-property dc:creator
> which is constructed out of wiki:creator and UserDatabase data.

Not that I'm aware of. DC doesn't get into that kind of thing much
except when you get to things like dates.

>> Recommendation: Use DCTERMS.contributor.  The idea with DC.creator and
>> DC.contributor is that the former is the original creator (author) of
>> a resource, and any subsequent contributions (editing, translation, etc.)
>> are considered as being done by a 'contributor'. For the original author,
>> see wiki:creator (DC.creator) above. [DC.contributor]
> 
> I am not certain whether dc:contributor is syntactically okay.

It certainly suits the role of both dc:creator, editor, translator,
etc. (i.e., very general purpose), anyone who contributes to the
resource.

>> Recommendation: Use new application profile wiki:content. Question
>> as to binary stream? Not STRING?
> 
> JPEGs are badly presented as Strings.

Ah, yes.

>> Recommendation: Use DCTERMS.format. This is the term used to contain
>> a format identifier.  While I recognise that these discussions tend to
> 
> I would need to check if it's okay.

That one is pretty common.

>> devolve rather quickly, I would highly recommend considering the MIME
>> or Internet Media Type as "application/*" instead of "text/*", e.g.,
>> "application/x-wiki+jspwiki". In looking at the history of "text/html"
>> vs. "application/html" this would suggest that text formats that use
>> a significant amount of processing to perform rendering generally move
>> towards being considered more an application than a text format (i.e.,
>> that while they may be largely human readable they quickly become
>> indecipherable or largely unreadable in practice when used with plugins
>> and other complex syntax, e.g., many if not most pages on Wikipedia.
>> [DC.format] STRING.
> 
> I don't really know about this.  I don't care much.

It's a Big Deal for a lot of people, I probably don't care much either.
I use 'text/wiki' for general purpose wiki text and the application
one above to specifically tag JSPWiki wiki text.

>> Recommended: Use new application profile wiki:state. Enumerated value
>> set. Not labeled as BOOLEAN but seems to be.
> 
> Yes, boolean.
> 
>> In summary, while I see Atom as interesting and in large part semantically
>> compatible with Dublin Core, I think it'd be better to incorporate a
>> schema that was designed more specifically for resources than for feeds;
>> the definitions fit more closely with our usage.
> 
> I think we need to define the exact semantics of the properties we
> want to use, and then choose what is most appropriate - or define our own.

Yup.

I'm outa here... gotta run.

Murray


Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Janne Jalkanen <ja...@iki.fi>.

> Now, before getting into this too deeply it occurs to me that we might
> consider a pluggable meta API rather than single metadata schema. There

Um.  Pluggable?  No.  That'll create loads of problems.

User-access to the metadata?  Absolutely.   And I think that is what
you really mean - ability to add your own arbitrary metadata for any
Node.

> WorldCat), Dublin Core is used in almost the entirety of the world's
> libraries for lightweight interchangeable metadata and is compatible
> with and/or the basis of the designs used by the W3C and its "semantic
> web".

Semantic web is actually a load of bollocks.  But it has some nice
ideas.

> I note that many of the proposed field names come from Atom. While this
> is perhaps an appropriate usage, Atom is a syndication schema, not a
> content repository schema. There's not a huge difference and Atom is in
> large parts (semantically) compatible with and influenced by Dublin Core
> (e.g., choice of atom:creator). For documents stored in a repository I
> believe Dublin Core is likely more appropriate.

There are some reasons why I chose Atom identifiers; I was involved in
its definition (somewhat), and therefore I know some of the reasons
why Atom does not use Dublin Core.  Partly because some of the
definitions were a bit complicated.

> Historically, there are two Dublin Core schemas, DC.* and DCTERMS.*.
> The original core set (about a dozen) of Dublin Core Metadata Elements
> (DC.*) have been grandfathered into the set of DC Terms (DCTERMS, see
> footnote). For our purposes below, we can consider DC.* and DCTERMS.*
> as identical namespaces (they by definition now are).

I wasn't aware of dcterms.  Interesting.

>  * atom:updated As in RFC 4287. This is a DATE.
> 
> Recommendation: Use DCTERMS.modified. [DC.date or DC.date.modified]
> DATE.

The semantics of atom:updated and dcterms.modified differ - and I seem
to recall that that difference is minuscule, but actually very
important.  Can't dig up the reference now, will do later.

>  * atom:published As in RFC 4287. As JSPWiki does not yet support
>   "draft" -pages, this is essentially a creation date. NB: This cannot
>    be checked from page version #1, because that might be deleted.
>    This is a DATE.
> 
> Recommendation: Use DCTERMS.created. Agreed: this must be carried
> through all revisions since it provides a canonical container for the
> origin date of the document. [DC.date or DC.date.created]
> DATE.

Probably better.

>  * atom:id As in RFC 4287. This has some advantages, and can easily be
>    tied to the JCR jcr:uuid. This is a STRING.
> 
> Recommendation: Use DCTERMS.identifier. [DC.identifier]
>    STRING (URI?)

Nope.  Atom:id is a very, very useful construct.  As you mentioned in
the last email, you probably want to use dc:identifier for your own
purposes. 
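
One way the tie between atom:id and jcr:uuid mentioned in the quoted
proposal could look is sketched below in Java. This is an assumption, not
something decided in this thread: it simply wraps the node's UUID in a
urn:uuid: IRI, and the class and method names are hypothetical.

```java
import java.util.UUID;

/** Illustrative sketch: deriving a stable atom:id from a node's jcr:uuid value. */
public class AtomIdFromUuid {

    /** Builds a urn:uuid: IRI; the UUID is re-serialized so the result is always lowercase. */
    public static String atomIdFor(String jcrUuid) {
        UUID parsed = UUID.fromString(jcrUuid); // throws IllegalArgumentException if malformed
        return "urn:uuid:" + parsed.toString();
    }

    public static void main(String[] args) {
        // prints urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
        System.out.println(atomIdFor("F81D4FAE-7DEC-11D0-A765-00A0C91E6BF6"));
    }
}
```

Because the id is derived only from the UUID, it does not change when the
page is renamed, moved, or exported.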

> Recommendation: Use DCTERMS.creator. The Atom specification seems to borrow
> extensively from DC, with atom:author identical with the concept of
> DC.creator (they apparently just didn't like the term 'creator' and
> changed it to 'author'), but do use 'contributor' in the same manner
> (again, paraphrasing the terminology from DC). This will need to occur
> in all revisions since we need to maintain the original author ID
> regardless of the existence of a given revision. [DC.creator]
> STRING.

I seem to recall that dc requires a specific notation for the user
data - which might be incompatible with what we have (essentially the
uid).  It might be useful to provide a pseudo-property dc:creator
which is constructed out of wiki:creator and UserDatabase data.
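
A rough Java sketch of such a pseudo-property follows. The plain Map used
here merely stands in for UserDatabase data (uid to display name); it is
not the real JSPWiki interface, and the class and method names are
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch: a computed, read-only dc:creator built from wiki:creator. */
public class CreatorPseudoProperty {

    /**
     * Looks up the uid stored in wiki:creator and returns the matching display name,
     * falling back to the uid itself when the user is unknown.
     */
    public static String dcCreator(Map<String, String> metadata, Map<String, String> namesByUid) {
        String uid = metadata.get("wiki:creator");
        if (uid == null) {
            return null; // no creator recorded for this page
        }
        return namesByUid.getOrDefault(uid, uid);
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("wiki:creator", "jdoe");
        Map<String, String> namesByUid = new HashMap<>();
        namesByUid.put("jdoe", "Jane Doe");
        System.out.println(dcCreator(metadata, namesByUid)); // prints Jane Doe
    }
}
```

Only wiki:creator is stored; dc:creator is derived on read, so the two
cannot drift apart.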

> Recommendation: Use DCTERMS.contributor.  The idea with DC.creator and
> DC.contributor is that the former is the original creator (author) of
> a resource, and any subsequent contributions (editing, translation, etc.)
> are considered as being done by a 'contributor'. For the original author,
> see wiki:creator (DC.creator) above. [DC.contributor]

I am not certain whether dc:contributor is syntactically okay.

> Recommendation: Use new application profile wiki:content. Question
> as to binary stream? Not STRING?

JPEGs are badly presented as Strings.

> Recommendation: Use DCTERMS.format. This is the term used to contain
> a format identifier.  While I recognise that these discussions tend to

I would need to check if it's okay.

> devolve rather quickly, I would highly recommend considering the MIME
> or Internet Media Type as "application/*" instead of "text/*", e.g.,
> "application/x-wiki+jspwiki". In looking at the history of "text/html"
> vs. "application/html" this would suggest that text formats that use
> a significant amount of processing to perform rendering generally move
> towards being considered more an application than a text format (i.e.,
> that while they may be largely human readable they quickly become
> indecipherable or largely unreadable in practice when used with plugins
> and other complex syntax, e.g., many if not most pages on Wikipedia.
> [DC.format] STRING.

I don't really know about this.  I don't care much.

> Recommended: Use new application profile wiki:state. Enumerated value
> set. Not labeled as BOOLEAN but seems to be.

Yes, boolean.

> In summary, while I see Atom as interesting and in large part semantically
> compatible with Dublin Core, I think it'd be better to incorporate a
> schema that was designed more specifically for resources than for feeds;
> the definitions fit more closely with our usage.

I think we need to define the exact semantics of the properties we
want to use, and then choose what is most appropriate - or define our own.

I'll need to check dcterms, though.

/Janne

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> I'd really like that approach. That way it would be easy to add (and
> possibly edit) a picture's metadata, given in the various formats, from
> within the wiki. Using this, one could easily implement something like a
> 'picture management wiki', e.g. for collaborative tagging of holiday
> pictures with your pals.

Yes, developers will have full access to metadata (except some read- 
only properties, like jcr:uuid, for obvious reasons).

/Janne

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Murray Altheim <mu...@altheim.com>.

Fabian,

I myself have three separate and very different JSPWiki applications
currently in mind, all with very different requirements and different
metadata schemas. Something like a Flickr system is certainly the
type of thing I can also see JSPWiki providing, i.e., somebody would
extend the system (likely with a modified metadata schema, some
plugins and some JSP work) and have an entirely new application. I've
already got about 90% of a digital library application written in a
way that's compatible with JSPWiki as the front end. There are many
possibilities.

Cheers,

Murray

Fabian Haupt wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> +1
> 
> Hi,
> 
> I'd really like that approach. That way it would be easy to add (and
> possibly edit) a picture's metadata, given in the various formats, from
> within the wiki. Using this, one could easily implement something like a
> 'picture management wiki', e.g. for collaborative tagging of holiday
> pictures with your pals.
> 
> greets
> Fabian
> 
> 
> Murray Altheim wrote:
> | In looking at the 3.0 design document at
> |
> |    http://www.jspwiki.org/wiki/JSPWiki3Design
> |
> | I have some comments on the metadata plans. These comments are only
> | tentative.
> |
> | ----
> | ! Metadata Meta API
> |
> | Now, before getting into this too deeply it occurs to me that we might
> | consider a pluggable meta API rather than a single metadata schema. There
> | are likely a variety of different applications that JSPWiki may be
> | used within (simple wikis, embedded apps, hives, part of document mgmt
> | systems, etc.), and we likely also want scalability (i.e., in terms of
> | both simplicity/complexity and factors like page and revision count) in
> | our metadata just as we do in other areas. I don't think this sounds
> | particularly difficult if we're using a JSR-170 compliant repository:
> | there'd be a core set of metadata fields whose actual descriptors would
> | be assigned by the API implementation. If an application needed more
> | than that it'd be up to the implementation to define and handle (e.g.,
> | because the documents will be used within a more complex framework or
> | document management system having an existing schema). We'd simply be
> | creating the API and reference implementation.
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.7 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> 
> iD8DBQFHqBPJtC//DIQj2V8RAjoGAKCg+LCsTRc8VnGMvghNYR1HjrReRgCfVAv8
> 6ymmoWJV5B2SpcP+FzMFZxE=
> =gX1L
> -----END PGP SIGNATURE-----
> 


-- 

...........................................................................
Murray Altheim <murray07 at altheim.com>                           ===  = =
http://www.altheim.com/murray/                                     = =  ===
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk               = =  = =

       Boundless wind and moon - the eye within eyes,
       Inexhaustible heaven and earth - the light beyond light,
       The willow dark, the flower bright - ten thousand houses,
       Knock at any door - there's one who will respond.
                                       -- The Blue Cliff Record

Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Fabian Haupt <ka...@submerged-intelligence.de>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

+1

Hi,

I'd really like that approach. That way it would be easy to add (and
possibly edit) a picture's metadata, given in the various formats, from
within the wiki. Using this, one could easily implement something like a
'picture management wiki', e.g. for collaborative tagging of holiday
pictures with your pals.

greets
Fabian


Murray Altheim wrote:
| In looking at the 3.0 design document at
|
|    http://www.jspwiki.org/wiki/JSPWiki3Design
|
| I have some comments on the metadata plans. These comments are only
| tentative.
|
| ----
| ! Metadata Meta API
|
| Now, before getting into this too deeply it occurs to me that we might
| consider a pluggable meta API rather than a single metadata schema. There
| are likely a variety of different applications that JSPWiki may be
| used within (simple wikis, embedded apps, hives, part of document mgmt
| systems, etc.), and we likely also want scalability (i.e., in terms of
| both simplicity/complexity and factors like page and revision count) in
| our metadata just as we do in other areas. I don't think this sounds
| particularly difficult if we're using a JSR-170 compliant repository:
| there'd be a core set of metadata fields whose actual descriptors would
| be assigned by the API implementation. If an application needed more
| than that it'd be up to the implementation to define and handle (e.g.,
| because the documents will be used within a more complex framework or
| document management system having an existing schema). We'd simply be
| creating the API and reference implementation.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHqBPJtC//DIQj2V8RAjoGAKCg+LCsTRc8VnGMvghNYR1HjrReRgCfVAv8
6ymmoWJV5B2SpcP+FzMFZxE=
=gX1L
-----END PGP SIGNATURE-----

Metadata in 3.0 [Was: JSPWiki 3 design notes]

Posted by Murray Altheim <mu...@altheim.com>.

In looking at the 3.0 design document at

    http://www.jspwiki.org/wiki/JSPWiki3Design

I have some comments on the metadata plans. These comments are only
tentative.

----
! Metadata Meta API

Now, before getting into this too deeply it occurs to me that we might
consider a pluggable meta API rather than a single metadata schema. There
are likely a variety of different applications that JSPWiki may be
used within (simple wikis, embedded apps, hives, part of document mgmt
systems, etc.), and we likely also want scalability (i.e., in terms of
both simplicity/complexity and factors like page and revision count) in
our metadata just as we do in other areas. I don't think this sounds
particularly difficult if we're using a JSR-170 compliant repository:
there'd be a core set of metadata fields whose actual descriptors would
be assigned by the API implementation. If an application needed more
than that it'd be up to the implementation to define and handle (e.g.,
because the documents will be used within a more complex framework or
document management system having an existing schema). We'd simply be
creating the API and reference implementation.
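
As a rough illustration of the "core fields, implementation-assigned descriptors" idea, here is a minimal Java sketch. The type names (CoreField, MetadataSchema, TableSchema) are invented for this example and are not part of JSPWiki or JCR; the descriptor strings are the DCTERMS names recommended in this message and the names from the design draft.

```java
import java.util.EnumMap;
import java.util.Map;

/** Abstract core fields the wiki engine would ask for (illustrative names). */
enum CoreField { CREATED, MODIFIED, IDENTIFIER, CREATOR, CONTRIBUTOR, FORMAT }

/** A pluggable schema assigns the concrete descriptor for each core field. */
interface MetadataSchema {
    String descriptorFor(CoreField field);
}

/** Table-driven schema; an incomplete mapping is rejected at construction time. */
final class TableSchema implements MetadataSchema {
    private final Map<CoreField, String> table;

    TableSchema(Map<CoreField, String> mapping) {
        EnumMap<CoreField, String> copy = new EnumMap<>(CoreField.class);
        copy.putAll(mapping);
        for (CoreField f : CoreField.values()) {
            if (!copy.containsKey(f)) {
                throw new IllegalArgumentException("no descriptor for " + f);
            }
        }
        this.table = copy;
    }

    @Override
    public String descriptorFor(CoreField field) {
        return table.get(field);
    }

    /** Dublin Core flavoured mapping, as recommended in this message. */
    static MetadataSchema dublinCore() {
        Map<CoreField, String> m = new EnumMap<>(CoreField.class);
        m.put(CoreField.CREATED, "DCTERMS.created");
        m.put(CoreField.MODIFIED, "DCTERMS.modified");
        m.put(CoreField.IDENTIFIER, "DCTERMS.identifier");
        m.put(CoreField.CREATOR, "DCTERMS.creator");
        m.put(CoreField.CONTRIBUTOR, "DCTERMS.contributor");
        m.put(CoreField.FORMAT, "DCTERMS.format");
        return new TableSchema(m);
    }

    /** Mapping that uses the names from the 3.0 design draft. */
    static MetadataSchema designDraft() {
        Map<CoreField, String> m = new EnumMap<>(CoreField.class);
        m.put(CoreField.CREATED, "atom:published");
        m.put(CoreField.MODIFIED, "atom:updated");
        m.put(CoreField.IDENTIFIER, "atom:id");
        m.put(CoreField.CREATOR, "wiki:creator");
        m.put(CoreField.CONTRIBUTOR, "wiki:author");
        m.put(CoreField.FORMAT, "wiki:contentType");
        return new TableSchema(m);
    }
}
```

Engine code would only ever refer to CoreField values; which descriptor ends up in the repository would be a deployment decision made by picking a schema.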

----
! Recommendation

I agree that the schema for JSPWiki should use standards wherever
possible, and would advocate basing the reference implementation
of a metadata API on Dublin Core, given that it is the predominant
document metadata schema in use on the Web (either used directly or
heavily informing other schemas). Due to its origins in OCLC (publishers of
WorldCat), Dublin Core is used in almost the entirety of the world's
libraries for lightweight interchangeable metadata and is compatible
with and/or the basis of the designs used by the W3C and its "semantic
web".

When these terms don't suffice there are a variety of ways to extend
the set. An accepted way to do this is to create and publish (i.e.,
post on the web) an "application profile" for the local customisations
made. I am willing to both design and create the necessary documents
for a Dublin Core application profile for JSPWiki. Examples of these
documents (which are backed by an RDF document) are at

    http://dublincore.org/documents/2004/09/10/library-application-profile/
    http://www.natlib.govt.nz/dr/drterms.html
    http://www.natlib.govt.nz/dr/terms# (RDF document)

Note that there is no requirement that an application profile be
either submitted or approved by the DCMI. It's just playing nicely
by the rules to do so. For our purposes it'd just be a published
web page plus a static RDF document.

Below are the name and online comments for each proposed term, followed by
my comments and/or recommendation for the term to be used in 3.0. I've
sorted the list to begin with those terms that can be supported directly
by the existing Dublin Core terms, followed by a set of terms to be
defined within a JSPWiki application profile. Within the profile would
be references to equivalent terms in other schemas where available and
appropriate.

I note that many of the proposed field names come from Atom. While this
is perhaps an appropriate usage, Atom is a syndication schema, not a
content repository schema. There's not a huge difference and Atom is in
large parts (semantically) compatible with and influenced by Dublin Core
(e.g., choice of atom:creator). For documents stored in a repository I
believe Dublin Core is likely more appropriate.

----
! Historical Note

Historically, there are two Dublin Core schemas, DC.* and DCTERMS.*.
The original core set (about a dozen) of Dublin Core Metadata Elements
(DC.*) have been grandfathered into the set of DC Terms (DCTERMS, see
footnote). For our purposes below, we can consider DC.* and DCTERMS.*
as identical namespaces (they by definition now are).

There used to be a qualification scheme whereby e.g., DC.date could be
qualified as DC.date.modified, but this has been dropped in favour of
having most of these qualified terms become full terms in their own right
within the DCTERMS namespace. Where they exist, I've included the DC.*
term or qualified term in square brackets below.

--------------

  * atom:updated As in RFC 4287. This is a DATE.

Recommendation: Use DCTERMS.modified. [DC.date or DC.date.modified]
DATE.

  * atom:published As in RFC 4287. As JSPWiki does not yet support
   "draft" -pages, this is essentially a creation date. NB: This cannot
    be checked from page version #1, because that might be deleted.
    This is a DATE.

Recommendation: Use DCTERMS.created. Agreed: this must be carried
through all revisions since it provides a canonical container for the
origin date of the document. [DC.date or DC.date.created]
DATE.

  * atom:id As in RFC 4287. This has some advantages, and can easily be
    tied to the JCR jcr:uuid. This is a STRING.

Recommendation: Use DCTERMS.identifier. [DC.identifier]
    STRING (URI?)

  * wiki:creator As in atom:published, the creator probably needs to
    be stored separately. Though on wikipages it might not be that useful.
    This is TBD.

Recommendation: Use DCTERMS.creator. The Atom specification seems to borrow
extensively from DC, with atom:author identical to the concept of
DC.creator (they apparently just didn't like the term 'creator' and
changed it to 'author'), while using 'contributor' in the same manner
(again, paraphrasing the terminology from DC). This will need to occur
in all revisions since we need to maintain the original author ID
regardless of the existence of a given revision. [DC.creator]
STRING.

  * wiki:author Denotes the Identity of the user who saved this version
    of the page. This should probably be a reference to the user identity.
    It should also have a useful value in case the modification is done
    by the system automatically. This value should never be anything
    meaningless - in fact, I think that PageManager should throw an Exception
    if there is a missing attribute when saved. This is TBD.

Recommendation: Use DCTERMS.contributor.  The idea with DC.creator and
DC.contributor is that the former is the original creator (author) of
a resource, and any subsequent contributions (editing, translation, etc.)
are considered as being done by a 'contributor'. For the original author,
see wiki:creator (DC.creator) above. [DC.contributor]
STRING.

  * wiki:ipaddr The IP address where the last change occurred. The
    SpamFilter might then add some additional tags (in its own namespace).
    This is a STRING.

Recommendation: Use new application profile wiki:ipaddr. STRING.

  * wiki:content The actual content as a binary stream (BINARY)

Recommendation: Use new application profile wiki:content. Question
as to binary stream? Not STRING?

  * wiki:contentType The MIME type of the content. JSPWiki markup shall be
    denoted as "text/x-wiki.jspwiki". Creole as "text/x-wiki.creole".
    Other types are also allowed, e.g. "text/html" or "image/jpeg".

Recommendation: Use DCTERMS.format. This is the term used to contain
a format identifier.  While I recognise that these discussions tend to
devolve rather quickly, I would highly recommend considering the MIME
or Internet Media Type as "application/*" instead of "text/*", e.g.,
"application/x-wiki+jspwiki". In looking at the history of "text/html"
vs. "application/html" this would suggest that text formats that use
a significant amount of processing to perform rendering generally move
towards being considered more an application than a text format (i.e.,
that while they may be largely human readable they quickly become
indecipherable or largely unreadable in practice when used with plugins
and other complex syntax, e.g., many if not most pages on Wikipedia).
[DC.format] STRING.

  * wiki:acl The access control list for this page. Format TBD.

Recommendation: Use new application profile wiki:acl. TBD.

  * wiki:changenote A simple, text/plain description of the note of the
    change. STRING.

Recommendation: Use new application profile wiki:changenote. The way
to do this in Dublin Core would likely be considered too complicated
for this application. The change note needs to be considered as
metadata of the revision, not the document.
STRING.

  * wiki:state Essentially an Enum defining the state of the page. Can
    be EXISTS or DELETED. Format TBD.

Recommended: Use new application profile wiki:state. Enumerated value
set. STRING.

  * wiki:minorchange This change is minor, and should not be shown in
    the changelog, though an actual change has been made.

Recommended: Use new application profile wiki:minorchange. Enumerated
value set. Not labeled as BOOLEAN but seems to be.
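
For the profile-specific terms above, a small Java sketch of how the constraints might be enforced at save time. The class and method names are made up for illustration; defaulting wiki:state to EXISTS when it is absent is an assumption of this sketch, not something the design document states.

```java
import java.util.Map;

/** The two page states named in the design draft. */
enum PageState { EXISTS, DELETED }

/** Hypothetical save-time check for revision metadata. */
final class RevisionMetadataCheck {
    private RevisionMetadataCheck() {}

    /**
     * Rejects a revision whose wiki:author is missing or blank and returns the
     * parsed wiki:state. An absent state is treated as EXISTS (assumption).
     */
    static PageState validate(Map<String, String> props) {
        String author = props.get("wiki:author");
        if (author == null || author.trim().isEmpty()) {
            throw new IllegalStateException("wiki:author must be set on every saved revision");
        }
        String state = props.get("wiki:state");
        if (state == null) {
            return PageState.EXISTS;
        }
        // valueOf throws IllegalArgumentException for anything outside the enumerated set
        return PageState.valueOf(state);
    }

    /** wiki:minorchange read as a plain boolean; a missing value means "not minor". */
    static boolean isMinorChange(Map<String, String> props) {
        return Boolean.parseBoolean(props.get("wiki:minorchange"));
    }
}
```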

----

In summary, while I see Atom as interesting and in large part semantically
compatible with Dublin Core, I think it'd be better to incorporate a
schema that was designed more specifically for resources than for feeds;
the definitions fit more closely with our usage.

As I mentioned above I consider these merely comments-on-the-path.

Murray


DCTERMS. The set of Dublin Core terms is found at
    http://dublincore.org/documents/dcmi-terms/

Re: JCR, was: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> have I understood it right, that you want to replace the whole
> storage backend by a JCR solution, meaning JSPWiki won't be able to
> save its data directly to the file system, a database or anything else
> but a Java Content Repository any more? ("Directly" because it's of
> course up to the JCR server where to store the data then.)

Essentially, yes.  JCR is pretty cool, even if you don't use all of
the features.

The long story is that there *is* a nice strategy that'll allow us to
lose none of the advantages of the current file-based storage.  But I
haven't written it down yet.

> When it comes to versioning, I'd like to remind you of JSPWIKI-110,
> time machine.

Yup.  It's on my mind...

/Janne

JCR, was: JSPWiki 3 design notes

Posted by Florian Holeczek <fl...@holeczek.de>.

Hi Janne,

have I understood it right, that you want to replace the whole
storage backend by a JCR solution, meaning JSPWiki won't be able to
save its data directly to the file system, a database or anything else
but a Java Content Repository any more? ("Directly" because it's of
course up to the JCR server where to store the data then.)

When it comes to versioning, I'd like to remind you of JSPWIKI-110,
time machine.

Regards,
 Florian

Original message from 03.02.2008 at 23:05:
> Hi folks!

> I added my own design notes to the following page, detailing some of  
> the changes that should go into JSPWiki version 3 (so that we can  
> plan 2.8 with all this stuff in mind).

> http://www.jspwiki.org/wiki/JSPWiki3Design

> Please discuss this on the mailing list instead of the wiki page, as  
> mailing lists are inherently better for discussion than wikipages.   
> Let's keep that page in DocumentMode, ok?

> /Janne

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> Which are the other bad sides you've been talking of?

Mostly relating to the page lifecycle; version history would be a bit  
strange, because you would need to somehow version the EOL markers as  
well.

Also, it might result in a lot of test or crap pages in the  
repository (can you imagine how many different spellings of  
"BuyViagra" pages we would have? ;-).  So you would need a "permanent  
delete" option for that stuff as well.

Pages themselves are lightweight objects, but having a zillion MP3
files in various states of deletion in the repository takes up a bit
too much space.

/Janne

Re: JSPWiki 3 design notes

Posted by Florian Holeczek <fl...@holeczek.de>.

Hi Janne,

>> Another idea: Why not see page deletion as an "end of life" marker
>> in the page history? Let a page get a life cycle (create, change,
>> delete).

> That's what is currently in the document. It has a couple of
> obvious downsides, such as: what happens if someone recreates
> the page? Your versioning would not start from one, but from 2389 ;-)

Thinking about this (and generally the whole versioning and linking
thing), I came to the conclusion that the basic problem lies in the
wiki way itself: using page names as identifiers.

We wouldn't have any problems if there was another identifier (JCR has
its UUIDs) and the page name was only one of several metadata items
(which, BTW, would be localizable then).

I think that, although they are restrictions, things like using page
names as identifiers are simply how a wiki works (as opposed to, e.g.,
a CMS) and should be left this way.

One could however increase usability via workarounds, e.g. in this
case, adding an internal life cycle count to the version number
(1.2389 - deletion - recreation - 2.1).
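
A minimal sketch of such a two-part version label, in Java. The class name LifeCycleVersion is invented for this example; it only models the numbering, not how deletion would actually be stored.

```java
/** Immutable "lifeCycle.revision" version label, e.g. 1.2389 or 2.1. */
final class LifeCycleVersion {
    final int lifeCycle;
    final int revision;

    LifeCycleVersion(int lifeCycle, int revision) {
        if (lifeCycle < 1 || revision < 1) {
            throw new IllegalArgumentException("both counters start at 1");
        }
        this.lifeCycle = lifeCycle;
        this.revision = revision;
    }

    /** Version of a page that has just been created for the first time. */
    static LifeCycleVersion initial() {
        return new LifeCycleVersion(1, 1);
    }

    /** An ordinary edit bumps only the revision counter. */
    LifeCycleVersion afterEdit() {
        return new LifeCycleVersion(lifeCycle, revision + 1);
    }

    /** Recreating a deleted page starts a new life cycle at revision 1. */
    LifeCycleVersion afterRecreation() {
        return new LifeCycleVersion(lifeCycle + 1, 1);
    }

    @Override
    public String toString() {
        return lifeCycle + "." + revision;
    }
}
```

With this, a page at 1.2389 that is deleted and then recreated comes back as 2.1, and the old history remains addressable under life cycle 1.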

Which are the other bad sides you've been talking of?

Regards,
 Florian

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> Another idea: Why not see page deletion as an "end of life" marker in
> the page history? Let a page get a life cycle (create, change,
> delete).

That's what is currently in the document.  It has a couple of
obvious downsides, such as: what happens if someone recreates
the page?  Your versioning would not start from one, but from 2389 ;-)

/Janne

Re: JSPWiki 3 design notes

Posted by Florian Holeczek <fl...@holeczek.de>.

>> "Page deletion"
>>
>> Why not move the page to a "Trash" WorkSpace, so the user can go there
>> and recover a page, or empty the Trash?

> This is what TWiki does.  It has a couple of problems:

> 1) It needs specific handling with respect to moving and renaming -  
> the system needs to know that Trash is a special WikiSpace.
> 2) If two persons from different WikiSpaces move a similarly named  
> page to the Trash, then there's a problem.  Or, the trash is not  
> emptied, and someone recreates the page and moves it again to the  
> trash.  It becomes... problematic.  Though, actually there is also a  
> problem if a page is marked deleted, but it gets recreated - is the  
> old one brought back or what?

> Maybe a WikiSpace-specific wiki:Trash space might be the best  
> solution.  And an admin function to empty it.

Another idea: Why not see page deletion as an "end of life" marker in
the page history? Let a page get a life cycle (create, change,
delete).

Regards,
 Florian

Re: JSPWiki 3 design notes

Posted by Florian Holeczek <fl...@holeczek.de>.

>> I am not sure if this question is in the right place here, but will
>> it be possible to make JSPWiki more scalable with this new design?
>> By that I mean running multiple instances (JVMs) of JSPWiki
>> against one shared repository.

> Short answer: yes.

A nice overview of different deployment models of Apache Jackrabbit (a
JCR implementation) can be found at
http://jackrabbit.apache.org/doc/deploy.html

Regards,
 Florian

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> I am not sure if this question is in the right place here, but will
> it be possible to make JSPWiki more scalable with this new design?
> By that I mean running multiple instances (JVMs) of JSPWiki
> against one shared repository.

Short answer: yes.

> The current version of JSPWiki does not offer that scalability, for
> instance because of the ReferenceManager, at least not with a
> file-based repository. Running a wiki with hundreds of thousands or
> even millions of pages would (I guess) not be possible right now.
> Startup times would probably take too long.

Some people *are* doing it, but I believe they have turned off ACL  
scanning (which is the big timewaster right now).

/Janne

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> The biggest pain point, I suspect, is that all of the pages need to
> be examined at startup time. That's an expensive operation. Similar
> to how we run the Lucene indexer in a background thread, an easy  
> way to speed up startup would be to initialize the ReferenceManager  
> that way, too.

ReferenceManager actually caches its contents to the workdir (the  
refmgr.ser), so only the first startup is slow; for the rest we just  
deserialize the data.  Variables are also cached.

However, ACLs are not.  And since we do scan through all of the  
pages, the missing ACLs cause a refresh of the entire page.  This  
would be a relatively low-hanging fruit to grab.
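
The "scan once, then deserialize" pattern described above could look roughly like this in Java. This is only a sketch under assumptions: StartupCache is a made-up name, the cache is a plain HashMap of page name to ACL text, and none of it reflects how ReferenceManager actually writes refmgr.ser.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.function.Supplier;

/** Hypothetical startup cache: deserialize if a cache file exists, else scan once and save. */
final class StartupCache {
    private StartupCache() {}

    @SuppressWarnings("unchecked")
    static HashMap<String, String> loadOrScan(Path cacheFile,
                                              Supplier<HashMap<String, String>> slowScan) {
        if (Files.exists(cacheFile)) {
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(cacheFile))) {
                return (HashMap<String, String>) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                // A corrupt or outdated cache is not fatal; fall through to a full rescan
            }
        }
        HashMap<String, String> fresh = slowScan.get();
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(cacheFile))) {
            out.writeObject(fresh);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fresh;
    }
}
```

A real version would also need to invalidate the cache when pages change while the wiki is not running, which this sketch leaves out.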

> However, that's probably not the whole answer. It seems to me that  
> references (what links to Page X, and what does Page X link to) are
> also something that should properly be stored as page metadata. So,  
> that's probably part of the solution too. Certainly, part of the  
> plan would be to enable deployers to share page repositories, and  
> references would be part of that.

Yes, it would make sense to store this as page metadata.
However, references are not the only problem.

/Janne

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

Just this mailing list thread.  At least I'm not aware of other places.

/Janne

On 13 Feb 2008, at 20:26, Florian Holeczek wrote:

> Hi committers,
>
>> There are a bunch of considerations here...
>
> where does this discussion take place, apart from the wiki pages and
> this mailing list thread?
>
> Regards,
>  Florian

Re: JSPWiki 3 design notes

Posted by Florian Holeczek <fl...@holeczek.de>.

Hi committers,

> There are a bunch of considerations here...

where does this discussion take place, apart from the wiki pages and
this mailing list thread?

Regards,
 Florian

Re: JSPWiki 3 design notes

Posted by Andrew Jaquith <an...@mac.com>.

There are a bunch of considerations here...

The biggest pain point, I suspect, is that all of the pages need to be
examined at startup time. That's an expensive operation. Similar to  
how we run the Lucene indexer in a background thread, an easy way to  
speed up startup would be to initialize the ReferenceManager that way,  
too.
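
A minimal sketch of that background-initialization idea in Java. BackgroundReferenceIndex is a made-up class, not the real ReferenceManager, and the map of outgoing links passed to the constructor merely stands in for the expensive scan over all pages.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical reference index that is built off the startup thread. */
final class BackgroundReferenceIndex {
    // page name -> names of the pages that link to it
    private final Map<String, Set<String>> referrers = new ConcurrentHashMap<>();
    private final CompletableFuture<Void> ready;

    /** @param outgoingLinks page name -> pages it links to (stand-in for a full page scan) */
    BackgroundReferenceIndex(Map<String, List<String>> outgoingLinks) {
        // runAsync hands the work to the common pool, so the constructor returns at once
        this.ready = CompletableFuture.runAsync(() -> build(outgoingLinks));
    }

    private void build(Map<String, List<String>> outgoingLinks) {
        outgoingLinks.forEach((page, targets) -> {
            for (String target : targets) {
                referrers.computeIfAbsent(target, k -> ConcurrentHashMap.newKeySet()).add(page);
            }
        });
    }

    /** True once the background scan has finished. */
    boolean isReady() {
        return ready.isDone();
    }

    /** Waits for the scan to finish, then returns a sorted copy of the referrers. */
    Set<String> referrersOf(String page) {
        ready.join();
        return new TreeSet<>(referrers.getOrDefault(page, Collections.<String>emptySet()));
    }
}
```

The trade-off is that the first "what links here" query after startup may block until the scan is done, whereas startup itself no longer waits for it.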

However, that's probably not the whole answer. It seems to me that  
references (what links to Page X, and what does Page X link to) are
also something that should properly be stored as page metadata. So,  
that's probably part of the solution too. Certainly, part of the plan  
would be to enable deployers to share page repositories, and  
references would be part of that.

That's just my $0.02. I am not the author of ReferenceManager, and am  
not too familiar with the code (although I've hacked it slightly to  
get it working with my 3.0 branch).

Andrew

On Feb 13, 2008, at 12:32 PM, Harry Metske wrote:

> I am not sure if this question is in the right place here, but will
> it be possible to make JSPWiki more scalable with this new design?
> By that I mean running multiple instances (JVMs) of JSPWiki
> against one shared repository.
>
> The current version of JSPWiki does not offer that scalability, for
> instance because of the ReferenceManager, at least not with a
> file-based repository. Running a wiki with hundreds of thousands or
> even millions of pages would (I guess) not be possible right now.
> Startup times would probably take too long.
>
> regards,
> Harry Metske
>
>
> 2008/2/7, Janne Jalkanen <Ja...@ecyrd.com>:
>>
>>> Wow. That sounds pretty damn useful for JSPWiki and beyond. I shared
>>> Murray's concern about total dependence on JCR too, but if a
>>> light-weight file-based JCR implementation is feasible, that
>>> definitely changes my thinking.
>>
>> Yes, it is very feasible - as long as you don't try to shoot for
>> total JSR-170 compliance.  Some things, like XPath queries or SQL
>> queries can be a bit too complicated, though.  But if we keep the
>> feature set that we need relatively clean, and upgrade gradually,
>> then I think it all should work just nicely.
>>
>> /Janne
>>
>
>
>
> -- 
> Kind regards,
> Harry Metske
> Telnr. +31-548-512395
> Mobile +31-6-51898081

Re: JSPWiki 3 design notes

Posted by Harry Metske <ha...@gmail.com>.

I am not sure if this question is in the right place here, but will it be
possible to make JSPWiki more scalable with this new design?
By that I mean running multiple instances (JVMs) of JSPWiki against one
shared repository.

The current version of JSPWiki does not offer that scalability, for instance
because of the ReferenceManager, at least not with a file-based repository.
Running a wiki with hundreds of thousands or even millions of pages would (I
guess) not be possible right now. Startup times would probably take too
long.

regards,
Harry Metske


2008/2/7, Janne Jalkanen <Ja...@ecyrd.com>:
>
> > Wow. That sounds pretty damn useful for JSPWiki and beyond. I shared
> > Murray's concern about total dependence on JCR too, but if a
> > light-weight file-based JCR implementation is feasible, that
> > definitely changes my thinking.
>
> Yes, it is very feasible - as long as you don't try to shoot for
> total JSR-170 compliance.  Some things, like XPath queries or SQL
> queries can be a bit too complicated, though.  But if we keep the
> feature set that we need relatively clean, and upgrade gradually,
> then I think it all should work just nicely.
>
> /Janne
>



-- 
Kind regards,
Harry Metske
Telnr. +31-548-512395
Mobile +31-6-51898081

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

The JDBCProvider could be relatively easily ported to the new API, or  
you could just switch to Jackrabbit and their JDBC provider.

I think you will like it.  The current FileSystemProvider is roughly  
the same complexity as the old one, and *that* has to figure out how  
to store arbitrary metadata.

There needs to be a migration tool though.  I'm *hoping* that the new  
providers are able to do it relatively easily with little or no user  
config.

/Janne

On 5 Feb 2008, at 18:49, Terry Steichen wrote:

> Janne,
>
> What would that do to those of us using the JDBCPageProvider (and the
> thousands of pages implemented via that WikiPageProvider)?
>
> Terry
>
> PS: Sorry, but I'm also beginning to wonder if these grand and  
> glorious
> plans aren't taking JSPWiki in a direction that will drastically alter
> the characteristics that attracted me to it from the outset.
>
>
> On Tue, 2008-02-05 at 18:12 +0200, Janne Jalkanen wrote:
>
>>> So you don't see any way of using a JSPWiki 3.0 implementation
>>> *without* JSR-170?
>>
>> Exactly.  It would be duplication of work.  And mostly really stupid
>> work, too, since it would mean reinventing the JSR-170 concepts.
>>
>>> I'm rather surprised, really. One of the real
>>> strengths of JSPWiki is that there's a nice, lightweight file
>>> system implementation too.
>>
>> The job of the lightweight file system implementation is the job of
>> the backend, in this case, JSR-170.  It makes a lot of sense to
>> separate backend (i.e. storage) under a separate API, and where we
>> now use the WikiPageProvider, we can get far better support by using
>> JCR.
>>
>>> If the entry ramp is a complex database
>>
>> Nobody said anything about complex databases.
>>
>> I have, over the past year, been writing a lightweight implementation
>> of JSR-170, which uses a very similar pluggable provider system like
>> the current WikiPageProvider.  And yes, it ships with a lightweight
>> file system provider as well.  And no, it does not pass the TCK yet.
>> And yes, I was planning to offer it as the default JCR Repository for
>> JSPWiki 3, and yes, users who need HA or scalability can then switch
>> to Jackrabbit at the flick of a switch.
>>
>> Murray, calm down.  I wouldn't want to throw away the advantages of
>> JSPWiki, and I also still do not particularly like databases.
>>
>> /Janne

Re: JSPWiki 3 design notes

Posted by Terry Steichen <te...@net-frame.com>.

Janne,

What would that do to those of us using the JDBCPageProvider (and the
thousands of pages implemented via that WikiPageProvider)?

Terry

PS: Sorry, but I'm also beginning to wonder if these grand and glorious
plans aren't taking JSPWiki in a direction that will drastically alter
the characteristics that attracted me to it from the outset.


On Tue, 2008-02-05 at 18:12 +0200, Janne Jalkanen wrote:

> > So you don't see any way of using a JSPWiki 3.0 implementation
> > *without* JSR-170?
> 
> Exactly.  It would be duplication of work.  And mostly really stupid  
> work, too, since it would mean reinventing the JSR-170 concepts.
> 
> > I'm rather surprised, really. One of the real
> > strengths of JSPWiki is that there's a nice, lightweight file
> > system implementation too.
> 
> The job of the lightweight file system implementation is the job of  
> the backend, in this case, JSR-170.  It makes a lot of sense to  
> separate backend (i.e. storage) under a separate API, and where we  
> now use the WikiPageProvider, we can get far better support by using  
> JCR.
> 
> > If the entry ramp is a complex database
> 
> Nobody said anything about complex databases.
> 
> I have, over the past year, been writing a lightweight implementation  
> of JSR-170, which uses a very similar pluggable provider system like  
> the current WikiPageProvider.  And yes, it ships with a lightweight  
> file system provider as well.  And no, it does not pass the TCK yet.   
> And yes, I was planning to offer it as the default JCR Repository for  
> JSPWiki 3, and yes, users who need HA or scalability can then switch  
> to Jackrabbit at the flick of a switch.
> 
> Murray, calm down.  I wouldn't want to throw away the advantages of  
> JSPWiki, and I also still do not particularly like databases.
> 
> /Janne

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> Wow. That sounds pretty damn useful for JSPWiki and beyond. I shared
> Murray's concern about total dependence on JCR too, but if a
> light-weight file-based JCR implementation is feasible, that
> definitely changes my thinking.

Yes, it is very feasible - as long as you don't try to shoot for  
total JSR-170 compliance.  Some things, like XPath queries or SQL  
queries can be a bit too complicated, though.  But if we keep the  
feature set that we need relatively clean, and upgrade gradually,  
then I think it should all work nicely.

/Janne

Re: JSPWiki 3 design notes

Posted by Dave <sn...@gmail.com>.

On Feb 5, 2008 11:12 AM, Janne Jalkanen <Ja...@ecyrd.com> wrote:
> I have, over the past year, been writing a lightweight implementation
> of JSR-170, which uses a pluggable provider system very similar to
> the current WikiPageProvider.  And yes, it ships with a lightweight
> file system provider as well.  And no, it does not pass the TCK yet.
>
> And yes, I was planning to offer it as the default JCR Repository for
> JSPWiki 3, and yes, users who need HA or scalability can then switch
> to Jackrabbit at the flick of a switch.

Wow. That sounds pretty damn useful for JSPWiki and beyond. I shared
Murray's concern about total dependence on JCR too, but if a
light-weight file-based JCR implementation is feasible, that
definitely changes my thinking.

- Dave

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> So you don't see any way of using a JSPWiki 3.0 implementation
> *without* JSR-170?

Exactly.  It would be duplication of work.  And mostly really stupid  
work, too, since it would mean reinventing the JSR-170 concepts.

> I'm rather surprised, really. One of the real
> strengths of JSPWiki is that there's a nice, lightweight file
> system implementation too.

The job of the lightweight file system implementation is the job of  
the backend, in this case, JSR-170.  It makes a lot of sense to  
separate backend (i.e. storage) under a separate API, and where we  
now use the WikiPageProvider, we can get far better support by using  
JCR.

> If the entry ramp is a complex database

Nobody said anything about complex databases.

I have, over the past year, been writing a lightweight implementation  
of JSR-170, which uses a pluggable provider system very similar to  
the current WikiPageProvider.  And yes, it ships with a lightweight  
file system provider as well.  And no, it does not pass the TCK yet.   
And yes, I was planning to offer it as the default JCR Repository for  
JSPWiki 3, and yes, users who need HA or scalability can then switch  
to Jackrabbit at the flick of a switch.

Murray, calm down.  I wouldn't want to throw away the advantages of  
JSPWiki, and I also still do not particularly like databases.

/Janne

Re: JSPWiki 3 design notes

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
>> Well, if by jcr:uuid you mean a canonical, unique identifier for the
>> resource, I've got that as an 'oid' (object id), created by a 10ms
>> delayed, non-repeating (timestamped) ID factory. The system identifier
> 
> Check out the JCR spec on this one.  It's a globally unique ID of a
> Node. 
> 
>> If you're talking about the prototype we've discussed, I mean more
>> information about the ability to continue to use non-JSR 170 backends,
>> as we may need to. We may have a very strong need to if we end up
>> backing the system into a CMS (which I wouldn't particularly like but
>> it might not work with our archiving system otherwise, dunno yet).
> 
> I think the chances of using a non-JCR backend are essentially zero,
> because it would mean duplicating the entire metadata layer and
> essentially redesigning JSR-170.

So you don't see any way of using a JSPWiki 3.0 implementation
*without* JSR-170? I'm rather surprised, really. One of the real
strengths of JSPWiki is that there's a nice, lightweight file
system implementation too. If the entry ramp is a complex database
that'll likely have huge significance on its use. For example, I've
been working on a lightweight wiki for the OLPC XO (kids') laptop.
It runs fine using the file system but could *never* use a complex
database since the whole shebang must fit into memory.

> But this is a very long topic and I don't have more time...

Agreed -- right now neither do I. But we're not necessarily in a
rush either...

Murray

...........................................................................
Murray Altheim <murray07 at altheim.com>                           ===  = =
http://www.altheim.com/murray/                                     = =  ===
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk               = =  = =

       Boundless wind and moon - the eye within eyes,
       Inexhaustible heaven and earth - the light beyond light,
       The willow dark, the flower bright - ten thousand houses,
       Knock at any door - there's one who will respond.
                                       -- The Blue Cliff Record

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <ja...@iki.fi>.

> Well, if by jcr:uuid you mean a canonical, unique identifier for the
> resource, I've got that as an 'oid' (object id), created by a 10ms
> delayed, non-repeating (timestamped) ID factory. The system identifier

Check out the JCR spec on this one.  It's a globally unique ID of a
Node. 

> If you're talking about the prototype we've discussed, I mean more
> information about the ability to continue to use non-JSR 170 backends,
> as we may need to. We may have a very strong need to if we end up
> backing the system into a CMS (which I wouldn't particularly like but
> it might not work with our archiving system otherwise, dunno yet).

I think the chances of using a non-JCR backend are essentially zero,
because it would mean duplicating the entire metadata layer and
essentially redesigning JSR-170.

But this is a very long topic and I don't have more time...

/Janne

Re: JSPWiki 3 design notes

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
>> You don't really need that though. The system identifier for the page
>> should include its full path. In my systems I use a collection/document
> 
> Actually, I disagree.  This is problematic, if you move the page
> around.  Therefore, it's better to use a synthetic property like
> jcr:uuid, which does not necessarily bear any relation to the path of
> the object.

Well, if by jcr:uuid you mean a canonical, unique identifier for the
resource, I've got that as an 'oid' (object id), created by a 10ms
delayed, non-repeating (timestamped) ID factory. The system identifier
is the *current* location of the page. When moved to the trash the
system ID is transferred to DC.source, and if restored from the trash
its default location is recovered from DC.source. The uuid/oid is
only useful in determining the canonical identifier for the resource,
which is a kind of name, not an address.
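
For illustration, a minimal sketch of what a non-repeating, timestamp-based
ID factory could look like in Java. The class name OidFactory and its next()
method are hypothetical, and instead of a 10ms delay this version simply
bumps the value by one when two calls land in the same millisecond. It is an
assumption about the general idea, not the actual factory described above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: hands out timestamp-based object ids that never repeat. */
final class OidFactory {
    // Last value handed out; starts below any real timestamp.
    private final AtomicLong last = new AtomicLong(0L);

    /** Returns the current time in milliseconds, bumped if needed so ids strictly increase. */
    long next() {
        return last.updateAndGet(prev -> Math.max(prev + 1, System.currentTimeMillis()));
    }
}
```

Because each value is strictly greater than the previous one, the oid can
serve as a stable name for the resource no matter where the page currently
lives.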

>> In my system the page revisions are stored within the document record.
> 
> It is useful to look at Mediawiki's storage model; they have a
> separate table for archived versions for optimization purposes.  This
> suggests it might make sense to keep the version history outside of
> the actual Node.  (Which is, incidentally what jcr does as well - the
> version history is exposed under /jcr:system/jcr:versionStorage)

Yes, I'm aware of that. This was a design decision based on portability
of the document and revisions as a package.

>> a new field (within my own application profile) for 'system id' and
>> use DC.identifier for things like ISBNs, since my metadata records can
>> sometimes refer to a physical or digital resource that isn't stored
>> directly in the system. That's probably not an issue here either.
> 
> I think that this suggests that we should keep our own identifier
> system and leave dc:identifier for user use.

Only if we plan to use DC.identifier (or okay, dc:identifier) for things
like metadata records of external resources. It's fine for JSPWiki, as
that's its intended purpose, i.e., an identifier for a resource/wiki
page.

>> I must admit I'm a bit concerned that the 3.0 design might lock out a
>> large number of existing backend implementations (including my own) with
>> a requirement to use the JSR-170 backend, since currently there is a
>> variety of options, and we might assume we don't even know about the
>> custom ones that people have created. I'm working on a major project
>> that *might* migrate to JackRabbit but might not. If not, I'll still
>> need a way to use 3.0 with our systems. The XNodeProvider (Berkeley JE
>> based) is currently our way forward and I'm not in a position to scrap
>> that. So if there are plans to provide a non-JSR-170 backend -- in other
>> words we'll still have a WikiProvider API -- I'd like to know more about
>> those plans.
> 
> Well, yes, there is, and I've told you about it already ;-)

If you're talking about the prototype we've discussed, I mean more
information about the ability to continue to use non-JSR 170 backends,
as we may need to. We may have a very strong need to if we end up
backing the system into a CMS (which I wouldn't particularly like but
it might not work with our archiving system otherwise, dunno yet).

Murray

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <ja...@iki.fi>.

> You don't really need that though. The system identifier for the page
> should include its full path. In my systems I use a collection/document

Actually, I disagree.  This is problematic, if you move the page
around.  Therefore, it's better to use a synthetic property like
jcr:uuid, which does not necessarily bear any relation to the path of
the object.
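
A minimal sketch of the idea in plain Java, without the JCR API: the page is
keyed by a synthetic id, and the path is just a property that can change.
The class PageIndex and its methods are hypothetical names made up for this
example, and java.util.UUID merely stands in for whatever the repository
would store in jcr:uuid.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch: pages are keyed by a synthetic id; the path is a mutable property. */
final class PageIndex {
    private final Map<UUID, String> pathById = new HashMap<>();

    /** Registers a page at the given path and returns its synthetic id. */
    UUID create(String path) {
        UUID id = UUID.randomUUID();
        pathById.put(id, path);
        return id;
    }

    /** Moves a page; the id, and anything that refers to it, stays valid. */
    void move(UUID id, String newPath) {
        if (pathById.replace(id, newPath) == null) {
            throw new IllegalArgumentException("unknown page id: " + id);
        }
    }

    /** Returns the current path, or null if the id is unknown. */
    String pathOf(UUID id) {
        return pathById.get(id);
    }
}
```

Anything that stored the id keeps working after a move or rename, which is
the point of not tying the identifier to the path.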

> In my system the page revisions are stored within the document record.

It is useful to look at Mediawiki's storage model; they have a
separate table for archived versions for optimization purposes.  This
suggests it might make sense to keep the version history outside of
the actual Node.  (Which is, incidentally what jcr does as well - the
version history is exposed under /jcr:system/jcr:versionStorage)

> a new field (within my own application profile) for 'system id' and
> use DC.identifier for things like ISBNs, since my metadata records can
> sometimes refer to a physical or digital resource that isn't stored
> directly in the system. That's probably not an issue here either.

I think that this suggests that we should keep our own identifier
system and leave dc:identifier for user use.

> I must admit I'm a bit concerned that the 3.0 design might lock out a
> large number of existing backend implementations (including my own) with
> a requirement to use the JSR-170 backend, since currently there is a
> variety of options, and we might assume we don't even know about the
> custom ones that people have created. I'm working on a major project
> that *might* migrate to JackRabbit but might not. If not, I'll still
> need a way to use 3.0 with our systems. The XNodeProvider (Berkeley JE
> based) is currently our way forward and I'm not in a position to scrap
> that. So if there are plans to provide a non-JSR-170 backend -- in other
> words we'll still have a WikiProvider API -- I'd like to know more about
> those plans.

Well, yes, there is, and I've told you about it already ;-)

/Janne

Re: JSPWiki 3 design notes

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
>> The document is assigned a new system identifier when in the trash, with
>> its original stored as metadata (e.g., "DC.source").
>>
>> If the document is reinstated from the trash, its original system
>> identifier is used to relocate it (with of course the possibility that
>> a new page with the same address now exists, and the requisite handling
>> of that situation).
> 
> Ah, that's actually kinda neat.  Yeah, that would work.  Thanks Murray!
> 
> If the Trash is per-wikispace, it should work nicely.

You don't really need that though. The system identifier for the page
should include its full path. In my systems I use a collection/document
metaphor, with the wiki application name being used as the collection
name, the wiki page name as the document identifier. All documents get
their own internal URL that contains the full context

    xnode://db/wiki/Main
    xnode://db/wiki/RecentChanges

[Ignoring the 'xnode:' protocol for the moment] when a document is
deleted, I provide the option to move it to a 'trash' collection,
populating the contents of its DC.source metadata field with its
current system identifier (DC.identifier). If the user restores the
document I use the DC.source to put it back where it used to be. The
only downside of this is that I'm overwriting (losing) any contents
of the DC.source field. But that's a side issue.

(i.e., I long ago solved this issue within my own systems).
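
A rough sketch of that trash mechanism in Java, for illustration only. The
class TrashSketch, the nested Doc type and the "xnode://db/trash/" prefix
with a counter are all hypothetical and are not taken from the real system;
only the use of DC.source to remember the original system identifier follows
the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a trash collection that remembers each document's original location. */
final class TrashSketch {
    /** A document: its current system identifier plus a free-form metadata map. */
    static final class Doc {
        String systemId;
        final Map<String, String> meta = new HashMap<>();
        Doc(String systemId) { this.systemId = systemId; }
    }

    private final Map<String, Doc> docs = new HashMap<>(); // keyed by current system id
    private long trashCounter = 0;

    void add(Doc d) { docs.put(d.systemId, d); }

    Doc get(String systemId) { return docs.get(systemId); }

    /** Moves a document to the trash; DC.source keeps the old system id, overwriting any earlier value. */
    String moveToTrash(String systemId) {
        Doc d = docs.remove(systemId);
        if (d == null) throw new IllegalArgumentException("no such document: " + systemId);
        d.meta.put("DC.source", systemId);
        // Assumed naming scheme: a fresh id per trashed document, so same-named pages cannot clash.
        d.systemId = "xnode://db/trash/" + (++trashCounter);
        docs.put(d.systemId, d);
        return d.systemId;
    }

    /** Restores a document to the location in DC.source; refuses if that address is taken again. */
    void restore(String trashId) {
        Doc d = docs.get(trashId);
        if (d == null) throw new IllegalArgumentException("not in trash: " + trashId);
        String original = d.meta.get("DC.source");
        if (original == null) throw new IllegalStateException("no DC.source recorded for " + trashId);
        if (docs.containsKey(original)) throw new IllegalStateException("address in use: " + original);
        docs.remove(trashId);
        d.meta.remove("DC.source");
        d.systemId = original;
        docs.put(original, d);
    }
}
```

The sketch simply refuses to restore when a new page already occupies the
old address; a real implementation would need a policy for that case.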

> What if just individual versions of a page are removed?  Of course, if 
> the wiki:Trash/<identifier>/DC:source points directly to the 
> corresponding version node, then it probably shouldn't really be a big 
> problem...

In my system the page revisions are stored within the document record.
I think this is going to vary depending on the backend used. I store
the entire contents of the document metadata along with each revision
(just copying the entire metadata set into the <head> of the revision),
so this isn't a problem.

One issue for me is that I have previously been using DC.identifier to
store the system identifier for each page, but I've decided to create
a new field (within my own application profile) for 'system id' and
use DC.identifier for things like ISBNs, since my metadata records can
sometimes refer to a physical or digital resource that isn't stored
directly in the system. That's probably not an issue here either.

I must admit I'm a bit concerned that the 3.0 design might lock out a
large number of existing backend implementations (including my own) with
a requirement to use the JSR-170 backend, since currently there is a
variety of options, and we might assume we don't even know about the
custom ones that people have created. I'm working on a major project
that *might* migrate to JackRabbit but might not. If not, I'll still
need a way to use 3.0 with our systems. The XNodeProvider (Berkeley JE
based) is currently our way forward and I'm not in a position to scrap
that. So if there are plans to provide a non-JSR-170 backend -- in other
words we'll still have a WikiProvider API -- I'd like to know more about
those plans.

Cheers,

Murray

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> The document is assigned a new system identifier when in the trash, with
> its original stored as metadata (e.g., "DC.source").
>
> If the document is reinstated from the trash, its original system
> identifier is used to relocate it (with of course the possibility that
> a new page with the same address now exists, and the requisite handling
> of that situation).

Ah, that's actually kinda neat.  Yeah, that would work.  Thanks Murray!

If the Trash is per-wikispace, it should work nicely.

What if just individual versions of a page are removed?  Of course,  
if the wiki:Trash/<identifier>/DC:source points directly to the  
corresponding version node, then it probably shouldn't really be a  
big problem...

/Janne

Re: JSPWiki 3 design notes

Posted by Murray Altheim <mu...@altheim.com>.

Janne Jalkanen wrote:
[...]
> This is what TWiki does.  It has a couple of problems:
> 
> 1) It needs specific handling with respect to moving and renaming - the 
> system needs to know that Trash is a special WikiSpace.

Yes, certainly. And there would be restrictions on the kinds of access
that both users (admins) and code would have on the trash. A Trash API.

> 2) If two persons from different WikiSpaces move a similarly named page 
> to the Trash, then there's a problem.  Or, the trash is not emptied, and 
> someone recreates the page and moves it again to the trash.  It 
> becomes... problematic.  Though, actually there is also a problem if a 
> page is marked deleted, but it gets recreated - is the old one brought 
> back or what?

The document is assigned a new system identifier when in the trash, with
its original stored as metadata (e.g., "DC.source").

If the document is reinstated from the trash, its original system
identifier is used to relocate it (with of course the possibility that
a new page with the same address now exists, and the requisite handling
of that situation).

Murray

Re: JSPWiki 3 design notes

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

> "Attachment"
>
> Probably you want to differentiate attachments from sub-pages.
> (similar to versions)
>
> /Main/wiki:attachments/<attachment-name>/

Why?

An attachment is essentially non-wikimarkup content.

> Attachments may need some other properties as well:
> * wiki:length : length in bytes of an attachment

Ah.  Yes.  On the other hand, this would not really be a property,  
but a pseudo-property (in the sense that it would not be a part of  
the content). We'll have to think about pseudo-properties (e.g. URI,  
length, title, etc).

> Probably  wiki:version would be a useful property as well.

Very true.  Oops...  Added to the versioning section.

>
> ==> This check should also include "no-older-than-one-hour", "same
> author", "authenticated"

True, fixed.

> "Page deletion"
>
> Why not move the page to a "Trash" WorkSpace, so the user can go there
> and recover a page, or empty its Trash?

This is what TWiki does.  It has a couple of problems:

1) It needs specific handling with respect to moving and renaming -  
the system needs to know that Trash is a special WikiSpace.
2) If two persons from different WikiSpaces move a similarly named  
page to the Trash, then there's a problem.  Or, the trash is not  
emptied, and someone recreates the page and moves it again to the  
trash.  It becomes... problematic.  Though, actually there is also a  
problem if a page is marked deleted, but it gets recreated - is the  
old one brought back or what?

Maybe a WikiSpace-specific wiki:Trash space might be the best  
solution.  And an admin function to empty it.

> Are you considering extending the wiki-link syntax with XPath notation?
> Ref. http://www.jspwiki.org/wiki/IdeaWikiLinksThroughXPATHIncludingSubPagesSupport
> That page also contains some ideas on how to resolve wikilinks.

Yup.  JCR requires XPath support anyway, though we don't necessarily  
need all that stuff - it gets awfully complicated after a while, and  
formatting needs to be thought about.

/Janne

Re: JSPWiki 3 design notes

Posted by Dirk Frederickx <di...@gmail.com>.

Great note. Looks promising !


Some minor notes/ideas :


"Attachment"

Probably you want to differentiate attachments from sub-pages.
(similar to versions)

/Main/wiki:attachments/<attachment-name>/

Attachments may need some other properties as well:
* wiki:length : length in bytes of an attachment



"Versioning"

Probably  wiki:version would be a useful property as well.


"Change tracking"
...
"It might also be useful to adopt the TWiki way of replacing the last
version on save, if the modification is no older than one hour. This
prevents multiple changes from accumulating in the workspace when the
user keeps pressing save."
==> This check should also include "no-older-than-one-hour", "same
author", "authenticated"
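
To make the combined check concrete, here is a small Java sketch. The class
SaveMergePolicy and the method mayReplaceLast are hypothetical names, not
part of JSPWiki; the one-hour window is taken from the design note quoted
above.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the "replace the last version instead of adding a new one" check. */
final class SaveMergePolicy {
    private static final Duration WINDOW = Duration.ofHours(1);

    /**
     * Returns true when a save may overwrite the latest version: the editor is
     * authenticated, is the author of that version, and it is at most one hour old.
     */
    static boolean mayReplaceLast(String lastAuthor, Instant lastSaved,
                                  String currentUser, boolean authenticated, Instant now) {
        if (!authenticated || lastAuthor == null || !lastAuthor.equals(currentUser)) {
            return false;
        }
        Duration age = Duration.between(lastSaved, now);
        return !age.isNegative() && age.compareTo(WINDOW) <= 0;
    }
}
```

Requiring authentication keeps two anonymous users who happen to share a
name or IP from overwriting each other's versions.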


"Page deletion"

Why not move the page to a "Trash" WorkSpace, so the user can go there
and recover a page, or empty its Trash?



General:
Are you considering extending the wiki-link syntax with XPath notation?
Ref. http://www.jspwiki.org/wiki/IdeaWikiLinksThroughXPATHIncludingSubPagesSupport
That page also contains some ideas on how to resolve wikilinks.



dirk