Posted to dev@lenya.apache.org by "J. Wolfgang Kaltz" <ka...@interactivesystems.info> on 2005/03/20 17:59:58 UTC

[1.4] document model (Was: Re: [1.4] component management)

Andreas Hartmann wrote:
> J. Wolfgang Kaltz wrote:
> 
> [...]
> 
>> That is true. But actually I'm wondering about something else in this 
>> class now: why is the document type for a document resolved upon every 
>> request? The knowledge "document A is of type B" is something quite 
>> static, so shouldn't the model reflect this somehow?
> 
> 
> The problem here lies in the current concepts of meta data and document type
> assignment in Lenya.
> 
> Meta data are stored in the document itself. But we want to allow arbitrary
> document structures, which means we can't require meta data to be stored in
> the document. That implies that all meta data are optional and therefore the
> document type can't be stored in the meta data.
> IMO this has to be changed (mandatory meta data). Praise the repository!
> 
> The sitetree can't be used, because it is optional and the information
> doesn't belong to the site structure.
> 
> That's why we decided to use the SourceTypeAction for document type
> resolving. Which means we need a cocoon:// request every time we want
> to obtain the document type.
> 
> Up to now, it was not possible to implement Document.getDocumentType()
> because the document couldn't be provided with the ServiceManager and
> therefore couldn't access the DocumentTypeResolver/URIParameterizer. That's
> why the document type is not even stored in the DocumentIdentityMap.

Thx for the explanation.
The way I see it, the Lenya core should definitely distinguish between 
the content of a document and its meta-data. IMO meta-data must be 
mandatory for a document, and the core must provide separate access to a 
document's content vs. a document's meta-data.
This is in fact already what happens for assets, because meta-data 
cannot be written into an asset, as opposed to an XML-based document. 
Ironically, in this case it is unfortunate that you can store arbitrary 
information in XML, since this misled us into storing Lenya meta-information 
within XHTML content.

So here is my +1 to change the handling of document meta-data to make it 
identical to asset meta-data, that is
- mandatory
- separately from content

> 
> 
>> Please forgive my naivete if this is too simplistic :) But it seems to 
>> me the model should know "What document type does a document have?" 
>> without needing an HTTP request?
> 
> 
> Yes, definitely. The question is if we should implement a temporary
> solution until we have the repository. IMO this is not necessary since
> it is "just" a performance issue.

I think it is more than that, also a maintenance / extensibility issue. 
Generally speaking, we should have a document model which accomplishes 
what we want (obviously ;) ), notwithstanding the storing of the data 
within a repository. And I think it should be a goal of 1.4. Regarding 
migration, the migration scripts can take the meta-data from the xhtml 
header and generate a separate file, so that will be painless. The 
documents of a custom doctype would perhaps be more challenging to migrate.

> 
> The other option is to require meta data in a document, which reduces
> the flexibility (but IMO to a still reasonable amount).

+1, if we mean: a document must have meta data. But not "in" it; rather, 
along with it.

WDYT ?

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.

Gregor J. Rothfuss wrote:
> J. Wolfgang Kaltz wrote:
> 
>> The way I see it, the Lenya core should definitely distinguish between 
>> the content of a document and its meta-data. IMO meta-data must be 
>> mandatory for a document, and the core must provide separate access to 
>> a document's content vs. a document's meta-data.
> 
> 
> +1 as long as some of that metadata is maintained by the system. don't 
> want to force the user to enter metadata if they don't want to. but 
> exposing key metadata as searchable jcr properties makes sense.

Agreed,
actually I don't think the changes discussed would directly affect the 
CMS user at all. "only" the internals would be affected (though it would 
be an important change)

When a document is created, the system knows a lot of meta data: e.g. 
the creator, the time, and the type of the document. So Lenya can always 
create the mandatory meta data, the way I see it.
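
As a minimal sketch of that idea (the helper and the key names below are purely illustrative, not Lenya's actual meta-data vocabulary), the mandatory entries could be filled in from what the system already knows at creation time:

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the keys "creator", "created" and "documentType" are assumptions.
final class SystemMetaData {
    private SystemMetaData() {}

    /** Builds the mandatory meta-data without asking the CMS user for anything. */
    static Map<String, String> forNewDocument(String creator, String documentType, Instant createdAt) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("creator", creator);
        meta.put("created", createdAt.toString()); // ISO-8601 timestamp
        meta.put("documentType", documentType);
        return meta;
    }
}
```

Anything beyond these system-maintained entries would stay optional for the user.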

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by Torsten Schlabach <ts...@apache.org>.

+1 for separating metadata from content!
I had been thinking about this as well.
Torsten

Gregor J. Rothfuss wrote:
> J. Wolfgang Kaltz wrote:
> 
>> The way I see it, the Lenya core should definitely distinguish between 
>> the content of a document and its meta-data. IMO meta-data must be 
>> mandatory for a document, and the core must provide separate access to 
>> a document's content vs. a document's meta-data.
> 
> 
> +1 as long as some of that metadata is maintained by the system. don't 
> want to force the user to enter metadata if they don't want to. but 
> exposing key metadata as searchable jcr properties makes sense.
> 
>> This is in fact already what happens for assets, because meta-data 
>> cannot be written into an asset, as opposed to an XML-based document. 
>> Ironically, in this case it is unfortunate that you can store 
>> arbitrary information in XML, since this misled us into storing Lenya 
>> meta-information within XHTML content.
>>
>> So here is my +1 to change the handling of document meta-data to make 
>> it identical to asset meta-data, that is
>> - mandatory
>> - separately from content
> 
> 
> +1
> 
>> I think it is more than that, also a maintenance / extensibility 
>> issue. Generally speaking, we should have a document model which 
>> accomplishes what we want (obviously ;) ), notwithstanding the storing 
>> of the data within a repository. And I think it should be a goal of 
>> 1.4. Regarding migration, the migration scripts can take the meta-data 
>> from the xhtml header and generate a separate file, so that will be 
>> painless. The documents of a custom doctype would perhaps be more 
>> challenging to migrate.
> 
> 
> if they use the lenya:meta wrapper, it's pretty easy.
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.

Michael Wechner wrote:
> jplejacq wrote:
> 
>> Gregor J. Rothfuss wrote:
>>
>>> J. Wolfgang Kaltz wrote:
>>>
>>>> So here is my +1 to change the handling of document meta-data to 
>>>> make it identical to asset meta-data, that is
>>>> - mandatory
>>>> - separately from content
>>>
>>>
>>>
>>> +1
>>
>>
>>
>> I'm not sure that the meta-data should be separated from the content. 
>> Many (if not most) asset file types already support meta-data directly 
>> (for example, pdf, png, jpg). It seems to me that the meta-data and 
>> content are intrinsically related. 

This is an interesting point, but it is actually another type of 
meta-data. The Lenya meta-data contains the meta-data relevant to Lenya, 
such as the Lenya author who put this content in the system, the date 
this was done, etc. The asset's internal meta-data, if present, is 
unrelated to this. It might also contain an author field, but this has 
no relation to the author in the Lenya sense. So, using the asset's 
potential meta-data in some manner is an interesting idea, but we need 
to define and handle the Lenya meta-data nonetheless.

> 
> 
> 
> agreed, but I guess Wolfgang wants to make it more atomic (at least 
> that's how I understand it), such that meta and content for instance can 
> be edited by different "users" at the same time and the interfaces 
> should also become easier because of the atomization. Or do I 
> misunderstand this completely?

Yes, I basically meant atomization, i.e. splitting up the meta and 
content, though for more far-reaching purposes than being able to edit 
them separately at the same time. IMO making meta-data for documents 
mandatory, and handling it separately from the content, is an important 
architectural basis for things such as plugins. It also makes the code 
and interfaces simpler, which makes new development easier. Of course 
it will also improve performance (see older mails).

Anyway it looks like we have a consensus on this - I should have time to 
try to implement this, but this smells like the type of change that can 
easily break things in trunk (unexpected side-effects and the like; I'm 
also concerned about the editor integrations), so we should be aware of 
that. If there are no objections, I will take a shot at it.

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.

jplejacq wrote:

> I prefer keeping the two together and would like us to move to 
> supporting the native meta capabilities of assets.

can you define 'together' a bit more? do you care how it is implemented 
in the backend, whether as separate files, JCR properties, EXIF data or 
embedded into the XML? or do you mean 'together' in a sense that you can 
edit metadata and content at the same time?

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by jplejacq <jp...@quoininc.com>.

Michael Wechner wrote:
> jplejacq wrote:
> 
>>> J. Wolfgang Kaltz wrote:
>>>
>>>> So here is my +1 to change the handling of document meta-data to 
>>>> make it identical to asset meta-data, that is
>>>> - mandatory
>>>> - separately from content
>>
>> I'm not sure that the meta-data should be separated from the content. 
>> Many (if not most) asset file types already support meta-data directly 
>> (for example, pdf, png, jpg). It seems to me that the meta-data and 
>> content are intrinsically related. 
> 
> 
> agreed, but I guess Wolfgang wants to make it more atomic (at least 
> that's how I understand it), such that meta and content for instance can 
> be edited by different "users" at the same time and the interfaces 
> should also become easier because of the atomization. Or do I 
> misunderstand this completely?

Oh, I see. Makes sense. +1

-- 
JP

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by Michael Wechner <mi...@wyona.com>.

jplejacq wrote:

> Gregor J. Rothfuss wrote:
>
>> J. Wolfgang Kaltz wrote:
>>
>>> So here is my +1 to change the handling of document meta-data to 
>>> make it identical to asset meta-data, that is
>>> - mandatory
>>> - separately from content
>>
>>
>> +1
>
>
> I'm not sure that the meta-data should be separated from the content. 
> Many (if not most) asset file types already support meta-data directly 
> (for example, pdf, png, jpg). It seems to me that the meta-data and 
> content are intrinsically related. 

 
agreed, but I guess Wolfgang wants to make it more atomic (at least 
that's how I understand it), such that meta and content for instance can 
be edited by different "users" at the same time and the interfaces 
should also become easier because of the atomization. Or do I 
misunderstand this completely?

> I prefer keeping the two together and would like us to move to 
> supporting the native meta capabilities of assets.


I guess that's worth another thread ;-)

Michi


-- 
Michael Wechner
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org


Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by jplejacq <jp...@quoininc.com>.

Gregor J. Rothfuss wrote:
> J. Wolfgang Kaltz wrote:
> 
>> So here is my +1 to change the handling of document meta-data to make 
>> it identical to asset meta-data, that is
>> - mandatory
>> - separately from content
> 
> +1

I'm not sure that the meta-data should be separated from the content. 
Many (if not most) asset file types already support meta-data directly 
(for example, pdf, png, jpg). It seems to me that the meta-data and 
content are intrinsically related. I prefer keeping the two together and 
would like us to move to supporting the native meta capabilities of assets.

-- 
JP

Re: [1.4] document model (Was: Re: [1.4] component management)

Posted by "Gregor J. Rothfuss" <gr...@apache.org>.

J. Wolfgang Kaltz wrote:

> The way I see it, the Lenya core should definitely distinguish between 
> the content of a document and its meta-data. IMO meta-data must be 
> mandatory for a document, and the core must provide separate access to a 
> document's content vs. a document's meta-data.

+1 as long as some of that metadata is maintained by the system. don't 
want to force the user to enter metadata if they don't want to. but 
exposing key metadata as searchable jcr properties makes sense.

> This is in fact already what happens for assets, because meta-data 
> cannot be written into an asset, as opposed to an XML-based document. 
> Ironically, in this case it is unfortunate that you can store arbitrary 
> information in XML, since this misled us into storing Lenya meta-information 
> within XHTML content.
> 
> So here is my +1 to change the handling of document meta-data to make it 
> identical to asset meta-data, that is
> - mandatory
> - separately from content

+1

> I think it is more than that, also a maintenance / extensibility issue. 
> Generally speaking, we should have a document model which accomplishes 
> what we want (obviously ;) ), notwithstanding the storing of the data 
> within a repository. And I think it should be a goal of 1.4. Regarding 
> migration, the migration scripts can take the meta-data from the xhtml 
> header and generate a separate file, so that will be painless. The 
> documents of a custom doctype would perhaps be more challenging to migrate.

if they use the lenya:meta wrapper, it's pretty easy.
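
A rough sketch of such a migration step, using only the standard JDK DOM and transformer APIs: it copies the first lenya:meta element into its own document and detaches it from the content. The class name is hypothetical, and the namespace URI bound to the lenya: prefix is an assumption that should be checked against real documents before relying on it.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical migration helper, not part of Lenya.
final class MetaDataMigration {
    // Assumed namespace URI for the lenya: prefix; verify against the documents to migrate.
    static final String LENYA_NS = "http://apache.org/cocoon/lenya/page-envelope/1.0";

    private MetaDataMigration() {}

    /** Returns { content, metaData }; metaData is null if no lenya:meta wrapper is found. */
    static String[] split(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document source = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            NodeList wrappers = source.getElementsByTagNameNS(LENYA_NS, "meta");
            if (wrappers.getLength() == 0) {
                return new String[] { serialize(source), null };
            }
            Node wrapper = wrappers.item(0);

            // Copy the wrapper into its own document, then detach it from the content.
            Document metaDoc = builder.newDocument();
            metaDoc.appendChild(metaDoc.importNode(wrapper, true));
            wrapper.getParentNode().removeChild(wrapper);

            return new String[] { serialize(source), serialize(metaDoc) };
        } catch (Exception e) {
            throw new IllegalStateException("Could not split document", e);
        }
    }

    private static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Documents of a custom doctype that keep their meta-data somewhere other than a lenya:meta wrapper would need their own extraction rule in place of the getElementsByTagNameNS lookup.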
