Posted to dev@lenya.apache.org by Andreas Hartmann <an...@apache.org> on 2006/02/03 09:50:47 UTC

[RFC] Terminology

Hi Lenya devs,

a while ago, we started to discuss the repository API terminology.
I think it's time that we agree on something, because that will
make further discussions easier.

That's not a vote, but it would be appreciated if you could state
your opinion. I'd like to emphasize that this discussion is not
about the API itself, but only about the terms we use. So please
don't go and complain about the concept of areas or language versions.
These issues can be discussed later on.


----

A website in Lenya is called a

- publication [+1]
- site
- ...

----

The resources of a website can exist at the same time in several

- areas [+1]
- ...

----

The entirety of plain information (without structuring) is called

- content [+1]
- resources
- ...

----

The set of language versions of a piece of information is called

- content node
- content item [+1]
- document
- resource
- ...

----

A specific language version is called

- content item
- language version [+1]
- document
- ...

----

A version in the history of a language version is called

- (history) version [+1]
- ...

----

The pieces of structuring information (there may be several) are called

- sites
- structures [+1]
- navigations

(maybe we have to make a distinction between "structure" and "navigation")

----

A node in the structure is called

- (structure) node [+1]
- site node
- navigation item
- ...


For me, the following text sounds quite good:

In Lenya, a website, or an independent part of a large website, or a collection
of documents to manage, is called a *publication*. The content of a
publication is stored in *areas* to allow different versions to exist
at the same time (e.g., authoring, staging, and live).

Lenya stores the actual *content* in a non-hierarchical way.
Each *content item* consists of a set of *language versions*.
Each *version* of the history of a language version can be viewed and
rolled back to.

To create navigation widgets, sitemaps etc., *structures* can be created.
The *structure nodes* reference the content items or specific language versions.



WDYT?

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/4/06, Bob Harner <bo...@gmail.com> wrote:
> BTW, I think I was unclear earlier in what I meant by "Content Item" and
> its relationship to Document.  I meant that both a Document and an
> Asset are each Content Items.  I didn't mean for "Content Item" to
> mean the aggregate of a Document and its Assets.
>
> One thing, though:  It isn't clear to me whether Jörn and Solprovider
> really agree on what a Document is.  Jörn seems to be using Document
> as the aggregate/container term for both the editable text and the
> assets of a page.  But Solprovider seems to be suggesting the words
> Resource or Content for that purpose.
>
> I wonder if we shouldn't promote "Asset" a little like this: A
> Document consists of Assets.  An Asset can be any of the following:  a
> German translation of the document, a French translation, a JPEG
> image, a PDF file, etc. All are assets.  Collectively, they are a
> Document.  This approach does away with the need for "Content Item"
> and helps us in treating assets just like other managed content, which
> several people have stated as a goal.

Sorry.  I was using Asset in the Lenya 1.2 GUI meaning of anything
that is not a Document.  I also assumed we were moving Assets from the
Document to the Publication.

Old:
Content/Document/Assets

New:
Content/Document
Content/Asset

Each ["Resource" | "Content Node" | "Content Item"] is either a
Document or Asset.
Each Document has Translations which have Revisions.
Each Asset has Revisions.
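
As a rough illustration of the model above (a sketch only: the class names come from this thread, but every field and method name is an assumption, not Lenya's actual API):

```python
from dataclasses import dataclass, field


@dataclass
class Revision:
    number: int   # position in the history, starting at 1 (assumed)
    data: bytes


@dataclass
class Translation:
    language: str
    revisions: list[Revision] = field(default_factory=list)


@dataclass
class Resource:
    """Generic parent of Document and Asset (the term is still under discussion)."""
    name: str


@dataclass
class Document(Resource):
    # Each Document has Translations, which have Revisions.
    translations: dict[str, Translation] = field(default_factory=dict)

    def add_revision(self, language: str, data: bytes) -> Revision:
        translation = self.translations.setdefault(language, Translation(language))
        revision = Revision(len(translation.revisions) + 1, data)
        translation.revisions.append(revision)
        return revision


@dataclass
class Asset(Resource):
    # Each Asset has Revisions directly, with no Translation layer.
    revisions: list[Revision] = field(default_factory=list)

    def add_revision(self, data: bytes) -> Revision:
        revision = Revision(len(self.revisions) + 1, data)
        self.revisions.append(revision)
        return revision
```

With this shape, a Document and an Asset are siblings under Content, and both carry their own history.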

Advantages:
- Assets have Revisions and Workflow like Documents.
- The URL for Assets is based on the Publication.  Usage could be as
simple as "/mypicture.gif" for the server's default pub, or
"/mypub/mypicture.gif" for other pubs.  That would require the default
"live" Module to check extensions for handling.  There are many possible
algorithms:
- - "live" retrieves the data;
- - "live" passes to "assets" which retrieves the data;
- - "live" passes to "assets" passes to "images" which retrieves the data.
Any of these options is easier than 1.2's Make Link "docid/assetname.ext".
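
One way the publication-based URL lookup could work, as a minimal sketch. The function name, the extension table, and the module names other than "live", "assets", and "images" are assumptions made for illustration:

```python
import posixpath

# Hypothetical mapping from file extension to the Module that handles it.
EXTENSION_MODULES = {
    ".gif": "images",
    ".jpg": "images",
    ".png": "images",
    ".pdf": "assets",
}


def resolve(url, publications, default_pub):
    """Return (publication, resource_path, module) for a request URL."""
    parts = [p for p in url.split("/") if p]
    if parts and parts[0] in publications:
        # "/mypub/mypicture.gif": the first segment names the publication.
        pub, rest = parts[0], parts[1:]
    else:
        # "/mypicture.gif": fall back to the server's default publication.
        pub, rest = default_pub, parts
    resource_path = "/".join(rest)
    extension = posixpath.splitext(resource_path)[1].lower()
    # Anything without a known extension stays with the "live" Module.
    module = EXTENSION_MODULES.get(extension, "live")
    return pub, resource_path, module
```

For example, resolve("/mypub/mypicture.gif", {"default", "mypub"}, "default") selects the "mypub" publication and hands the request to the (assumed) "images" Module.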

solprovider


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Thorsten Scherler <th...@wyona.com> wrote:
> On Wed, 08-02-2006 at 03:06 -0500, solprovider@apache.org wrote:
> > 2. General Usage defines "Document" as "a single unit of textual
> > information".
> ...and that is exactly my problem with this word in Lenya. It is not
> (or rather, does not have to be) a *single* unit of textual information!
> Further, it is *not* limited to *textual information*.

Please give examples.  Or maybe it is because (in my mind) a
"Document" is a single piece of textual information, and a "Page" is
the single unit displayed to a visitor.  If you use "Document" for
both definitions, things get very confused.  Lenya adds navigation and
presentation to Documents to create a Page.

> > A Related Technology (XML) defines "Document" as "a
> > single unit of textual information conforming to the XML
> > specification".  A Previous Version (Lenya 1.2) defines "Document" as
> > "a single unit of textual information conforming to the XML
> > specification with specified security and workflow".  No other
> > definition is allowed.
>
> This implies that a document *cannot* contain anything other than "textual
> information", which does not reflect the reality of the usage in Lenya
> or on this list!
See above.

> > 6. No software uses the term "Nugget".
> That is why it is perfect. ;-)
> > The phrase "I created a
> > Nugget" currently has no meaning.
> Well put it like "I created a(n) (information/content) nugget". Then it
> has perfect meaning.

Now you are using two words where previously one was sufficient.

> > "I created a Document" can be
> > understood by people who are barely computer literate.
> For "people who are barely computer literate" that may mean something
> else than what we mean here.
I was demonstrating the ubiquitousness of the term and its definition.

> Moreover, a document is, as you pointed out above, a unit of textual
> information. Coming back to an SVG drawing: it would not fit this
> "document" definition, but I still consider it an information
> unit/nugget.

SVG is XML.  A unit of SVG data is a Document.  The resulting image
could be saved as an Asset (losing its dynamic abilities) or saved in
a cache by the "svg2image" Module.

> > While none of the old definitions of "Resource" are
> > needed, the word is already used by Lenya, and should be used before
> > adding a new word.
> Well http://en.wikipedia.org/wiki/Resource
> "Resource (economics), commodities and human resources used in the
> production of goods and services.
> Resource (computer science), valuable information"
>
> For me, "resource" has the same sweeping meaning as the word "thing". It
> is everything and nothing.  What is the difference for you between
> document and resource? Why should we keep on using a definition that
> only confuses?

According to your source, "Resource" is "valuable information".  All
the units under Content are "valuable information".  Yes, it is more
generic than "Document" and "Asset", which is why "Resource" is a good
term as the generic for both.

> I do not follow or see the logic of the statement "the word is already
> used by Lenya, and should be used before adding a new word".

For Lenya people, there is already an entry in the brain for
"Resource".  The definition is rather blurry, but the entry exists. 
Reusing that entry requires changing the definition.  Adding a new
term requires creating a new entry, which increases the memory
requirements.

> > General Usage defines "Publication" as "A
> > specific issue of a public work including textual information".  The
> > output of a Lenya 1.2 Publication is close enough to the General Usage
> > definition that very few people have been confused by the term.  If it
> > works, do not break it.
> Well, it depends where you turn to look up "general usage". ;-)
> http://en.wikipedia.org/wiki/Publication
> "The word publication means the act of publishing, and it also means any
> writing of which copies are published, and any website. Among
> publications are books, and periodicals, the latter including magazines,
> scholarly journals, and newspapers."

Yes, I skipped the alternate definitions of "something that regularly
publishes {definition #1}", and "the act of publishing {definition
#1}".  A Lenya Publication fits those definitions too, which is why
"Publication" is so easily understood.

> > 8. "Resource" is better than "ContentItem" because it contains fewer words.
> ...but does not say anything about the function. IMO it should be
> (content)nugget.
See every post I have made about that subject.  Both terms meet the
functional requirements.  "Resource" is shorter, and was previously
used in Lenya.  See above.

> > ===
> > 1. "TableOfContent" is three words where we have been using one
> > ("Sitetree").  It also implies a single structure.
> No, that depends on the abstraction level. You are correct that a text
> book has *one* TOC, but some also have a glossary, ... Those are other
> ways to represent the structure of the book.
There is also the "Index" and the "Index of Illustrations".  :)

> > 2. "TOC" is not immediately recognizable by people outside the printed
> > paper industry.
> Well, it is a common acronym
> http://www.acronymfinder.com/af-query.asp?String=exact&Acronym=TOC&Find=Find
I did not write "TOC" is not recognizable, just that it required
thought for most people.  Requiring people to think is a lost cause. 
If we used "TOC" as a term, how often would we see user-ML posts
asking, "What is a TOC?"

solprovider


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Josias Thoeny <jo...@wyona.com> wrote:
> Basically we're trying to unify documents and assets (from Lenya 1.2),
> right? I could also imagine calling everything "document" instead of
> "asset".
> For me it's more natural to call a JPEG a "document", than to call an
> XHTML document an "asset". But it's a matter of personal preference...

"Document" is already well-defined.
"Asset" was defined in previous versions of Lenya.

Using either term for the generic will confuse people.  I think having
terms that imply the XML vs. Non-XML distinction is important because
Lenya is XML-based and must treat XML differently than non-XML.  The
following snippet does not make sense to me, and should break Cocoon:
   <map:generate src="image.gif"/>
   <map:transform src="mytransform.xsl"/>
   <map:serialize type="binary"/>

We are moving Assets to the top-level of Content as siblings to
Documents.  (If we are not, this discussion is almost pointless.)  A
generic term for all children of Content, both Documents and Assets,
would be very useful.  "Resource" and "ContentItem" are the popular
candidates.

solprovider


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, J. Wolfgang Kaltz <ka...@interactivesystems.info> wrote:
> Andreas Hartmann wrote:
> > solprovider@apache.org wrote:
> >> Presentation configuration will be moved
> >> inside Modules.  While none of the old definitions of "Resource" are
> >> needed, the word is already used by Lenya, and should be used before
> >> adding a new word.
> > As I understood it, the term "resource" is rejected by most developers,
> > because the meaning is too general. But I guess "resource" vs.
> > "content item" will be a never-ending story unless we do a vote and
> > all developers commit to the decision.
>
> IMO we need to distinguish between pieces of content to be managed by
> the CMS (Lenya) and other "stuff", for which Lenya may also potentially
> provide interfaces: what about the XSL stylesheets for example. They are
> not content and should not be seen by content maintainers. They are
> however resources of the CMS and, maybe, one day in the future, they
> will be editable within the CMS (administration area? a new "design
> area"? but that day, they still won't be content)

You cheated.  I am introducing the concepts slowly, and you jumped to
the back of the book.

Yes, if we define "Resource" as the parent object that maintains the
Security, Translations (languages), and Revisions of all objects under
Content, then the same object will be used to maintain the Security,
Translations, and Revisions of functional resources.  You even called
them "RESOURCES of the CMS".

"Areas" are being replaced by "Modules" (at least in my mind.)  It
will be easy to add a Module to edit anything in the Lenya
fileSystem/repository.  Security will be very important, and the
Resource class will already have proven code from handling Content.

> That's why I favor explicitly having the word "content" in whatever
> terminology we use to describe pieces of content. Thus my proposal a
> while back (http://wiki.apache.org/lenya/ProposalContentModel), where
> "ContentItem" is a piece of content (or ContentNugget, or whatever), and
> a Document is a collection of such pieces.

I dislike multiple-word terms when single-word terms suffice.  "Item"
would become overloaded.  The currently common use of "Item" in Lenya
is "an Element of a Document", much lower on the tree than Content.

> I am not trying to say "I am right, why doesn't everybody agree" ;)
> But to be honest, I still haven't understood what is wrong with the
> proposal above.

I am not fighting for the word "Resource".  I want a consistent
terminology.  If anybody has a term that is more readily understood as
the superset of Documents and Assets, I will immediately support it. 
I spent time with a Thesaurus because there is much resistance to
"Resource", but did not find a better term.

Is "Resource" an insult in some language?  It has no negative
connotations in American English.

solprovider


Re: [RFC] Terminology

Posted by Thorsten Scherler <th...@wyona.com>.

On Wed, 08-02-2006 at 03:06 -0500, solprovider@apache.org wrote:
> Rules for Determining Terminology
> 

...
> 2. General Usage defines "Document" as "a single unit of textual
> information".  

...and that is exactly my problem with this word in Lenya. It is not
(or rather, does not have to be) a *single* unit of textual information!
Further, it is *not* limited to *textual information*.

> A Related Technology (XML) defines "Document" as "a
> single unit of textual information conforming to the XML
> specification".  A Previous Version (Lenya 1.2) defines "Document" as
> "a single unit of textual information conforming to the XML
> specification with specified security and workflow".  No other
> definition is allowed.

This implies that a document *cannot* contain anything other than "textual
information", which does not reflect the reality of the usage in Lenya
or on this list!

...

> 
> 6. No software uses the term "Nugget".  

That is why it is perfect. ;-)

> The phrase "I created a
> Nugget" currently has no meaning.  

Well put it like "I created a(n) (information/content) nugget". Then it
has perfect meaning.

> "I created a Document" can be
> understood by people who are barely computer literate.

For "people who are barely computer literate" that may mean something
else than what we mean here. Moreover, a document is, as you pointed out
above, a unit of textual information. Coming back to an SVG drawing: it
would not fit this "document" definition, but I still consider it an
information unit/nugget.

>   "I created an
> Asset" (meaning uploaded a file) is understood by users of Lenya 1.2. 
> "I created a Resource" would be understood by most people.  Lenya 1.2
> used "Resources" for a variety of additions: static graphics, storage
> of "Assets", presentation configuration (CSS and i18n), and software
> plug-ins (such as editors).  That was confusing to developers, and
> created a messy datastore. Lenya 1.4 has not defined "Resource" yet,
> but should relate to one or more of Lenya 1.2's definitions.  Software
> plug-ins are being defined as "Modules".  All graphics should be
> "Assets" (Why allow static graphics?  Let everything be editable, and
> improve security so the security granted by requiring file system
> access is unnecessary.)  Presentation configuration will be moved
> inside Modules.  While none of the old definitions of "Resource" are
> needed, the word is already used by Lenya, and should be used before
> adding a new word.
> 

Well http://en.wikipedia.org/wiki/Resource
"Resource (economics), commodities and human resources used in the
production of goods and services.
...
Resource (computer science), valuable information"

For me, "resource" has the same sweeping meaning as the word "thing". It
is everything and nothing.  What is the difference for you between
document and resource? Why should we keep on using a definition that
only confuses?

I do not follow or see the logic of the statement "the word is already
used by Lenya, and should be used before adding a new word".

> 7. Lenya 1.2 defines "Publication" as "Content and related processing
> instructions" where "Content" is defined as "(XML) Documents and
> related Assets" where "Assets" is defined as "uploaded files".  
> People keep suggesting using "Site", "Website", "Book", and other
> terms for this concept.  General Usage defines "Publication" as "A
> specific issue of a public work including textual information".  The
> output of a Lenya 1.2 Publication is close enough to the General Usage
> definition that very few people have been confused by the term.  If it
> works, do not break it.

Well, it depends where you turn to look up "general usage". ;-)
http://en.wikipedia.org/wiki/Publication
"The word publication means the act of publishing, and it also means any
writing of which copies are published, and any website. Among
publications are books, and periodicals, the latter including magazines,
scholarly journals, and newspapers."

> 8. "Resource" is better than "ContentItem" because it contains fewer words.

...but does not say anything about the function. IMO it should be
(content)nugget.

> ===
> On 2/7/06, Thorsten Scherler <th...@apache.org> wrote:
> > What both models of http://wiki.apache.org/lenya/GlossaryStructure have
> > in common is that "content" is stored in a Publication. Now Andreas
> > frequently uses the word "structure" to refer to the sitetree or, as
> > solprovider calls it, the index of a publication. Seen from a
> > different angle, a sitetree or index or structure is nothing other than
> > the typical table of contents (TOC) of a (text) book.
> 
> 1. "TableOfContent" is three words where we have been using one
> ("Sitetree").  It also implies a single structure.

No, that depends on the abstraction level. You are right that a text
book has *one* TOC, but some also have a glossary, ... Those are other
ways to represent the structure of the book.

> 2. "TOC" is not immediately recognizable by people outside the printed
> paper industry.

Well, it is a common acronym
http://www.acronymfinder.com/af-query.asp?String=exact&Acronym=TOC&Find=Find

Regards
-- 
Thorsten Scherler
COO Spain
Wyona Inc.  -  Open Source Content Management  -  Apache Lenya
http://www.wyona.com                   http://lenya.apache.org
thorsten.scherler@wyona.com                thorsten@apache.org


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Thorsten Scherler <th...@apache.org> wrote:
> On Wed, 08-02-2006 at 13:12 -0500, solprovider@apache.org wrote:
> > On 2/8/06, Thorsten Scherler <th...@wyona.com> wrote:
> > > page -> why that is only html?
> > Websites use HTML.  It is the primary format for web browsing.  Most
> > pipelines finish with <map:serialize type="html">.  It is so common
> > the type="html" is not necessary because it is the default.  I
> > sometimes serialize as XML for testing, but I do not want visitors to
> > see it.
> Yeah, for Lenya ATM you may be right, but see e.g. something like
> Forrest, where this is not true. I will very soon start a
> Forrest-compatible Lenya pub where I will drop this focus.

It sounds like I should check out Forrest.  From reading the Lenya
MLs, it sounded like most of the Forrest project could become Lenya
Modules.  Is that correct?

> Lenya never claimed to be *just* a web CMS focusing on editing web pages
> (HTML). That may be your assumption from knowing 1.2, but making Lenya
> Forrest-compatible will give you a collection of new outputs out of the
> box. Some are already available in Lenya, like OpenOffice and basic
> OpenFormat support. The idea is to use Lenya to manage any kind of
> content, not only web-based content.

Good idea.  It does not change the definitions of Page and Document.  See below.

> > The "rss" Module would serialize as XML, but RSS uses "Feeds"
> > containing "Channels" containing "Items", as well as referring to the
> > response as a "Document" using the pure XML definition.
>
> Here you see that some formats refer to the response to a
> request as a document (be it the "pure XML definition" or not). Not only
> RSS but also e.g. OpenOffice, POD, and PDF refer to the
> response to a request as a "Document". You can even say "the HTML
> document".

I have never referred to an "HTML Document".  HTML has Pages.  XML has
Documents.  RSS and OOo are XML-based software, so they use
"Documents".  PDF and MSWord refer to "Word Processing Document",
which is not necessarily XML, although both formats can be converted
to XML.  What is "POD"?

> > > What is the relationship between document and pages?
> > Documents are internal units of XML data.
> see above, why limiting documents to xml?

Because XML has "XML Documents", Lenya and Cocoon are XML-based, and
that is how "Document" was defined for Lenya 1.2.

How could Lenya change to work with non-XML data?  I am not certain it
is possible.  Much of Lenya's functionality is the merging and
filtering of data.  There must be a common format for the data.  XML
is the current (and probably best) format.  Let us pretend there is a
Module that converts an Image to XML.  What could we do with it?
Converting a PDF or MSWord DOC to XML produces something usable, but I
doubt Lenya should do anything before they are converted to XML. 
Everything must be converted to XML, or treated as an
uploadable/downloadable/unchangeable-within-Lenya Asset.

> > They are separated by
> > purpose, type, and security.
> >
> > Purpose: one Document contains the list of Contributors, another how
> > to install Lenya.  Different purpose, different document.
> Hmm, separation by purpose normally brings a different "type". Like in
> forrest document-v20.dtd is for "normal" docs and faq-v12.dtd for faq
> documents. I think the type determines the purpose.
This was giving reasons to have more than one Document.  You can have
a FAQ about Lenya, and a FAQ about Forrest.  I probably should have
used "Subject" instead of "Purpose".

> > Type: Most documents are XHTML, basic word processed content.  A
> > "product" document contains different (and more rigid) fields.
> > Different DTD, different document.
> >
> > Security: One document can be edited by any editor.  "Product"
> > Documents can only be edited by the inventory maintainer.  Different
> > security requirements, different document.
> I agree on type, but security is provided by the overall application
> controller (lenya). The documents should be only linked in the security
> component and not know about security itself.

A Resource (Document, Asset, and even Code) should be able to
configure its own security.  Yes, security must be handled by the
platform, but each Resource should be able to configure who can read
it, who can edit it, and sometimes who can or is required to approve
or publish it.  Most of the time that list is inherited from the
Parent or a default, but each Resource should be able to change it.

An Image, such as the company logo, can be read by everybody, edited
by the graphics designer, and approved/published by senior management.
A Document, such as the draft of a financial report, may be seen by
accounting and senior management, edited by accounting, must be
approved by 2 senior managers, and published by the CFO.  Workflow
handles most of it, but Security verifies what is allowed.  The
janitor cannot read the financial report until publishing (Workflow)
changes the security of the Document so everybody can read it.
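
The inheritance described above can be sketched as follows. This is a minimal illustration only: the class name, the role names, and the "everybody" wildcard are all assumptions, not existing Lenya code:

```python
# Hypothetical platform-wide defaults, used when no ancestor sets a rule.
DEFAULT_ACL = {"read": {"everybody"}, "edit": {"editor"}}


class SecuredResource:
    def __init__(self, name, parent=None, acl=None):
        self.name = name
        self.parent = parent
        # Only the actions this Resource overrides; the rest is inherited.
        self.acl = dict(acl or {})

    def allowed_roles(self, action):
        # Walk up the tree until some Resource configures this action.
        node = self
        while node is not None:
            if action in node.acl:
                return node.acl[action]
            node = node.parent
        return DEFAULT_ACL.get(action, set())

    def can(self, role, action):
        roles = self.allowed_roles(action)
        return role in roles or "everybody" in roles


content = SecuredResource("content")
finance = SecuredResource(
    "finance",
    parent=content,
    acl={"read": {"accounting", "senior management"}, "edit": {"accounting"}},
)
report = SecuredResource("report", parent=finance)

before = report.can("janitor", "read")   # inherited from "finance"
report.acl["read"] = {"everybody"}       # what publishing (Workflow) would do
after = report.can("janitor", "read")
```

Here the report inherits its restrictions from its parent until publishing overrides the read list on the report itself.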

> > When creating a Page, Lenya can aggregate introduction text from one
> > Document, a list of products from other Documents, navigation based on
> > the Publication, and other presentation information.
> Coming back to the "printed paper industry" (the main user of a CMS!) a
> paper is *one* piece of a document and not the other way around. The
> above said *only* makes sense in html. You do not need navigation e.g.
> in POD or PDF or ....

Yes, most of my ideas are assuming HTML.
Yes, in the print industry, Pages are sections of a Document.
And in one of my posts, I suggested a Module that returns a section of
a Document.

Documents (as XML) are units of data.  Pages are units of display. 
Whether a Page is built from zero, one, or more Documents depends on
the requirements.  Whether a Page uses the entire Document depends on
the requirements.  The "menu" and "rss" Modules use a very small
portion of many Resources to create one Document.  The "page" Module
returns a section of a Document.  The "live" Module aggregates them
and returns an HTML Page.

> The above re-written:
> "When creating a content aggregation, Lenya can use different content
> nuggets like introduction text, a list of products, navigation based on
> the Publication, and other presentation information."

Andreas and I have both said we do not care if the term is
"ContentItem" or "Resource".  Only you have commented on
"ContentNugget".  I really do not care, except...

In one of today's posts, I implied that I do not want it named
"Content" because I want the class to be usable for development
Resources, such as CSS, XMAPs, XSLTs, and XSPs.  The Security layer
will be very important to restrict access to the developers.  The
Translation layer is easily disabled (just have only one Translation
and set it as the default), but might be useful for CSS and sometimes
XSLTs (different XSL used for LTR and RTL languages).  The Revision
layer would be useful for anything editable.  (Would it be good to be
able to rollback an XMAP?)

> > A Document is a
> > possible input for creating a Page.  A Page can be created from many
> > Resources, including Documents, Assets, and Modules.  (I have a few
> > "You cannot do that" Pages that are just HTML files piped to the
> > visitor.  Responding to the request does not use any Documents or
> > special processing.)
> Well that is exactly what the forrest dispatcher does but for multiple
> formats (<forrest:view type="css" hooksXpath="/"/> and <forrest:view
> type="html" hooksXpath="/html/body">)
> [Forrest code]

I did not understand the Forrest code, but I will accept that Forrest
applies similar functions to CSS as Lenya applies to XML.  Lenya could
merge CSS Elements from many Documents into one Document, and
transform the CSS tag into proper format for HTML.  I am not certain
why that is important since every graphical web browser accepts
multiple CSS inputs.

> > > Why "a unit of information formatted as XML" can contain other "units of
> > > information formatted as XML" (documents can contain other documents,
> > > or)?
> > I am not certain where this line appeared, but
> > - when a unit of information formatted as XML contains other units of
> > information formatted as XML, the result is a unit of information
> > formatted as XML.
> > or phrase it:
> > - when a Document contains other Documents, the result is a Document.
> >
> > For example, a NavigationModule would aggregate many Documents, filter
> > the data, and serialize as XML.  The Live Module handles the result
> > from a NavigationModule just like a Document retrieved from the
> > datastore.  The aggregation of a Document from the datastore with the
> > results of several NavigationModules creates a new Document.  That is
> > passed to a Transformer to create an XHTML Document, which is passed
> > to a Serializer to create the HTML Page (which may not be valid XML
> > and so should not be called a Document) which is returned to the
> > visitor.
>
> Wow, everything said above about NavigationModule is, for me, just a contract
> of the dispatcher and does not have to be in a module.

"Dispatcher" is a Forrest term, not a Lenya term.  In Lenya 1.2, the
Navigation framework handles most multiple (entire Publication)
Resource inputs to create menus and such.  In Lenya 1.4, the
Navigation framework should be moved to Modules.  As I said earlier in
this post, I thought much of Forrest could be Lenya Modules.  What
does the "Dispatcher" do, and how will it add value to Lenya?

> > For programming purposes, "Document" always refers to an "XML
> > Document".  Some are stored in Content.  Others are created by
> > Modules, and only exist in memory.  But all can be handled with the
> > same functions.
> IMO you are limiting the naming to certain formats and resulting
> processes, which IMO is not suitable for a multi-format environment/CMS.
> Further, I see neither the need nor the benefits, and would like to see
> consistent naming for users and devs.
> "Lenya never stated to be *just* a web CMS focusing on editing webpages
> (html)." ;-)

I have not noticed the Lenya front-end client.  There have been
mentions of using WebDAV clients with Lenya (usually because there
were problems.)

I am uncertain how my suggested terminology limits Lenya when used in
a "multi-format environment".  Information in any XML format (XHTML,
SVG) is called a "Document", can be edited, and can be manipulated by
XSL.  Information not in XML format is called an "Asset", is not
editable within Lenya, and can be uploaded and downloaded.  Both are
"Resources" and have Security, Translations, and Revisions: the
primary functions of a CMS.
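
The Document/Asset split suggested here could be decided mechanically, for example by testing whether the uploaded bytes parse as XML. A minimal sketch, assuming well-formedness is the only criterion (the function name is hypothetical, and a real system would probably also consult a DTD or schema):

```python
import xml.etree.ElementTree as ET


def classify(data: bytes) -> str:
    """Return "Document" for well-formed XML, "Asset" for anything else."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Not parseable as XML: upload/download only, not editable in Lenya.
        return "Asset"
    return "Document"
```

Under this rule an SVG drawing is a Document, while a GIF or a PDF is an Asset.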

Accepting that Lenya will be used to store content in many formats,
how do you propose to add value to non-XML data besides storage and
retrieval?

A customer of mine uses Lenya to store PDFs.  They complain accessing
the information in those PDFs loses the Website's navigation, they
cannot provide links to anchors in the PDFs, and (because of the
really limited software they use to create PDFs) they cannot add links
in the PDFs.  We could provide a PDF editor within Lenya that would
solve most of the issues, but there is no good reason why the
information in those PDFs is not stored in Lenya's standard "xhtml"
Documents.

How will Lenya add value to MSWord DOC files?  MSExcel XLS files? 
MSPowerPoint PPTs? GIFs? JPEGs? PNGs? MPEGs? AVIs? WMVs?

Can any of these be edited in Lenya?  Or are they just Assets, with
upload, download, Security, Translations, and Revisions?

solprovider

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lenya.apache.org
For additional commands, e-mail: dev-help@lenya.apache.org

Re: [RFC] Terminology

Posted by Thorsten Scherler <th...@apache.org>.

On Wed, 2006-02-08 at 13:12 -0500, solprovider@apache.org wrote:
> On 2/8/06, Thorsten Scherler <th...@wyona.com> wrote:
...
> > I think I should just leave it like this, but:
> >
> > page -> why is that only HTML?
> 
> Websites use HTML.  It is the primary format for web browsing.  Most
> pipelines finish with <map:serialize type="html">.  It is so common
> the type="html" is not necessary because it is the default.  I
> sometimes serialize as XML for testing, but I do not want visitors to
> see it.

Yeah, for Lenya ATM you may be right, but look at something like
Forrest, where this is not true. Very soon I will start a
Forrest-compatible Lenya pub where I will drop this focus.

Lenya never stated to be *just* a web CMS focusing on editing webpages
(html). That may be your assumption from knowing 1.2, but making Lenya
Forrest-compatible will give you a collection of new outputs out of the
box. Some are already available in Lenya, like OpenOffice and basic
OpenFormat support. The idea is to use Lenya to manage any kind of
content, not only web-based content.

> The "rss" Module would serialize as XML, but RSS uses "Feeds"
> containing "Channels" containing "Items", as well as referring to the
> response as a "Document" using the pure XML definition.

Here you see that some formats refer to the response to a request as a
document (be it the "pure XML definition" or not). Not only RSS but
also, e.g., OpenOffice, POD and PDF refer to the response to a request
as a "Document". You can even say "the HTML document".

> > What is the relationship between document and pages?
> 
> Documents are internal units of XML data.  

See above: why limit documents to XML?

> They are separated by
> purpose, type, and security.
> 
> Purpose: one Document contains the list of Contributors, another how
> to install Lenya.  Different purpose, different document.

Hmm, separation by purpose normally brings a different "type" with it.
In Forrest, for example, document-v20.dtd is for "normal" docs and
faq-v12.dtd for FAQ documents. I think the type determines the purpose.

> 
> Type: Most documents are XHTML, basic word processed content.  A
> "product" document contains different (and more rigid) fields. 
> Different DTD, different document.
> 
> Security: One document can be edited by any editor.  "Product"
> Documents can only be edited by the inventory maintainer.  Different
> security requirements, different document.
> 

I agree on type, but security is provided by the overall application
controller (Lenya). The documents should only be linked in the security
component and should not know about security themselves.

> When creating a Page, Lenya can aggregate introduction text from one
> Document, a list of products from other Documents, navigation based on
> the Publication, and other presentation information.  

Coming back to the "printed paper industry" (the main user of a CMS!): a
paper is *one* piece of a document, and not the other way around. What is
said above makes sense *only* in HTML. You do not need navigation in,
e.g., POD or PDF or ....

The above re-written:
"When creating a content aggregation, Lenya can use different content
nuggets like introduction text, a list of products, navigation based on
the Publication, and other presentation information."

> A Document is a
> possible input for creating a Page.  A Page can be created from many
> Resources, including Documents, Assets, and Modules.  (I have a few
> "You cannot do that" Pages that are just HTML files piped to the
> visitor.  Responding to the request does not use any Documents or
> special processing.)

Well, that is exactly what the Forrest dispatcher does, but for multiple
formats (<forrest:view type="css" hooksXpath="/"/> and <forrest:view
type="html" hooksXpath="/html/body">):

<forrest:views xmlns:forrest="http://apache.org/forrest/templates/1.0"
xmlns:jx="http://apache.org/cocoon/templates/jx/1.0">

  <!-- The following variables are used to contact data models and/or contracts. -->
  <jx:set var="getRequest" value="#{$cocoon/parameters/getRequest}"/>
  <jx:set var="getRequestExstension" value="#{$cocoon/parameters/getRequestExstension}"/>
  
  <!-- CSS View of the request e.g. index.dispatcher.css -->
  <forrest:view type="css" hooksXpath="/">
    <forrest:contract name="branding-theme-profiler"/>
  </forrest:view>
  
  <!-- HTML View of the request (e.g. index.html)-->
  <forrest:view type="html" hooksXpath="/html/body">
   <!-- 
        @type defines this structurer to html.
        @hooksXpath defines where all hooks will be injected (as prefix).
        -->
    <forrest:contract name="branding-css-links">
      <!-- More information around this contract
        http://marc.theaimsgroup.com/?l=forrest-dev&m=113473237805195&w=2
        -->
      <!-- Note: The forrest:properties element no longer exists (unlike in previous versions). -->
      <forrest:property name="branding-css-links-input">
        <css url="common.css" media="screen" rel="alternate stylesheet" theme="common"/>
        <css url="pelt.basic.css" media="screen" theme="Pelt"/>
        <css url="pelt.print.css" media="print"/>
        <css>/* Extra css */
    p.quote {
      margin-left: 2em;
      padding: .5em;
      background-color: #f0f0f0;
      font-family: monospace;
    }</css>
      </forrest:property>
    </forrest:contract>
    <forrest:hook name="container">
      <forrest:contract name="branding-breadcrumbs">
        <forrest:property name="branding-breadcrumbs">
          <trail>  
            <link1 name="Apache Forrest" href="http://forrest.apache.org/"/>
            <link2 name="Plugins" href="http://forrest.apache.org/docs/plugins/"/>
            <link3 name="org.apache.forrest.plugin.output.themer" href="http://forrest.apache.org/docs/plugins/org.apache.forrest.plugin.output.themer/"/>
          </trail>
        </forrest:property>
      </forrest:contract>
      <forrest:contract name="content-minitoc" dataURI="cocoon://#{$getRequest}.toc.xml">
        <forrest:property name="content-minitoc-conf" max-depth="2" min-sections="1" location="page"/>
      </forrest:contract>
      <forrest:contract name="content-main" dataURI="cocoon://#{$getRequest}.body.xml">
        <forrest:property name="content-main-conf">
          <headings type="underlined"/>
        </forrest:property>
      </forrest:contract>
    </forrest:hook>

  </forrest:view>
</forrest:views>

> 
> > Why can "a unit of information formatted as XML" contain other "units of
> > information formatted as XML" (documents can contain other documents,
> > right)?
> 
> I am not certain where this line appeared, but
> - when a unit of information formatted as XML contains other units of
> information formatted as XML, the result is a unit of information
> formatted as XML.
> 
> or phrase it:
> - when a Document contains other Documents, the result is a Document.
> 
> For example, a NavigationModule would aggregate many Documents, filter
> the data, and serialize as XML.  The Live Module handles the result
> from a NavigationModule just like a Document retrieved from the
> datastore.  The aggregation of a Document from the datastore with the
> results of several NavigationModules creates a new Document.  That is
> passed to a Transformer to create an XHTML Document, which is passed
> to a Serializer to create the HTML Page (which may not be valid XML
> and so should not be called a Document) which is returned to the
> visitor.

Wow, everything said above about NavigationModule is, for me, just a
contract of the dispatcher and does not have to be in a module.


> For programming purposes, "Document" always refers to an "XML
> Document".  

> Some are stored in Content.  Others are created by
> Modules, and only exist in memory.  But all can be handled with the
> same functions.

IMO you are limiting the naming to certain formats and the resulting
processes, which IMO is not suitable for a multi-format environment/CMS.
Further, I see neither the need nor the benefit, and would like to see
consistent naming for users and devs.

"Lenya never stated to be *just* a web CMS focusing on editing webpages
(html)." ;-)
> 
> solprovider

Thanks for helping to reach a consensus on this. :)

Regards
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Thorsten Scherler <th...@wyona.com> wrote:
> On Wed, 2006-02-08 at 12:01 -0500, solprovider@apache.org wrote:
> > On 2/8/06, Thorsten Scherler <th...@apache.org> wrote:
> > > On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> > > > Andreas Hartmann wrote:
> > > > That's why I favor explicitly having the word "content" in whatever
> > > > terminology we use to describe pieces of content. Thus my proposal a
> > > > while back (http://wiki.apache.org/lenya/ProposalContentModel), where
> > > > "ContentItem" is a piece of content (or ContentNugget, or whatever), and
> > > > a Document is a collection of such pieces.
> > > Ok, for me the above site is awesome and I agree as well on content item
> > > because item = nugget. Nugget is just shorter than "ContentItem", why not
> > > "Item". ;-) The only thing still hurting me is the document. You say "a
> > > Document is a collection of such pieces" so why not call it
> > > "ContentCollection" or "collection"?
> > That page tries to redefine Document as a Page, then gets really
> > confused when Documents (Pages) can contain multiple Documents (units
> > of Content).
> >
> > One more time: a "Document" is the internal term for "a unit of
> > information formatted as XML".  A "Page" is a "response to a request
> > by a visitor formatted as HTML."   Visitors never see "Documents".
>
> I think I should just leave it like this, but:
>
> page -> why is that only HTML?

Websites use HTML.  It is the primary format for web browsing.  Most
pipelines finish with <map:serialize type="html">.  It is so common
the type="html" is not necessary because it is the default.  I
sometimes serialize as XML for testing, but I do not want visitors to
see it.

The "rss" Module would serialize as XML, but RSS uses "Feeds"
containing "Channels" containing "Items", as well as referring to the
response as a "Document" using the pure XML definition.

> What is the relationship between document and pages?

Documents are internal units of XML data.  They are separated by
purpose, type, and security.

Purpose: one Document contains the list of Contributors, another how
to install Lenya.  Different purpose, different document.

Type: Most documents are XHTML, basic word processed content.  A
"product" document contains different (and more rigid) fields. 
Different DTD, different document.

Security: One document can be edited by any editor.  "Product"
Documents can only be edited by the inventory maintainer.  Different
security requirements, different document.

When creating a Page, Lenya can aggregate introduction text from one
Document, a list of products from other Documents, navigation based on
the Publication, and other presentation information.  A Document is a
possible input for creating a Page.  A Page can be created from many
Resources, including Documents, Assets, and Modules.  (I have a few
"You cannot do that" Pages that are just HTML files piped to the
visitor.  Responding to the request does not use any Documents or
special processing.)

> Why can "a unit of information formatted as XML" contain other "units of
> information formatted as XML" (documents can contain other documents,
> right)?

I am not certain where this line appeared, but
- when a unit of information formatted as XML contains other units of
information formatted as XML, the result is a unit of information
formatted as XML.

or phrase it:
- when a Document contains other Documents, the result is a Document.

For example, a NavigationModule would aggregate many Documents, filter
the data, and serialize as XML.  The Live Module handles the result
from a NavigationModule just like a Document retrieved from the
datastore.  The aggregation of a Document from the datastore with the
results of several NavigationModules creates a new Document.  That is
passed to a Transformer to create an XHTML Document, which is passed
to a Serializer to create the HTML Page (which may not be valid XML
and so should not be called a Document) which is returned to the
visitor.
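
The chain described above can be sketched with Python's standard library. This is an illustration only, not Lenya or Cocoon code: all function names, element names, and sample data are hypothetical, and the `transform` function merely stands in for an XSL Transformer. The point it tries to show is that every stage up to the Serializer takes and returns XML (a Document), and only the Serializer produces the Page.

```python
import xml.etree.ElementTree as ET


def make_document(title, body):
    """Build a tiny stored Document (hypothetical element names)."""
    doc = ET.Element("document")
    ET.SubElement(doc, "title").text = title
    ET.SubElement(doc, "body").text = body
    return doc


def navigation_module(documents):
    """Hypothetical NavigationModule: aggregate Documents, keep only titles."""
    nav = ET.Element("navigation")
    for doc in documents:
        ET.SubElement(nav, "item").text = doc.findtext("title")
    return nav  # the result is again XML, so it is still a Document


def aggregate(*documents):
    """Aggregating Documents yields a new Document."""
    combined = ET.Element("aggregate")
    combined.extend(documents)
    return combined


def transform(aggregated):
    """Stand-in for an XSL Transformer: build an XHTML Document."""
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    menu = ET.SubElement(body, "ul")
    for item in aggregated.iter("item"):
        ET.SubElement(menu, "li").text = item.text
    ET.SubElement(body, "p").text = aggregated.findtext("document/body")
    return html


def serialize(xhtml):
    """Serializer: the result is a Page (a string), no longer a Document."""
    return ET.tostring(xhtml, encoding="unicode", method="html")


stored = [make_document("Home", "Welcome"), make_document("Products", "Our list")]
navigation = navigation_module(stored)
page = serialize(transform(aggregate(stored[0], navigation)))
print(page)
```

Because `navigation_module` returns the same kind of object as a stored Document, `aggregate` does not need to know which of its inputs came from the datastore and which were created in memory.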

For programming purposes, "Document" always refers to an "XML
Document".  Some are stored in Content.  Others are created by
Modules, and only exist in memory.  But all can be handled with the
same functions.

solprovider

Re: [RFC] Terminology

Posted by Thorsten Scherler <th...@wyona.com>.

On Wed, 2006-02-08 at 12:01 -0500, solprovider@apache.org wrote:
> On 2/8/06, Thorsten Scherler <th...@apache.org> wrote:
> > On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> > > Andreas Hartmann wrote:
> > > That's why I favor explicitly having the word "content" in whatever
> > > terminology we use to describe pieces of content. Thus my proposal a
> > > while back (http://wiki.apache.org/lenya/ProposalContentModel), where
> > > "ContentItem" is a piece of content (or ContentNugget, or whatever), and
> > > a Document is a collection of such pieces.
> > Ok, for me the above site is awesome and I agree as well on content item
> > because item = nugget. Nugget is just shorter than "ContentItem", why not
> > "Item". ;-) The only thing still hurting me is the document. You say "a
> > Document is a collection of such pieces" so why not call it
> > "ContentCollection" or "collection"?
> 
> That page tries to redefine Document as a Page, then gets really
> confused when Documents (Pages) can contain multiple Documents (units
> of Content).
> 
> One more time: a "Document" is the internal term for "a unit of
> information formatted as XML".  A "Page" is a "response to a request
> by a visitor formatted as HTML."   Visitors never see "Documents". 

I think I should just leave it like this, but:

page -> why is that only HTML?

What is the relationship between document and pages? 

Why can "a unit of information formatted as XML" contain other "units of
information formatted as XML" (documents can contain other documents,
right)?

Regards
-- 
Thorsten Scherler
COO Spain
Wyona Inc.  -  Open Source Content Management  -  Apache Lenya
http://www.wyona.com                   http://lenya.apache.org
thorsten.scherler@wyona.com                thorsten@apache.org


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Thorsten Scherler <th...@apache.org> wrote:
> On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> > Andreas Hartmann wrote:
> > That's why I favor explicitly having the word "content" in whatever
> > terminology we use to describe pieces of content. Thus my proposal a
> > while back (http://wiki.apache.org/lenya/ProposalContentModel), where
> > "ContentItem" is a piece of content (or ContentNugget, or whatever), and
> > a Document is a collection of such pieces.
> Ok, for me the above site is awesome and I agree as well on content item
> because item = nugget. Nugget is just shorter than "ContentItem", why not
> "Item". ;-) The only thing still hurting me is the document. You say "a
> Document is a collection of such pieces" so why not call it
> "ContentCollection" or "collection"?

That page tries to redefine Document as a Page, then gets really
confused when Documents (Pages) can contain multiple Documents (units
of Content).

One more time: a "Document" is the internal term for "a unit of
information formatted as XML".  A "Page" is a "response to a request
by a visitor formatted as HTML."   Visitors never see "Documents". 
Rewrite that Wiki page using "Page": half the text disappears and it
is much easier to understand.

solprovider

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Andreas Hartmann <an...@apache.org> wrote:
> solprovider@apache.org wrote:
> > 5. Lenya 1.2 defined "Usecase" as "special process separate from the
> > primary process".  General Usage defines "Usecase" as "Real world
> > actions required to complete a task".  There was a major discrepancy.
> > Lenya 1.4 changed the term to "Module".
> That's not quite correct, usecases and modules are not the same.
> A usecase is an action controlled by parameters, it can use forms
> etc. for user interaction. A module is a container for related
> resources. Modules can provide usecases.

1.2's Usecases provide a new entry point to a Publication.
1.4's Modules provide a new entry point to a Publication, plus a
directory and file naming scheme for the new functionality.

I used Usecases in 1.2 for adding most functionality that would not be
in the Authoring or Live Modules in 1.4.  It was cleaner than hacking
publication-sitemap.xmap all the time.  But I think weirdly.

> >   All graphics should be "Assets"
> I wouldn't introduce a common term for "all graphics". If it's an
> image, call it image, if it's a PDF, call it PDF ...

(I use "Image" for something graphical that is seen, and "Graphic" or
"Graphic File" for something that is created and manipulated.  My
brain is much too anal-ytical.  Most people use them interchangeably,
and prefer using Image for both of my definitions.)

I was not defining Asset as "all graphics".  I define "Asset" as any
content information that is not XML.  Images are one type of Asset.

> > Presentation configuration will be moved
> > inside Modules.  While none of the old definitions of "Resource" are
> > needed, the word is already used by Lenya, and should be used before
> > adding a new word.
> As I understood it, the term "resource" is rejected by most developers,
> because the meaning is too general. But I guess "resource" vs.
> "content item" will be a never-ending story unless we do a vote and
> all developers commit to the decision.

The General Usage definition of Resource is "something usable".  Lenya
1.2 defined "Resources" as "Anything usable that does not fit another
term."  With Modules getting the functional and configuration parts
out of the way, all that is left is Content.  The most usable units of
Content are Documents and Assets.  A generic term for both types of
data seems good for conversation and programming class naming. 
Obeying my Rules "Use an existing word if one exists" and "Use one word
instead of two", "Resource" is a good choice.  If anyone finds a
better word (within the Rules), I will not be upset.

> > ===
> > On 2/7/06, Thorsten Scherler <th...@apache.org> wrote:
> >> What both models of http://wiki.apache.org/lenya/GlossaryStructure have
> >> in common is that "content" is stored in a Publication. Now Andreas
> >> frequently uses the word "structure" to refer to the sitetree or, as
> >> solprovider calls it, the index of a publication. Seen from a
> >> different angle, a sitetree or index or structure is nothing other than
> >> the typical table of contents (TOC) of a (text) book.
> > 1. "TableOfContent" is three words where we have been using one
> > ("Sitetree").  It also implies a single structure.
> > 2. "TOC" is not immediately recognizable by people outside the printed
> > paper industry.
> I agree to these points, but I'd replace "sitetree" with "structure"
> or "index".

"Sitetree" is defined in Lenya 1.2 as "XML maintaining the
relationships of Documents in a hierarchy."
"Structure" is not defined in Lenya yet.  General Usage defines
"structure" as "something composed of multiple parts, or their
arrangement", which fits how you have been using it.  "Sitetree" is
the better (more specific) choice when referring to the XML
representation of the arrangement of Content in a Publication.
"Index" is the accepted term in many related technologies (computer
databases) for "a data structure external to the primary data storage
used to organize it."  Some platforms use "View" to mean "the
resulting data when looking at data sorted and filtered by an Index",
but Cocoon redefined "View" to mean a "Breakpoint".
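
As an illustration of "XML maintaining the relationships of Documents in a hierarchy", here is a minimal Python sketch that reads a sitetree-like file and lists the path of every node. The XML below is a made-up example, not the actual Lenya 1.2 sitetree schema: the element name `node`, the attribute `id`, and the function `paths` are all assumptions chosen for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sitetree; element and attribute names are illustrative only.
SITETREE = """
<site>
  <node id="index"/>
  <node id="products">
    <node id="widgets"/>
    <node id="gadgets"/>
  </node>
</site>
"""


def paths(element, prefix=""):
    """Yield the hierarchical path of every node below the given element."""
    for node in element.findall("node"):
        path = f"{prefix}/{node.get('id')}"
        yield path
        yield from paths(node, path)


tree_paths = list(paths(ET.fromstring(SITETREE)))
for path in tree_paths:
    print(path)
```

The Content itself is not touched here; the hierarchy lives entirely in the external XML, which is what would allow a Publication to have more than one such arrangement.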

I like keeping structure as a generic term.  Content has structure,
but so do a Document, a Publication, and even the Lenya Server.

> > 3. Andreas uses "Structure" to mean the one and only hierarchical
> > structure of information within a Publication.
> Not necessarily, I don't mind having multiple structures per publication,
> though the current API draft supports only a single one.
I am hoping to change that.

solprovider

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, J. Wolfgang Kaltz <ka...@interactivesystems.info> wrote:
> Thorsten Scherler wrote:
> > On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> > Ok, for me the above site is awesome and I agree as well on content item
> > because item = nugget. Nugget is just shorter than "ContentItem", why not
> > "Item". ;-)
> Because I think we should explicitly distinguish the items pertaining to
> content from the other things used in the CMS (dare I say "resources" ;)
> ): the CSS, XSLTs, maybe other things too (sitemaps ? plugins ?)

Agreed.

> > The only thing still hurting me is the document. You say "a
> > Document is a collection of such pieces" so why not call it
> > "ContentCollection" or "collection"?
> That's conceivable too, at least for our internal terminology, but we
> need something for external terminology in any case. "Document" IMO is
> more user-oriented, it is the information presented to the consumer. The
> consumer doesn't care that the document may actually be structured in
> several pieces. But the authoring user may need to be aware of that:
> which piece do I want to edit now ? With which tool (according to the
> type of the piece, different tools may be available) ? Am I authorized
> to edit this piece of information, or is this one reserved for the
> senior author (for example a start page, containing news which one
> person may edit, but also a company vision or something, which only the
> senior author may edit) ?
>
> So, my opinion: use "Document" to refer to that which is presented to
> the consumer of the site. Since we need a term for that internally too,
> might as well use the same one (though ContentCollection could be used
> here, too). Use "ContentItem" for the individual pieces used in a
> "Document". An end-user will never see that terminology, but an author
> may (if the document consists of several pieces)

Again, "Document" is an "XML Document", an internal unit of storage. 
A "Page" is "that which is presented to the consumer of the site."  Go
ask anybody (a parent, a child, the village idiot) what they see on a
website, and they will answer "a page".

When did someone start referring to Pages as Documents?  Lenya has
never displayed Documents; it has always used Documents as input to
create Pages.

solprovider

Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> 
> [...]
> 
> I mostly agree to that, but IMO changing an existing definition
> is OK if it helps conforming to points 0 and 1. As I understand it,
> the term "asset" is commonly used by other CMS (please correct me
> if I'm wrong).

in this case, re-defining "asset" might even be beneficial for lenya 1.2 
users, because the new paradigm that "everything is an asset" is a 
generalization of the old asset concept. users must understand this 
change anyway, regardless of name, and changing the meaning of the old 
term shows how the old concept is broadened to include XML documents as 
well. of course, it's all a matter of documentation.

-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736

Re: [RFC] Terminology

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.

Thorsten Scherler wrote:
> On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> 
>>>[...]
> Ok, for me the above site is awesome and I agree as well on content item
> because item = nugget. Nugget is just shorter than "ContentItem", why not
> "Item". ;-) 

Because I think we should explicitly distinguish the items pertaining to 
content from the other things used in the CMS (dare I say "resources" ;) 
): the CSS, XSLTs, maybe other things too (sitemaps? plugins?)

> The only thing still hurting me is the document. You say "a
> Document is a collection of such pieces" so why not call it
> "ContentCollection" or "collection"? 

That's conceivable too, at least for our internal terminology, but we 
need something for external terminology in any case. "Document" IMO is 
more user-oriented, it is the information presented to the consumer. The 
consumer doesn't care that the document may actually be structured in 
several pieces. But the authoring user may need to be aware of that: 
which piece do I want to edit now? With which tool (according to the
type of the piece, different tools may be available)? Am I authorized
to edit this piece of information, or is this one reserved for the
senior author (for example a start page, containing news which one
person may edit, but also a company vision or something, which only the
senior author may edit)?

So, my opinion: use "Document" to refer to that which is presented to 
the consumer of the site. Since we need a term for that internally too, 
might as well use the same one (though ContentCollection could be used 
here, too). Use "ContentItem" for the individual pieces used in a 
"Document". An end-user will never see that terminology, but an author 
may (if the document consists of several pieces)

--
Wolfgang

Re: [RFC] Terminology

Posted by Thorsten Scherler <th...@apache.org>.

On Wed, 2006-02-08 at 12:37 +0100, J. Wolfgang Kaltz wrote:
> Andreas Hartmann wrote:
> > solprovider@apache.org wrote:
> > 
> > [...]
> > 
> >> Presentation configuration will be moved
> >> inside Modules.  While none of the old definitions of "Resource" are
> >> needed, the word is already used by Lenya, and should be used before
> >> adding a new word.
> > 
> > 
> > As I understood it, the term "resource" is rejected by most developers,
> > because the meaning is too general. But I guess "resource" vs.
> > "content item" will be a never-ending story unless we do a vote and
> > all developers commit to the decision.
> 
> IMO we need to distinguish between pieces of content to be managed by
> the CMS (Lenya) and other "stuff", for which Lenya may also potentially
> provide interfaces: what about the XSL stylesheets, for example? They are
> not content and should not be seen by content maintainers. They are,
> however, resources of the CMS and, maybe, one day in the future, they
> will be editable within the CMS (administration area? a new "design
> area"?). But even on that day, they still won't be content.
> 
> That's why I favor explicitly having the word "content" in whatever 
> terminology we use to describe pieces of content. Thus my proposal a 
> while back (http://wiki.apache.org/lenya/ProposalContentModel), where 
> "ContentItem" is a piece of content (or ContentNugget, or whatever), and 
> a Document is a collection of such pieces.

Ok, for me the above site is awesome and I agree as well on content item
because item = nugget. Nugget is just shorter than "ContentItem", why not
"Item". ;-) The only thing still hurting me is the document. You say "a
Document is a collection of such pieces" so why not call it
"ContentCollection" or "collection"? 

> 
> I am not trying to say "I am right, why doesn't everybody agree" ;)
> But to be honest, I still haven't understood what is wrong with the 
> proposal above.

Hehe, yeah, I like the proposal.

Regards
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)


Re: [RFC] Terminology

Posted by "J. Wolfgang Kaltz" <ka...@interactivesystems.info>.

Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> 
> [...]
> 
>> Presentation configuration will be moved
>> inside Modules.  While none of the old definitions of "Resource" are
>> needed, the word is already used by Lenya, and should be used before
>> adding a new word.
> 
> 
> As I understood it, the term "resource" is rejected by most developers,
> because the meaning is too general. But I guess "resource" vs.
> "content item" will be a never-ending story unless we do a vote and
> all developers commit to the decision.

IMO we need to distinguish between pieces of content to be managed by
the CMS (Lenya) and other "stuff", for which Lenya may also potentially
provide interfaces: what about the XSL stylesheets, for example? They are
not content and should not be seen by content maintainers. They are,
however, resources of the CMS and, maybe, one day in the future, they
will be editable within the CMS (administration area? a new "design
area"?). But even on that day, they still won't be content.

That's why I favor explicitly having the word "content" in whatever 
terminology we use to describe pieces of content. Thus my proposal a 
while back (http://wiki.apache.org/lenya/ProposalContentModel), where 
"ContentItem" is a piece of content (or ContentNugget, or whatever), and 
a Document is a collection of such pieces.

I am not trying to say "I am right, why doesn't everybody agree" ;)
But to be honest, I still haven't understood what is wrong with the 
proposal above.

> (...)
>> 8. "Resource" is better than "ContentItem" because it contains fewer 
>> words.

-1

--
Wolfgang

Re: [RFC] Terminology

Posted by Josias Thoeny <jo...@wyona.com>.

On Wed, 2006-02-08 at 10:28 +0100, Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> 
> [...]
> 
> I mostly agree to that, but IMO changing an existing definition
> is OK if it helps conforming to points 0 and 1. As I understand it,
> the term "asset" is commonly used by other CMS (please correct me
> if I'm wrong).

Here is a link about the term "asset" in CMS:
http://www.contentmanager.net/magazine/article_200_media_asset_management.html

Basically we're trying to unify documents and assets (from Lenya 1.2),
right? I could also imagine calling everything "document" instead of
"asset".
For me it's more natural to call a JPEG a "document" than to call an
XHTML document an "asset". But it's a matter of personal preference...

Josias



Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:

[...]

I mostly agree with that, but IMO changing an existing definition
is OK if it helps to conform to points 0 and 1. As I understand it,
the term "asset" is commonly used by other CMSs (please correct me
if I'm wrong).


> 5. Lenya 1.2 defined "Usecase" as "special process separate from the
> primary process".  General Usage defines "Usecase" as "Real world
> actions required to complete a task".  There was a major discrepancy. 
> Lenya 1.4 changed the term to "Module".

That's not quite correct: usecases and modules are not the same.
A usecase is an action controlled by parameters; it can use forms
etc. for user interaction. A module is a container for related
resources. Modules can provide usecases.
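
To make the distinction concrete, here is a minimal Python sketch. It is
only an illustration: the class names, the "editor" module, the "rename"
usecase and the form.xsl entry are all made up for this example and do not
reflect the actual Lenya 1.4 API.

```python
# Hypothetical sketch: a usecase is a parameterised action,
# a module is a container that can provide usecases.

class Usecase:
    """An action controlled by parameters."""

    def __init__(self, name, action):
        self.name = name
        self._action = action  # callable that receives the parameters

    def execute(self, **parameters):
        return self._action(**parameters)


class Module:
    """A container for related resources; it may provide usecases."""

    def __init__(self, name):
        self.name = name
        self.resources = {}  # e.g. stylesheets, keyed by file name
        self.usecases = {}   # usecases provided by this module

    def provide(self, usecase):
        self.usecases[usecase.name] = usecase


def rename(old_name, new_name):
    # Stand-in for real work; only reports what it was asked to do.
    return f"renamed {old_name} to {new_name}"


editor = Module("editor")
editor.resources["form.xsl"] = "<xsl:stylesheet/>"
editor.provide(Usecase("rename", rename))
```

The module only holds things; the behaviour lives in the usecase it
provides, which is invoked with parameters such as
editor.usecases["rename"].execute(old_name="a", new_name="b").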


> Lenya 1.4 has not defined "Resource" yet,
> but should relate to one or more of Lenya 1.2's definitions.  Software
> plug-ins are being defined as "Modules".  All graphics should be
> "Assets"

I wouldn't introduce a common term for "all graphics". If it's an
image, call it image, if it's a PDF, call it PDF ...

> (Why allow static graphics?  Let everything be editable, and
> improve security so the security granted by requiring file system
> access is unnecessary.)

+1

> Presentation configuration will be moved
> inside Modules.  While none of the old definitions of "Resource" are
> needed, the word is already used by Lenya, and should be used before
> adding a new word.

As I understood it, the term "resource" is rejected by most developers,
because the meaning is too general. But I guess "resource" vs.
"content item" will be a never-ending story unless we do a vote and
all developers commit to the decision.


> 7. Lenya 1.2 defines "Publication" as "Content and related processing
> instructions" where "Content" is defined as "(XML) Documents and
> related Assets" where "Assets" is defined as "uploaded files".  
> People keep suggesting using "Site", "Website", "Book", and other
> terms for this concept.  General Usage defines "Publication" as "A
> specific issue of a public work including textual information".  The
> output of a Lenya 1.2 Publication is close enough to the General Usage
> definition that very few people have been confused by the term.  If it
> works, do not break it.

+1


> 8. "Resource" is better than "ContentItem" because it contains fewer words.

Yes, but it bears a greater danger of clashes with other meanings of
the term (see above). My personal priority list is

1. asset
2. resource
3. content item

but they're very close to each other.


> ===
> On 2/7/06, Thorsten Scherler <th...@apache.org> wrote:
>> What both models of http://wiki.apache.org/lenya/GlossaryStructure have
>> in common is that "content" is stored in a Publication. Now Andreas
>> frequently uses the word "structure" to refer to the sitetree or,
>> as solprovider calls it, the index of a publication. Seen from a
>> different angle, a sitetree or index or structure is nothing other than
>> the typical table of contents (TOC) of a (text) book.
> 
> 1. "TableOfContent" is three words where we have been using one
> ("Sitetree").  It also implies a single structure.
> 2. "TOC" is not immediately recognizable by people outside the printed
> paper industry.

I agree with these points, but I'd replace "sitetree" with "structure"
or "index".


> 3. Andreas uses "Structure" to mean the one and only hierarchical
> structure of information within a Publication.

Not necessarily, I don't mind having multiple structures per publication,
though the current API draft supports only a single one.


> I think more
> flexibility would be easy.
> 4. I use "Index" to refer to an easily configurable internal data
> structure used to allow multiple "structures", both flat and
> hierarchical.  Notice "internal".  The only people aware of Indexes
> would be:
> - Programmers of core that implement the feature.
> - Developers of Modules that concern multiple Resources.  ("Resource"
> being defined as Nodes within Content.)
> 
> To clarify, the bulk of understanding "Index" falls to the programmers
> of core implementing the feature. The "live" Module is only concerned
> with one Resource, so the developer does not need to understand
> "Index".  The "live" Module aggregates the result of the "menu"
> Module.  The "menu" Module creates a list of multiple Resources, so
> configures a hierarchical Index.  The developer of the "menu" Module
> only needs to understand that properly using XML to configure an
> "Index" allows using the following line to generate XML about multiple
> documents:
>    <map:generate type="sitetree" index="myNewIndex"/>
> The SitetreeGenerator does the work; the developer looks at the
> resulting XML to see if the Index configuration is correct, then
> continues writing the Module (probably adding XSL transformations.)
> 
> Indexes would also be used internally to translate between the ID (or
> URL) (/section/category/documentid) and the UNID (unique identifier of
> the Resource in the flat storage of Content) used to retrieve a
> Resource.  Again, use of the Index is transparent to everybody except
> the core programmers implementing the feature.

That sounds reasonable.

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


Re: [RFC] Terminology

Posted by so...@apache.org.

Rules for Determining Terminology

0. For all rules, priority is general usage, then previous versions of
the product, and finally related technology.

1. If a word exists with the required definition, use that word.

2. Do not change the existing definition of a term.  If functionality
is significantly altered, a new term is needed.

3. Use real words whenever possible.

4. Single-word names are better than multiple-word names.

5. Use terms that fit the current technology and its planned future.

6. Try not to introduce new terms.  More terms create a larger
glossary that must be understood, increasing the learning curve and
reducing the audience willing to invest in the product.

== Examples ==
(The numbering does not imply any connection to the Rules.)

1. Do not use "AngleText" for text separated into functional units
using angle brackets.  General Usage uses "XML".

2. General Usage defines "Document" as "a single unit of textual
information".  A Related Technology (XML) defines "Document" as "a
single unit of textual information conforming to the XML
specification".  A Previous Version (Lenya 1.2) defines "Document" as
"a single unit of textual information conforming to the XML
specification with specified security and workflow".  No other
definition is allowed.

3. General Usage defines "Sitemap" as a single-page directory of a
website.  A Related Technology (Cocoon) redefines it as XML processing
instructions for handling a request.  General Usage wins, so find
another name for XMAP files.  Other technologies use "program",
"subroutine", or "servlet" for "processing instructions for handling a
request", but none of those fit this technology.  I use "XMAP" for
this because everyone familiar with the technology immediately
understands.  A better term would be "route" or "router", but a new
term must originate with Cocoon; outside parties could not spread a
new term.

4. Lenya 1.2 defined "Asset" as an uploaded file, not editable and
barely usable by Lenya.  Do not change "Asset" to mean "Part of a
Document", or "Any Resource including XML".  We can fix the "barely
usable" part, but not in a way that significantly changes the
definition.

5. Lenya 1.2 defined "Usecase" as "special process separate from the
primary process".  General Usage defines "Usecase" as "Real world
actions required to complete a task".  There was a major discrepancy. 
Lenya 1.4 changed the term to "Module".  General Usage defines
"Module" as "A standardized component of a system designed for
ease-of-use and flexibility".   Lenya 1.4's usage conforms to that
definition.  (And I hope we remove the differentiation between Modules
and the primary processing path.)

6. No software uses the term "Nugget".  The phrase "I created a
Nugget" currently has no meaning.  "I created a Document" can be
understood by people who are barely computer literate.  "I created an
Asset" (meaning uploaded a file) is understood by users of Lenya 1.2. 
"I created a Resource" would be understood by most people.  Lenya 1.2
used "Resources" for a variety of additions: static graphics, storage
of "Assets", presentation configuration (CSS and i18n), and software
plug-ins (such as editors).  That was confusing to developers, and
created a messy datastore. Lenya 1.4 has not defined "Resource" yet,
but should relate to one or more of Lenya 1.2's definitions.  Software
plug-ins are being defined as "Modules".  All graphics should be
"Assets".  (Why allow static graphics?  Let everything be editable, and
improve security so the security granted by requiring file system
access is unnecessary.)  Presentation configuration will be moved
inside Modules.  While none of the old definitions of "Resource" are
needed, the word is already used by Lenya, and should be used before
adding a new word.

7. Lenya 1.2 defines "Publication" as "Content and related processing
instructions" where "Content" is defined as "(XML) Documents and
related Assets" where "Assets" is defined as "uploaded files".  
People keep suggesting using "Site", "Website", "Book", and other
terms for this concept.  General Usage defines "Publication" as "A
specific issue of a public work including textual information".  The
output of a Lenya 1.2 Publication is close enough to the General Usage
definition that very few people have been confused by the term.  If it
works, do not break it.

8. "Resource" is better than "ContentItem" because it contains fewer words.

===
On 2/7/06, Thorsten Scherler <th...@apache.org> wrote:
> What both models of http://wiki.apache.org/lenya/GlossaryStructure have
> in common is that "content" is stored in a Publication. Now Andreas
> frequently uses the word "structure" to refer to the sitetree or,
> as solprovider calls it, the index of a publication. Seen from a
> different angle, a sitetree or index or structure is nothing other than
> the typical table of contents (TOC) of a (text) book.

1. "TableOfContent" is three words where we have been using one
("Sitetree").  It also implies a single structure.
2. "TOC" is not immediately recognizable by people outside the printed
paper industry.
3. Andreas uses "Structure" to mean the one and only hierarchical
structure of information within a Publication.  I think more
flexibility would be easy.
4. I use "Index" to refer to an easily configurable internal data
structure used to allow multiple "structures", both flat and
hierarchical.  Notice "internal".  The only people aware of Indexes
would be:
- Programmers of core that implement the feature.
- Developers of Modules that concern multiple Resources.  ("Resource"
being defined as Nodes within Content.)

To clarify, the bulk of understanding "Index" falls to the programmers
of core implementing the feature. The "live" Module is only concerned
with one Resource, so the developer does not need to understand
"Index".  The "live" Module aggregates the result of the "menu"
Module.  The "menu" Module creates a list of multiple Resources, so
configures a hierarchical Index.  The developer of the "menu" Module
only needs to understand that properly using XML to configure an
"Index" allows using the following line to generate XML about multiple
documents:
   <map:generate type="sitetree" index="myNewIndex"/>
The SitetreeGenerator does the work; the developer looks at the
resulting XML to see if the Index configuration is correct, then
continues writing the Module (probably adding XSL transformations).

Indexes would also be used internally to translate between the ID (or
URL) (/section/category/documentid) and the UNID (unique identifier of
the Resource in the flat storage of Content) used to retrieve a
Resource.  Again, use of the Index is transparent to everybody except
the core programmers implementing the feature.
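
A minimal Python sketch of that lookup, purely as an illustration. The
class and function names (Index, ContentStore, resolve) and the sample
UNID "unid-0001" are assumptions made for this example, not part of any
Lenya API.

```python
# Hypothetical sketch: an Index maps hierarchical IDs (URL paths) to UNIDs,
# while the flat content store is keyed by UNID only.

class Index:
    """Maps paths such as /section/category/documentid to UNIDs."""

    def __init__(self, name):
        self.name = name
        self._path_to_unid = {}

    def add(self, path, unid):
        self._path_to_unid[path] = unid

    def unid_for(self, path):
        return self._path_to_unid[path]

    def path_for(self, unid):
        # Reverse lookup; a linear scan is enough for a sketch.
        for path, candidate in self._path_to_unid.items():
            if candidate == unid:
                return path
        raise KeyError(unid)


class ContentStore:
    """Flat storage: no hierarchy, Resources are retrieved by UNID."""

    def __init__(self):
        self._resources = {}

    def put(self, unid, resource):
        self._resources[unid] = resource

    def get(self, unid):
        return self._resources[unid]


def resolve(index, store, path):
    """Translate a path to a UNID via the Index, then fetch the Resource."""
    return store.get(index.unid_for(path))


store = ContentStore()
store.put("unid-0001", "<html>About us</html>")

index = Index("myNewIndex")
index.add("/section/category/documentid", "unid-0001")
```

Several Index objects could point into the same ContentStore, which is how
multiple "structures" over one flat Content would work in this sketch.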

solprovider

Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Thorsten Scherler wrote:
> On Tue, 2006-02-07 at 14:24 +0100, Andreas Hartmann wrote:
>> solprovider@apache.org wrote:
>>> On 2/7/06, Andreas Hartmann <an...@apache.org> wrote:
>>>> Andreas Hartmann wrote:
>>>>> solprovider@apache.org wrote:
>>>>>> A Document is XML.  An Asset is not XML.
>>>>> IMO we should use the term Asset for all content items, whether XML
>>>>> or not. I don't see the need for differentiating between XML and non-XML
>>>>> assets.
>>>> All content-related issues would be the responsibility of the AssetType
>>>> (formerly known as ResourceType / DocumentType).
>>>> For instance:
>>>> [Many examples of using Assets that are XML-based.]
>>> I have a couple of thoughts/suggestions/objections.
>>>
>>> First, we need a superclass for objects placed in Content.  This
>>> superclass contains identifiers, security information, and
>>> Translations (Nodes for different languages).  Each Translation
>>> contains Revisions (Nodes for historical versions of each
>>> Translation).
>>>
>>> For the name of the superclass, we were discussing "ContentItem" or
>>> "Resource".
>> Yes, and neither of them seems to win.
>>
>>> Now you suggest "Asset" right after a discussion that
>>> wanted to define Assets as parts of a Document.
>> Yes.
> <snip stuff I agree/>
> 
> Now I can understand solprovider: if the term "asset" is used differently
> in 1.4 than in 1.2, that will cause confusion.
> 
> I recommend the term "nugget". A (content) nugget is an atomic part of
> information. These nuggets can be images, audio files, XML snippets of
> information, plain text, ... Content is an aggregation of (content)
> nuggets.

i'm fine with that. there is a need for new terms and concepts. either 
we spend a lot of time explaining how terms change, or we invent new 
ones and deprecate the others.

you are right, our "asset" has become a "liability" :-D

what i don't like about "nugget" is that it has no common meaning 
outside of lenya, so people can't infer the concept from what they 
already know. so i'd rather stick to asset and make it very clear in 
READMEs and all over the place how the meaning of asset has now been 
generalized (it's not a wholly different concept, just a broadening, so 
imho it's ok).


> As Andreas already pointed out, I do not see the need for the
> differentiation between non-XML and XML data, since a suitable editor
> may/is "just a module away". ;-) An SVG image, for example, can be both an
> editable XML file and a binary PNG (svg2png serializer).
> 
> Furthermore, for me a document is a presentational representation of a
> content aggregation for a certain format (Andreas uses the word "view"
> for this), and I sometimes have the feeling that when we talk about
> documents we are talking about the HTML representation.
> 
> What both models of http://wiki.apache.org/lenya/GlossaryStructure have
> in common is that "content" is stored in a publication. Now Andreas
> frequently uses the word "structure" to refer to the sitetree or,
> as solprovider calls it, the index of a publication. Seen from a
> different angle, a sitetree or index or structure is nothing other than
> the typical table of contents (TOC) of a (text) book.
> 
> A TOC normally lists the chapters of the book, each chapter can have
> subchapters, ... In chapters you can use arbitrary (content) nuggets
> (illustrations, analysis, ...).
> 
> IMO, if we have now reached the point of questioning our naming, why not
> ask whether publication is the right name?
> 
> I see it like:
> Publication = book
> sitetree = toc
> document = chapter
> asset = (content) nugget

book does not cut it. it's very uncommon in the context of web 
publishing. chapter collides with "page" (in printing, a chapter 
consists of one or more pages, in web publishing, a page is more like a 
"chapter").

if these need changing at all, i'd vote for

site
site tree
page (taking the place of andreas' "document")
asset




-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736

Re: [RFC] Terminology

Posted by Thorsten Scherler <th...@apache.org>.

On Tue, 2006-02-07 at 14:24 +0100, Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> > On 2/7/06, Andreas Hartmann <an...@apache.org> wrote:
> >> Andreas Hartmann wrote:
> >>> solprovider@apache.org wrote:
> >>>> A Document is XML.  An Asset is not XML.
> >>> IMO we should use the term Asset for all content items, whether XML
> >>> or not. I don't see the need for differentiating between XML and non-XML
> >>> assets.
> >> All content-related issues would be the responsibility of the AssetType
> >> (formerly known as ResourceType / DocumentType).
> >> For instance:
> >> [Many examples of using Assets that are XML-based.]
> > 
> > I have a couple of thoughts/suggestions/objections.
> > 
> > First, we need a superclass for objects placed in Content.  This
> > superclass contains identifiers, security information, and
> > Translations (Nodes for different languages).  Each Translation
> > contains Revisions (Nodes for historical versions of each
> > Translation).
> > 
> > For the name of the superclass, we were discussing "ContentItem" or
> > "Resource".
> 
> Yes, and neither of them seems to win.
> 
> > Now you suggest "Asset" right after a discussion that
> > wanted to define Assets as parts of a Document.
> 
> Yes.
<snip stuff I agree/>

Now I can understand solprovider: if the term "asset" is used differently
in 1.4 than in 1.2, that will cause confusion.

I recommend the term "nugget". A (content) nugget is an atomic part of
information. These nuggets can be images, audio files, XML snippets of
information, plain text, ... Content is an aggregation of (content)
nuggets.

As Andreas already pointed out, I do not see the need for the
differentiation between non-XML and XML data, since a suitable editor
may/is "just a module away". ;-) An SVG image, for example, can be both an
editable XML file and a binary PNG (svg2png serializer).

Furthermore, for me a document is a presentational representation of a
content aggregation for a certain format (Andreas uses the word "view"
for this), and I sometimes have the feeling that when we talk about
documents we are talking about the HTML representation.

What both models of http://wiki.apache.org/lenya/GlossaryStructure have
in common is that "content" is stored in a publication. Now Andreas
frequently uses the word "structure" to refer to the sitetree or,
as solprovider calls it, the index of a publication. Seen from a
different angle, a sitetree or index or structure is nothing other than
the typical table of contents (TOC) of a (text) book.

A TOC normally lists the chapters of the book, each chapter can have
subchapters, ... In chapters you can use arbitrary (content) nuggets
(illustrations, analysis, ...).

IMO, if we have now reached the point of questioning our naming, why not
ask whether publication is the right name?

I see it like:
Publication = book
sitetree = toc
document = chapter
asset = (content) nugget

WDYT?

salu2
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:
> On 2/7/06, Andreas Hartmann <an...@apache.org> wrote:
>> Andreas Hartmann wrote:
>>> solprovider@apache.org wrote:
>>>> A Document is XML.  An Asset is not XML.
>>> IMO we should use the term Asset for all content items, whether XML
>>> or not. I don't see the need for differentiating between XML and non-XML
>>> assets.
>> All content-related issues would be the responsibility of the AssetType
>> (formerly known as ResourceType / DocumentType).
>> For instance:
>> [Many examples of using Assets that are XML-based.]
> 
> I have a couple of thoughts/suggestions/objections.
> 
> First, we need a superclass for objects placed in Content.  This
> superclass contains identifiers, security information, and
> Translations (Nodes for different languages).  Each Translation
> contains Revisions (Nodes for historical versions of each
> Translation).
> 
> For the name of the superclass, we were discussing "ContentItem" or
> "Resource".

Yes, and neither of them seems to win.

> Now you suggest "Asset" right after a discussion that
> wanted to define Assets as parts of a Document.

Yes.

> Lenya 1.2 defines an
> Asset as an uploaded file.  Why would we create a new definition of
> "Asset"?  Are we trying to confuse everybody?

I don't mind re-defining the meaning of the term Asset.


> Second, we need to separate XML and non-XML.  Storage could use
> <resource type="XML"> and <resource type="non-XML">, but programming
> requires different class names for each.

Does it? I'm not quite convinced of that ... The class could just provide
input and output streams to store arbitrary content. Which content
can be stored would be determined by the (asset/resource/...) type.
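
A rough Python sketch of what I mean, just to illustrate the idea. All
names here (Asset, AssetType, store, open_input, the two type checks) are
invented for this example; nothing of this exists in the current API draft.

```python
import io
import xml.etree.ElementTree as ET


def is_well_formed_xml(data):
    """Return True if the bytes parse as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


def is_png(data):
    """Return True if the bytes start with the PNG file signature."""
    return data.startswith(b"\x89PNG\r\n\x1a\n")


class AssetType:
    """Hypothetical type object: decides which content may be stored."""

    def __init__(self, name, accepts):
        self.name = name
        self.accepts = accepts  # callable: bytes -> bool


class Asset:
    """A single storage class for XML and non-XML content alike."""

    def __init__(self, asset_type):
        self.asset_type = asset_type
        self._data = b""

    def store(self, stream):
        """Read all bytes from the stream; the type decides validity."""
        data = stream.read()
        if not self.asset_type.accepts(data):
            raise ValueError(f"content rejected by type {self.asset_type.name}")
        self._data = data

    def open_input(self):
        """Return a readable stream over the stored bytes."""
        return io.BytesIO(self._data)


xhtml = Asset(AssetType("xhtml", is_well_formed_xml))
xhtml.store(io.BytesIO(b"<p>hello</p>"))

image = Asset(AssetType("png", is_png))
image.store(io.BytesIO(b"\x89PNG\r\n\x1a\n" + b"fake image data"))
```

There is no Document subclass in this sketch; the only XML-specific code
sits in the check that belongs to the type.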


> XML data can be aggregated, transformed, and serialized.  "Document"
> is well-defined in XML and Lenya 1.2 as an XML Document.  Why would we
> change that?

We wouldn't. We just don't need a Document class.

> Documents then have Types based on the DTD (and
> possibly a PI or other XML).  In Lenya 1.2, the doctype parameter
> decided the Type.  In 1.4, every Document should know its Type, but
> can be overridden (ignored) by Modules.  XML data can be edited using
> the CMS GUI.
> 
> For Non-XML data (Assets), there will be different functions.  Non-XML
> data needs to be downloadable (serialized as binary?).  It cannot be
> transformed.  It might be converted to Base64 for use as XML, but
> would use the BinaryToBase64 Module, which is not part of core. 
> Assets can be uploaded; they cannot be edited using the CMS GUI.

Why? You could use a text editor, or an HTML (non-XHTML) editor.
Or even an editor to manipulate images.


> Third, there are several options for the datastore:
> 1. Content/Resource[Type="XML|Not XML"]
> My experience suggests using Elements will prove much easier (less
> code) than using an Attribute for something so basic.  It is the
> difference between:
> <xsl:apply-templates select="resource[type='xml']"/>
> and
> <xsl:apply-templates select="document"/>
> It is one extra test (repeated extremely often) that can be avoided by
> better design.

The XML representation is not directly related to whether there
are subclasses for XML-based and non-XML-based assets/resources/...


> 2. Content/Document and Content/Asset
> It should be easy to get all the Nodes of one Type.

I don't consider "xml/non-xml" as the primary differentiation.
I'd differentiate assets by asset types.


> 3. Content/Documents/Document and Content/Assets/Asset

I'm not quite sure what you mean here ... I would avoid this in
any case.

> This makes it extremely easy to get all the Nodes of one Type, but
> makes functions where the ContentType (or ResourceType) is irrelevant
> much more difficult.  If you want a list of the latest updates, you
> need to search Documents and Assets separately and then merge the
> lists and resort the result.  A simple search for Resources that have
> a new Revision ready but unpublished becomes difficult.  And it adds a
> Node that exists only to categorize other Nodes; I cannot think of
> anything useful to add to these extra Nodes.
> 
> Fourth (repeating #2), regardless of how they are stored, programming
> requires separate classes because too much functionality is different.
>  Identification, security, language support, workflow, and relations
> can be handled by the superclass, but how they are used within Lenya
> and how they are sent to visitors is very different.

IMO this functionality would be the responsibility of the asset type.
The asset class itself is just a storage item.


-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/7/06, Andreas Hartmann <an...@apache.org> wrote:
> Andreas Hartmann wrote:
> > solprovider@apache.org wrote:
> >> A Document is XML.  An Asset is not XML.
> > IMO we should use the term Asset for all content items, whether XML
> > or not. I don't see the need for differentiating between XML and non-XML
> > assets.
> All content-related issues would be the responsibility of the AssetType
> (formerly known as ResourceType / DocumentType).
> For instance:
> [Many examples of using Assets that are XML-based.]

I have a couple of thoughts/suggestions/objections.

First, we need a superclass for objects placed in Content.  This
superclass contains identifiers, security information, and
Translations (Nodes for different languages).  Each Translation
contains Revisions (Nodes for historical versions of each
Translation).

For the name of the superclass, we were discussing "ContentItem" or
"Resource".  Now you suggest "Asset" right after a discussion that
wanted to define Assets as parts of a Document.  Lenya 1.2 defines an
Asset as an uploaded file.  Why would we create a new definition of
"Asset"?  Are we trying to confuse everybody?

Second, we need to separate XML and non-XML.  Storage could use
<resource type="XML"> and <resource type="non-XML">, but programming
requires different class names for each.

XML data can be aggregated, transformed, and serialized.  "Document"
is well-defined in XML and Lenya 1.2 as an XML Document.  Why would we
change that?  Documents then have Types based on the DTD, (and
possibly a PI or other XML.)  In Lenya 1.2, the doctype parameter
decided the Type.  In 1.4, every Document should know its Type, but
can be overridden (ignored) by Modules.  XML data can be edited using
the CMS GUI.

For Non-XML data (Assets), there will be different functions.  Non-XML
data needs to be downloadable (serialized as binary?).  It cannot be
transformed.  It might be converted to Base64 for use as XML, but
would use the BinaryToBase64 Module, which is not part of core. 
Assets can be uploaded; they cannot be edited using the CMS GUI.
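
A minimal sketch of the hierarchy described above, written in Python for brevity rather than in Lenya's Java. Every class, field, and method name here is hypothetical and only illustrates the split: the superclass carries identity, security, Translations, and Revisions, while the XML and non-XML subclasses expose different operations.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Revision:
    """One historical version of a Translation."""
    number: int
    data: bytes


@dataclass
class Translation:
    """One language version; holds its own Revision history."""
    language: str
    revisions: list[Revision] = field(default_factory=list)

    def latest(self) -> Revision:
        return self.revisions[-1]


@dataclass
class Resource:
    """Hypothetical superclass: identifier, security info, Translations."""
    resource_id: str
    allowed_roles: set[str] = field(default_factory=set)
    translations: dict[str, Translation] = field(default_factory=dict)

    def add_revision(self, language: str, data: bytes) -> Revision:
        # Create the Translation on first use, then append a new Revision.
        translation = self.translations.setdefault(language, Translation(language))
        revision = Revision(len(translation.revisions) + 1, data)
        translation.revisions.append(revision)
        return revision


class Document(Resource):
    """XML content: can be parsed, and therefore transformed or edited."""

    def parse(self, language: str) -> ET.Element:
        return ET.fromstring(self.translations[language].latest().data)


class Asset(Resource):
    """Non-XML content: handed out as stored bytes, never transformed."""

    def download(self, language: str) -> bytes:
        return self.translations[language].latest().data
```

In this sketch, versioning and language handling live only in Resource; Document and Asset differ only in what can be done with the stored bytes.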

Third, there are several options for the datastore:
1. Content/Resource[Type="XML|Not XML"]
My experience suggests using Elements will prove much easier (less
code) than using an Attribute for something so basic.  It is the
difference between:
<xsl:apply-templates select="resource[@type='xml']"/>
and
<xsl:apply-templates select="document"/>
It is one extra test (repeated extremely often) that can be avoided by
better design.

2. Content/Document and Content/Asset
It should be easy to get all the Nodes of one Type.

3. Content/Documents/Document and Content/Assets/Asset
This makes it extremely easy to get all the Nodes of one Type, but
makes functions where the ContentType (or ResourceType) is irrelevant
much more difficult.  If you want a list of the latest updates, you
need to search Documents and Assets separately and then merge the
lists and resort the result.  A simple search for Resources that have
a new Revision ready but unpublished becomes difficult.  And it adds a
Node that exists only to categorize other Nodes; I cannot think of
anything useful to add to these extra Nodes.
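
The three layouts can be compared with a small sketch using Python's standard ElementTree module. The element names, attribute names, ids, and dates below are invented for illustration and are not a proposed Lenya storage format.

```python
import xml.etree.ElementTree as ET

# Option 1: one element name, type held in an attribute.
OPTION_1 = ET.fromstring("""<content>
  <resource type="xml" id="news" modified="2006-02-07"/>
  <resource type="non-xml" id="logo" modified="2006-02-08"/>
</content>""")

# Option 2: the element name itself carries the type.
OPTION_2 = ET.fromstring("""<content>
  <document id="news" modified="2006-02-07"/>
  <asset id="logo" modified="2006-02-08"/>
</content>""")

# Option 3: an extra grouping node per type.
OPTION_3 = ET.fromstring("""<content>
  <documents><document id="news" modified="2006-02-07"/></documents>
  <assets><asset id="logo" modified="2006-02-08"/></assets>
</content>""")


def ids(nodes):
    """Return the id attribute of each node, in order."""
    return [node.get("id") for node in nodes]


def latest_first(nodes):
    """Ids of nodes of any type, sorted by modification date, newest first."""
    return ids(sorted(nodes, key=lambda n: n.get("modified"), reverse=True))


# Selecting one type: option 1 needs a predicate, options 2 and 3 do not.
xml_only_1 = ids(OPTION_1.findall("resource[@type='xml']"))
xml_only_2 = ids(OPTION_2.findall("document"))
xml_only_3 = ids(OPTION_3.findall("documents/document"))

# Type-independent listing: option 2 takes the children directly,
# option 3 has to collect from both grouping nodes before sorting.
updates_2 = latest_first(list(OPTION_2))
updates_3 = latest_first(
    OPTION_3.findall("documents/document") + OPTION_3.findall("assets/asset")
)
```

All three selections return the same Document, and both listings return the same order; the difference is only in how much work each query takes.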

Fourth (repeating #2), regardless of how they are stored, programming
requires separate classes because too much functionality is different.
 Identification, security, language support, workflow, and relations
can be handled by the superclass, but how they are used within Lenya
and how they are sent to visitors is very different.  Why store them
in a structure that requires extra programming throughout Lenya?

solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Doug Chestnut <dh...@virginia.edu>.


Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> 
> [...]
> 
>> A Document is XML.  An Asset is not XML.
> 
> 
> IMO we should use the term Asset for all content items, whether XML
> or not. I don't see the need for differentiating between XML and non-XML
> assets.
+1

> 
> -- Andreas
> 

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Andreas Hartmann wrote:
> solprovider@apache.org wrote:
> 
> [...]
> 
>> A Document is XML.  An Asset is not XML.
> 
> IMO we should use the term Asset for all content items, whether XML
> or not. I don't see the need for differentiating between XML and non-XML
> assets.

All content-related issues would be the responsibility of the AssetType
(formerly known as ResourceType / DocumentType).

For instance:

----

AssetType "xhtml":

* format "xhtml"

   <xhtml:html> ... </xhtml:html>

* format "include/xhtml"

   <xhtml:div> ... [everything from /html/body/*] ... </xhtml:div>

* format "link/xhtml"

   <xhtml:a href="...">The Title</xhtml:a>

----

AssetType "image":

* format "include/xhtml"

   <xhtml:img src="..." alt="..." />

* format "include/svg"

   <svg:image x=".." y=".." xlink:href="..." />

* format "include/xslfo"

   ...

----

AssetType "pdf"

* format "include/xhtml", format "link/xhtml"

   <xhtml:a href="[...]/foo.pdf">The Title</xhtml:a>


etc.
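
One way to picture this is a lookup from AssetType and format to a small rendering function. The sketch below is in Python only for brevity; the registry, the function names, and the (url, title) signature are assumptions for illustration, not an existing Lenya API.

```python
from html import escape


def image_include_xhtml(url: str, title: str) -> str:
    """Embed an image in an XHTML page."""
    return f'<xhtml:img src="{escape(url)}" alt="{escape(title)}" />'


def pdf_link_xhtml(url: str, title: str) -> str:
    """A PDF cannot be embedded in XHTML, so it is rendered as a link."""
    return f'<xhtml:a href="{escape(url)}">{escape(title)}</xhtml:a>'


# Hypothetical registry: AssetType name -> format name -> renderer.
ASSET_TYPES = {
    "image": {"include/xhtml": image_include_xhtml},
    "pdf": {"include/xhtml": pdf_link_xhtml, "link/xhtml": pdf_link_xhtml},
}


def render(asset_type: str, fmt: str, url: str, title: str) -> str:
    """Render an asset in the requested format, or fail if unsupported."""
    try:
        renderer = ASSET_TYPES[asset_type][fmt]
    except KeyError:
        raise ValueError(f"AssetType {asset_type!r} has no format {fmt!r}") from None
    return renderer(url, title)
```

The caller asks only for a format; whether the content is XML or binary stays an internal matter of the AssetType.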

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:

[...]

> A Document is XML.  An Asset is not XML.

IMO we should use the term Asset for all content items, whether XML
or not. I don't see the need for differentiating between XML and non-XML
assets.

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by so...@apache.org.

I take a vacation in a hospital after a high-speed car crash and this
thread takes tangents I could not dream in my worst nightmare.

===
"Asset is a financial term."
Yes, accountants use the word to mean anything of positive value, as
opposed to "Liabilities" which are anything of negative value.

Lenya 1.2's GUI used the term for anything uploadable stored as a file.

I suggest using the term for any user-driven content that is not XML. 
That supersets the Lenya 1.2 usage in a way understandable by the
public.

===
"Assets as part of Pages."
"Pages contain one or more Documents."
These statements confuse storage and development needs with presentation.

A Document is XML.  An Asset is not XML.
- In Lenya 1.2, Assets were associated with a Document.  This causes
major headaches for users because they must use the docID in the URL
to make a link.
- I suggested making Assets children of Publication/Content, just like
Documents.  Then both Documents and Assets can have a parent class
that provides multiple languages (Translations) and versions
(Revisions).  The major difference is that Documents can be processed
by XSL statements (map:transform).  It is impossible to do XSL
Transformations on non-XML data.

A "Page" is the unit a visitor sees on the Website.  Pages are created
by Modules.  A Module can aggregate Documents and the results of other
Modules to create a Page.  The Page can include links to Assets and
other Pages.  Pages are dynamically generated by Modules.  Although it
is possible (and useful for testing) to have a Module that dumps a
Document to the visitor, no Website would use that Module in
production.  All decent Websites provide some navigation, preferably
without dead ends (one reason to avoid PDFs), and the minimum
production Module should add a "Home" or "Back" link to every Page.

The important point is that "Page" is not something storable in Lenya
(except in the non-functional cache where the HTML and other content
generated by Lenya is stored for performance.)

===
A PDF is not a Document, because a PDF is not XML.  Most PDFs would be
uploaded as Assets.

A Module may transform a Document (or multiple Documents and possibly
Assets) into a PDF.  The result could be saved in Lenya as an Asset,
but performance could be improved without creating another static
resource by saving the results in the non-functional cache.  The URL
of the PDF would be the URL of the PDF-creating Module and any
parameters it requires to create the specific PDF.

Implementation:
If a PDF is uploaded, then it is an Asset.
   Publication/Content/Asset[type="PDF"]

If the PDF is dynamically generated, then it is accessed through a Module:
   http://lenyaServer/myPub/PDFModule/someParameters
The parameter(s) could be:
- a single Document identifier that is transformed to PDF using XSL.
- several resources (Document and Assets) sequentially added to PDF.
- a key identifying a configuration document.
- anything else imaginable.

===
It seems likely the parent Node of Documents and Assets will be named
"Content", and the superclass for Documents and Assets will be named
"ContentItem" (although I still prefer "Resource" to reduce the word
count and remove the capital 'I'):
   Publication/Content/ContentItem[subclass="Document|Asset"]/Translation/Revision

solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Josias Thoeny wrote:

[...]

> FYI, this link shows some definitions of the term "asset":
> http://www.google.com/search?q=define:asset&defl=en
> 
> It seems to come from the financial world.
> 
> I don't want to start the discussion about the terms all over again,
> just thought it's interesting.

Most of the definitions refer to "an item of value", which is IMO
not the worst association we can generate for Lenya content items :)

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Josias Thoeny <jo...@wyona.com>.

On Mon, 2006-02-06 at 10:59 +0100, Andreas Hartmann wrote:
> Jörn Nettingsmeier wrote:
> > Andreas Hartmann wrote:
> >> Jörn Nettingsmeier wrote:
> >>
> >> [...]
> >>
> >>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
> >>>> really agree on what a Document is.  Jörn seems to be using Document
> >>>> as the aggregate/container term for both the editable text and the
> >>>> assets of a page.  But Solprovider seems to be suggesting the words
> >>>> Resource or Content for that purpose.
> >>>
> >>> now that you say it, you are right. i had read solprovider's remark 
> >>> the way i wanted it to read :(
> >>
> >> A little off topic, but I'd like to state my opinion once more:
> >> The concept of "the editable text and the assets of a page" should be
> >> dropped. Textual documents and (binary) assets should be equally handled
> >> items in a flat storage.
> > 
> > ++votes.
> > 
> > how do you like my terminology proposal in the wiki?
> > i'm suggesting to use "document" as the thing that is composed of 
> > assets, where assets are "text assets" (can be in different languages), 
> > and image or other media assets (with the same properties).
> 
> 
> Actually I'm not quite pleased with that.
> IMO "set of =>language versions and other =>assets" is a mixture of concepts.
> Assets consist of language versions (translations).
> 
> 
> How about this:
> 
> 
> asset:: An atomic piece of information, handled as a single unit by the API.
>          An asset consists of multiple translations (language versions).
> 
> document:: A dynamically assembled piece of information, based on an asset.
>             The document is generated by resolving references to other assets
>             and external resources. [1]
> 
> page:: The aggregation of 1..n documents + presentation.

FYI, this link shows some definitions of the term "asset":
http://www.google.com/search?q=define:asset&defl=en

It seems to come from the financial world.

I don't want to start the discussion about the terms all over again,
just thought it's interesting.

Josias

> 
> 
> [1] Using this definition, navigation widgets can be implemented as documents,
> based on an asset which references the resource which generates the navigation
> widget. This means, according to this definition, a page can contain navigation
> widgets as well.
> 
> 
> -- Andreas
> 


---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/9/06, Andreas Hartmann <an...@apache.org> wrote:
> Bob Harner wrote:
> OK, I'll give it another try. If you don't think this whole thing
> is rubbish, I'd like to ask you to just add your +-1 and use footnotes
> for comments, so that the structure stays intact.
> - A piece of information, regardless of its nature, which is handled
>    as a single unit by Lenya is called
>    * asset [+0.3]
>    * resource [+0.2]
>    * content item [+0.2]
>    * document [+0.3]
>    * ...
>    (I thought about it again, Josias has a point saying images can be
>    regarded as documents as well. IMO content item is too long, resource is
>    not specific enough.)
I'll try to stay out of this, as I have already made my opinions very
well known.

"A piece of information, regardless of its nature, which is handled as
a single unit by Lenya"
I vote for Resource as the parent class which is subclassed:
- Document (XML stored in Content and Modules)
- Asset (uploaded file stored in Content and Modules)
- Program (CSS, XMAP, XSLT, and XSP stored in Modules)

We have not discussed a name for "programming resource" yet, but it
should be unnecessary because each type will have its own
class/Module.  XMAP, XSLT, and XSP will inherit from Document since
they are XML.  CSS will need its own, but Thorsten suggested Forrest
has something usable.

'Nuff,
solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Bob Harner <bo...@gmail.com> wrote:
> It is understandably difficult to redesign what is a core part of
> Lenya and simultaneously redefine some of the terms being used.  Maybe
> we need to discuss those things separately, but that would be hard
> also.
>
> Do you all at least share the following goals?
> 1. Find a way to treat both the Lenya-managed xml text and binary
> files in a (mostly) unified way, so that:
>    a. both can have historical versions (revisions)
>    b. both can have a workflow
>    c. both can have language versions (translations)
>    d. relationships can be maintained between them (parent/child,
> embedding, linking)
>    e. both can be directly referenced from the navigation menus
Yes.  We are debating whether to call this class "Resource" or
"ContentItem", but I think we all agree with these specifications of
functionality.

> 2. Enable/simplify having one web page composed of multiple
> separately-managed content.
Lenya does that well already: <map:aggregate>.  "Modules" will extend
the flexibility.

> 3. Enable/simplify having multiple web pages for one unit of content
> (e.g. an article with 5 pages but managed by Lenya as a single unit)

This should not be difficult.  Either:
- Each section is a separate Document. That is the easiest method with
Lenya 1.2.
- A Module takes a "Section" parameter to return only a portion of a
Document.  Some tag needs to be specified as the division point
between Sections.
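
The second option could look roughly like the following sketch, in Python for brevity. The function name, the 1-based "section" parameter, and the choice of a <section> element as the division point are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def select_section(document_xml: str, section: int, tag: str = "section") -> str:
    """Return the 1-based `section`-th child element named `tag`, serialized."""
    root = ET.fromstring(document_xml)
    sections = root.findall(tag)
    if not 1 <= section <= len(sections):
        raise IndexError(
            f"document has {len(sections)} sections, asked for {section}"
        )
    return ET.tostring(sections[section - 1], encoding="unicode")


# One Document managed as a single unit, shown to visitors one Section at a time.
ARTICLE = "<article><section>One</section><section>Two</section></article>"
second_page = select_section(ARTICLE, 2)
```

The Document stays a single unit in storage; only the presentation layer decides how many Pages it becomes.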

> 4. Enable the (optional) use of multiple navigation trees (site trees).
I think Indexes should be built into the core as part of the migration
to using JCR.  I hope it can be done by adding the RelationsTable, the
normal "live" hierarchical Index, the SitetreeGenerator, and changing
a few Document functions to use the new technology.  Support for
automatically adding additional Indexes can be added later.  The whole
functionality could be pushed to the next release if it proves more
difficult.

> 5. Avoid making such radical changes to the code that a 1.4 release is
> overly delayed.
Agreed.

solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Bob Harner <bo...@gmail.com> wrote:
> Rather than voting on the names of individual parts separately, which
> may conflict, I would suggest as a next step that each interested
> person summarize his view of how the site, page, module, document,
> asset, and site tree all relate to each other, in just one or two
> paragraphs, similar to the way Andreas and SolProvider did at the
> beginning of this thread when they wrote "For me, the following text
> sounds quite good".  There are probably only two or three such
> versions.  Then you committers can vote on which paragraph sounds
> best.

There were already two Pages in the Wiki for terminology.  I posted my
"Glossary" on my site:
   http://solprovider.com/lenya/glossary
Some of the concepts are quite different from the current suggestions,
but the terms are all easily understood and holistically consistent. 
Some of them (see "RelationsTable") will require enough work that they
should be pushed to 1.5.

solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Bob Harner wrote:

[...]

> Do you all at least share the following goals?
> 
> 1. Find a way to treat both the Lenya-managed xml text and binary
> files in a (mostly) unified way, so that:
> 
>    a. both can have historical versions (revisions)
>    b. both can have a workflow
>    c. both can have language versions (translations)
>    d. relationships can be maintained between them (parent/child,
> embedding, linking)
>    e. both can be directly referenced from the navigation menus

+1

> 2. Enable/simplify having one web page composed of multiple
> separately-managed content.

+1

> 3. Enable/simplify having multiple web pages for one unit of content
> (e.g. an article with 5 pages but managed by Lenya as a single unit)

This is IMO just a view issue (pagination). As solprovider pointed out,
it should be handled by the presentation layer, for instance in a module.


> 4. Enable the (optional) use of multiple navigation trees (site trees).

That should be possible. It has to be discussed if the core should
support multiple indexes.


> 5. Avoid making such radical changes to the code that a 1.4 release is
> overly delayed.

I guess it will take some time to design a solid repository layer.
We could decide to defer that to 1.6 and get out 1.4 in more or less
its current state.

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/8/06, Bob Harner <bo...@gmail.com> wrote:
> On 2/8/06, Andreas Hartmann <an...@apache.org> wrote:
> > solprovider@apache.org wrote:
> >
> > > Are we trying to confuse and lose the entire customer base?
> >
> > It's so reassuring to finally have someone around
> > who cares for the project ... :)
> >
> >
> >  > Go read my other posts.
> >
> > I did, and I still have my own opinion. I see your points,
> > to some I agree, to others not. Anyway, I consider this
> > discussion too important to step aside.
> >
> > In the end, it's OK with me if we call XML and non-XML "things"
> > documents, or resources, or content items, or whatever.
> > I stated my current priority list. It seems to me that most of
> > us think this way. I didn't think that the discussion would
> > take this long when I started the thread.
> >
> > Maybe we should use a glossary proposal wiki page for a
> > (kind-of) vote. Everyone can write down definitions, and
> > the others can add +1 / -1. Don't add comments to existing
> > ones, just add your own. The positive ones survive, the
> > negative ones don't. No idea if this is useful.
> >
> > Any suggestions how to come to a decision are appreciated.
> >
> > Just my CHF 0.02.
> >
> > -- Andreas
> >
>
> Rather than voting on the names of individual parts separately, which
> may conflict, I would suggest as a next step that each interested
> person summarize his view of how the site, page, module, document,
> asset, and site tree all relate to each other, in just one or two
> paragraphs, similar to the way Andreas and SolProvider did at the
> beginning of this thread when they wrote "For me, the following text
> sounds quite good".  There are probably only two or three such
> versions.  Then you committers can vote on which paragraph sounds
> best.
>

It is understandably difficult to redesign what is a core part of
Lenya and simultaneously redefine some of the terms being used.  Maybe
we need to discuss those things separately, but that would be hard
also.

Do you all at least share the following goals?

1. Find a way to treat both the Lenya-managed xml text and binary
files in a (mostly) unified way, so that:

   a. both can have historical versions (revisions)
   b. both can have a workflow
   c. both can have language versions (translations)
   d. relationships can be maintained between them (parent/child,
embedding, linking)
   e. both can be directly referenced from the navigation menus

2. Enable/simplify having one web page composed of multiple
separately-managed content.

3. Enable/simplify having multiple web pages for one unit of content
(e.g. an article with 5 pages but managed by Lenya as a single unit)

4. Enable the (optional) use of multiple navigation trees (site trees).

5. Avoid making such radical changes to the code that a 1.4 release is
overly delayed.

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Bob Harner wrote:

> [...] I would suggest as a next step that each interested
> person summarize his view of how the site, page, module, document,
> asset, and site tree all relate to each other, in just one or two
> paragraphs, similar to the way Andreas and SolProvider did at the
> beginning of this thread when they wrote "For me, the following text
> sounds quite good".  There are probably only two or three such
> versions.  Then you committers can vote on which paragraph sounds
> best.

OK, I'll give it another try. If you don't think this whole thing
is rubbish, I'd like to ask you to just add your +-1 and use footnotes
for comments, so that the structure stays intact.

- A *publication* is a collection of related pieces of information and
   functionality to manipulate these items. Each piece of information
   can exist in multiple states at the same time. This might be
   implemented using *areas*, revision history labels, or something else.

- The entirety of information pieces of a publication is called
   the *content* of the publication.

- A piece of information, regardless of its nature, which is handled
   as a single unit by Lenya is called

   * asset [+0.3]
   * resource [+0.2]
   * content item [+0.2]
   * document [+0.3]
   * ...

   (I thought about it again, Josias has a point saying images can be
   regarded as documents as well. IMO content item is too long, resource is
   not specific enough.)

- A language version of a piece of information is called a *translation*.

- A version in the history of a translation is called a *revision*.

- The structuring information (there may be several of them) are

   * indexes [+0.5]
   * structures [+0.5]

- A node in the information structure is called

   * (structure/index) node


Thanks for your participation in this discussion,

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/8/06, Andreas Hartmann <an...@apache.org> wrote:
> solprovider@apache.org wrote:
>
> > Are we trying to confuse and lose the entire customer base?
>
> It's so reassuring to finally have someone around
> who cares for the project ... :)
>
>
>  > Go read my other posts.
>
> I did, and I still have my own opinion. I see your points,
> to some I agree, to others not. Anyway, I consider this
> discussion too important to step aside.
>
> In the end, it's OK with me if we call XML and non-XML "things"
> documents, or resources, or content items, or whatever.
> I stated my current priority list. It seems to me that most of
> us think this way. I didn't think that the discussion would
> take this long when I started the thread.
>
> Maybe we should use a glossary proposal wiki page for a
> (kind-of) vote. Everyone can write down definitions, and
> the others can add +1 / -1. Don't add comments to existing
> ones, just add your own. The positive ones survive, the
> negative ones don't. No idea if this is useful.
>
> Any suggestions how to come to a decision are appreciated.
>
> Just my CHF 0.02.
>
> -- Andreas
>

Rather than voting on the names of individual parts separately, which
may conflict, I would suggest as a next step that each interested
person summarize his view of how the site, page, module, document,
asset, and site tree all relate to each other, in just one or two
paragraphs, similar to the way Andreas and SolProvider did at the
beginning of this thread when they wrote "For me, the following text
sounds quite good".  There are probably only two or three such
versions.  Then you committers can vote on which paragraph sounds
best.

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:

> Are we trying to confuse and lose the entire customer base?  

It's so reassuring to finally have someone around
who cares for the project ... :)

 > Go read my other posts.

I did, and I still have my own opinion. I see your points,
to some I agree, to others not. Anyway, I consider this
discussion too important to step aside.

In the end, it's OK with me if we call XML and non-XML "things"
documents, or resources, or content items, or whatever.
I stated my current priority list. It seems to me that most of
us think this way. I didn't think that the discussion would
take this long when I started the thread.

Maybe we should use a glossary proposal wiki page for a
(kind-of) vote. Everyone can write down definitions, and
the others can add +1 / -1. Don't add comments to existing
ones, just add your own. The positive ones survive, the
negative ones don't. No idea if this is useful.

Any suggestions how to come to a decision are appreciated.

Just my CHF 0.02.

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/8/06, Andreas Hartmann <an...@apache.org> wrote:
> Josias Thoeny wrote:
> > On Mon, 2006-02-06 at 10:59 +0100, Andreas Hartmann wrote:
> >> How about this:
> >> asset:: An atomic piece of information, handled as a single unit by the API.
> >>          An asset consists of multiple translations (language versions).
> >> document:: A dynamically assembled piece of information, based on an asset.
> >>             The document is generated by resolving references to other assets
> >>             and external resources. [1]
> > Do I understand correctly that such a document is rather an abstract
> > concept, like e.g. the result of a pipeline?
> Yes.
> > I mean, there would be no Document.java?
> Yes.
> > Would the term "document" actually be used in the code?
> Only in URLs like cocoon://generate-document/...
> > About the references from one asset to another, would they be handled in
> > a centralized way by the java core, or would the assets do that
> > themselves (by xinclude, or by generating a link, ...)?
> That has to be decided. I have the feeling that the latter option
> is the way to go, since a common interface might be too restrictive.
> > I'm just trying to understand your idea...
>
> Thanks for your questions, I agree that these issues need clarification.

This is painful.  The text above redefines every term enough to
confuse even someone as knowledgeable about Lenya as Josias.  Are we
trying to confuse and lose the entire customer base?  Go read my other
posts.

solprovider

---------------------------------------------------------------------

Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Josias Thoeny wrote:
> On Mon, 2006-02-06 at 10:59 +0100, Andreas Hartmann wrote:
> [...]
>> How about this:
>>
>>
>> asset:: An atomic piece of information, handled as a single unit by the API.
>>          An asset consists of multiple translations (language versions).
>>
>> document:: A dynamically assembled piece of information, based on an asset.
>>             The document is generated by resolving references to other assets
>>             and external resources. [1]
> 
> Do I understand correctly that such a document is rather an abstract
> concept, like e.g. the result of a pipeline?

Yes.

> I mean, there would be no Document.java?

Yes.

> Would the term "document" actually be used in the code?

Only in URLs like cocoon://generate-document/...


> About the references from one asset to another, would they be handled in
> a centralized way by the java core, or would the assets do that
> themselves (by xinclude, or by generating a link, ...)?

That has to be decided. I have the feeling that the latter option
is the way to go, since a common interface might be too restrictive.


> I'm just trying to understand your idea...

Thanks for your questions, I agree that these issues need clarification.

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Josias Thoeny <jo...@wyona.com>.

On Mon, 2006-02-06 at 10:59 +0100, Andreas Hartmann wrote:
[...]
> How about this:
> 
> 
> asset:: An atomic piece of information, handled as a single unit by the API.
>          An asset consists of multiple translations (language versions).
> 
> document:: A dynamically assembled piece of information, based on an asset.
>             The document is generated by resolving references to other assets
>             and external resources. [1]

Do I understand correctly that such a document is rather an abstract
concept, like e.g. the result of a pipeline?
I mean, there would be no Document.java?
Would the term "document" actually be used in the code?

About the references from one asset to another, would they be handled in
a centralized way by the java core, or would the assets do that
themselves (by xinclude, or by generating a link, ...)?

I'm just trying to understand your idea...

Josias

> 
> page:: The aggregation of 1..n documents + presentation.
> 
> 
> [1] Using this definition, navigation widgets can be implemented as documents,
> based on an asset which references the resource which generates the navigation
> widget. This means, according to this definition, a page can contain navigation
> widgets as well.
> 
> 
> -- Andreas
> 


Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Andreas Hartmann wrote:
> Bob Harner wrote:
> 
> [...]
> 
>> Andreas, can you explain your thinking on why a document would be
>> based on just one asset rather than being a collection of asset 
>> references?
> 
> Here are some reasons:
> 
> The major reason is to avoid maintaining additional storage and
> addressing facilities.
> 
> I like concepts where you can express many things using a few basic
> building blocks. XML makes this easy for document-based systems.
> You can store all your content, indexes etc. as XML documents.
> 
> Another reason is the GUI. From the user's point of view, I'd like
> to include assets in other assets. IMO it is much easier to build
> a GUI which supports adding XInclude expressions in existing assets,
> than one which allows assembling documents from assets. To me, it
> feels more natural if I can write text and include assets in
> arbitrary locations. But this is just a matter of personal preference.

after some thinking, i think this is indeed the best solution.

> In the Lenya repository, we could use assets for various applications:
> 
> - static content asset
> - dynamic content asset (e.g., containing a reference to an RSS feed)
> - collection asset (static or dynamic)
> - structure / index asset (referencing an index generation component)
> 
> To find a specific asset, you need only a single lookup mechanism,
> based on names, meta data, etc.
> 
> XML-based assets can support XInclude, CInclude and other reference
> mechanisms.
> 
> A document is a (parameterized) view of an asset. It is generated by

or of more than one asset, if the original asset has references.

how about: a document is the result of a request, i.e. it consists of one
(or more, in case of references) assets in the requested language
version (if applicable, otherwise the "default" language) with the
corresponding processing applied.

> resolving the references and expanding their content. The parameters
> can either be provided explicitly or derived from a view context.
> 
> An example:
> 
> - There is an asset "report list". It contains a list of references
>   to reports.
> 
> - If requested on a web page, the document (view) "HTML presentation"
>   of the asset is generated. Parameters control whether all reports are
>   included in a single page or if a multi-page site with index page
>   is rendered.
> 
> - If requested as PDF, the document (view) "PDF presentation" is
>   created (including a ToC and all reports in a PDF document).
> 
> The available views depend on the resource type of the asset.
> 
> WDYT?

nice.

-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Bob Harner wrote:

[...]

> Andreas, can you explain your thinking on why a document would be
> based on just one asset rather than being a collection of asset references?

Here are some reasons:

The major reason is to avoid maintaining additional storage and
addressing facilities.

I like concepts where you can express many things using a few basic
building blocks. XML makes this easy for document-based systems.
You can store all your content, indexes etc. as XML documents.

Another reason is the GUI. From the user's point of view, I'd like
to include assets in other assets. IMO it is much easier to build
a GUI which supports adding XInclude expressions in existing assets,
than one which allows assembling documents from assets. To me, it
feels more natural if I can write text and include assets in
arbitrary locations. But this is just a matter of personal preference.

----

In the Lenya repository, we could use assets for various applications:

- static content asset
- dynamic content asset (e.g., containing a reference to an RSS feed)
- collection asset (static or dynamic)
- structure / index asset (referencing an index generation component)

To find a specific asset, you need only a single lookup mechanism,
based on names, meta data, etc.

XML-based assets can support XInclude, CInclude and other reference
mechanisms.

A document is a (parameterized) view of an asset. It is generated by
resolving the references and expanding their content. The parameters
can either be provided explicitly or derived from a view context.

An example:

- There is an asset "report list". It contains a list of references
   to reports.

- If requested on a web page, the document (view) "HTML presentation"
   of the asset is generated. Parameters control whether all reports are
   included in a single page or if a multi-page site with index page
   is rendered.

- If requested as PDF, the document (view) "PDF presentation" is
   created (including a ToC and all reports in a PDF document).

The available views depend on the resource type of the asset.
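The "report list" example could be sketched roughly as follows. This is only an illustration under stated assumptions: AssetView, ViewRegistry, the "report-list" resource type and the "paging" parameter are hypothetical names, the references are assumed to be resolved already, and the "document" is simply the string a view returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not Lenya API.
// A "document" is just the String a view produces; there is no Document class.
interface AssetView {
    String render(List<String> resolvedReports, Map<String, String> parameters);
}

class ViewRegistry {
    // resource type -> view name -> view
    private final Map<String, Map<String, AssetView>> views = new LinkedHashMap<>();

    void register(String resourceType, String viewName, AssetView view) {
        views.computeIfAbsent(resourceType, key -> new LinkedHashMap<>()).put(viewName, view);
    }

    // The available views depend on the resource type of the asset.
    List<String> availableViews(String resourceType) {
        List<String> names = new ArrayList<>();
        Map<String, AssetView> byName = views.get(resourceType);
        if (byName != null) {
            names.addAll(byName.keySet());
        }
        return names;
    }

    String render(String resourceType, String viewName,
                  List<String> resolvedReports, Map<String, String> parameters) {
        Map<String, AssetView> byName = views.get(resourceType);
        if (byName == null || !byName.containsKey(viewName)) {
            throw new IllegalArgumentException("No view " + viewName + " for " + resourceType);
        }
        return byName.get(viewName).render(resolvedReports, parameters);
    }

    // Registry for the "report list" example with an HTML-like and a PDF-like view.
    static ViewRegistry reportListExample() {
        ViewRegistry registry = new ViewRegistry();
        registry.register("report-list", "html", (reports, params) ->
                "multi".equals(params.get("paging"))
                        ? "index of " + reports.size() + " report pages"
                        : String.join("\n", reports));
        registry.register("report-list", "pdf", (reports, params) ->
                "ToC (" + reports.size() + " entries)\n" + String.join("\n", reports));
        return registry;
    }
}

class ViewRegistryDemo {
    public static void main(String[] args) {
        ViewRegistry registry = ViewRegistry.reportListExample();
        List<String> reports = Arrays.asList("Report A", "Report B");
        Map<String, String> params = new HashMap<>();
        params.put("paging", "multi");
        System.out.println(registry.availableViews("report-list"));
        System.out.println(registry.render("report-list", "html", reports, params));
    }
}
```

The same asset yields different documents depending on the view name and the parameters, and an unknown resource type simply has no views.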

WDYT?

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org


Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/6/06, Bob Harner <bo...@gmail.com> wrote:
> On 2/6/06, Andreas Hartmann <an...@apache.org> wrote:
> > Jörn Nettingsmeier wrote:
> > > Andreas Hartmann wrote:
> > >> Jörn Nettingsmeier wrote:
> > >>
> > >> [...]
> > >>
> > >>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
> > >>>> really agree on what a Document is.  Jörn seems to be using Document
> > >>>> as the aggregate/container term for both the editable text and the
> > >>>> assets of a page.  But Solprovider seems to be suggesting the words
> > >>>> Resource or Content for that purpose.
> > >>>
> > >>> now that you say it, you are right. i had read solprovider's remark
> > >>> the way i wanted it to read :(
> > >>
> > >> A little off topic, but I'd like to state my opinion once more:
> > >> The concept of "the editable text and the assets of a page" should be
> > >> dropped. Textual documents and (binary) assets should be equally handled
> > >> items in a flat storage.
> > >
> > > ++votes.
> > >
> > > how do you like my terminology proposal in the wiki?
> > > i'm suggesting to use "document" as the thing that is composed of
> > > assets, where assets are "text assets" (can be in different languages),
> > > and image or other media assets (with the same properties).
> >
> >
> > Actually I'm not quite pleased with that.
> > IMO "set of =>language versions and other =>assets" is a mixture of concepts.
> > Assets consist of language versions (translations).
> >
> >
> > How about this:
> >
> >
> > asset:: An atomic piece of information, handled as a single unit by the API.
> >          An asset consists of multiple translations (language versions).
> >
> > document:: A dynamically assembled piece of information, based on an asset.
> > >             The document is generated by resolving references to other assets
> >             and external resources. [1]
> >
> > page:: The aggregation of 1..n documents + presentation.
> >
> >
> > [1] Using this definition, navigation widgets can be implemented as documents,
> > based on an asset which references the resource which generates the navigation
> > widget. This means, according to this definition, a page can contain navigation
> > widgets as well.
> >
> >
> > -- Andreas
> >
>
> I like Andreas' definitions, except I'm a little uneasy about saying
> that a page consists of one or more documents.  That's the converse of
> the real-world definition in which a "document" consists of one or
> more pages (which in HTML terms would mean next page & prev page
> links).
>
> Also, let's please consider the possibility that a document could be
> simply a PDF file (linked directly from the navigation).  I'm not sure
> how to work that in, but it would seem to be an important need.

Andreas, can you explain your thinking on why a document would be
based on just one asset rather than being a collection of asset references?


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Bob Harner wrote:

[...]

> I like Andreas' definitions, except I'm a little uneasy about saying
> that a page consists of one or more documents.  That's the converse of
> the real-world definition in which a "document" consists of one or
> more pages (which in HTML terms would mean next page & prev page
> links).

OK, I guess there's a more appropriate term for that.

> Also, let's please consider the possibility that a document could be
> simply a PDF file (linked directly from the navigation).  I'm not sure
> how to work that in, but it would seem to be an important need.

The actual PDF would be a translation of an asset. The PDF document
would be equal to the PDF translation, since PDF doesn't support
mechanisms like XLink etc., at least AFAIK.

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/6/06, Andreas Hartmann <an...@apache.org> wrote:
> Jörn Nettingsmeier wrote:
> > Andreas Hartmann wrote:
> >> Jörn Nettingsmeier wrote:
> >>
> >> [...]
> >>
> >>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
> >>>> really agree on what a Document is.  Jörn seems to be using Document
> >>>> as the aggregate/container term for both the editable text and the
> >>>> assets of a page.  But Solprovider seems to be suggesting the words
> >>>> Resource or Content for that purpose.
> >>>
> >>> now that you say it, you are right. i had read solprovider's remark
> >>> the way i wanted it to read :(
> >>
> >> A little off topic, but I'd like to state my opinion once more:
> >> The concept of "the editable text and the assets of a page" should be
> >> dropped. Textual documents and (binary) assets should be equally handled
> >> items in a flat storage.
> >
> > ++votes.
> >
> > how do you like my terminology proposal in the wiki?
> > i'm suggesting to use "document" as the thing that is composed of
> > assets, where assets are "text assets" (can be in different languages),
> > and image or other media assets (with the same properties).
>
>
> Actually I'm not quite pleased with that.
> IMO "set of =>language versions and other =>assets" is a mixture of concepts.
> Assets consist of language versions (translations).
>
>
> How about this:
>
>
> asset:: An atomic piece of information, handled as a single unit by the API.
>          An asset consists of multiple translations (language versions).
>
> document:: A dynamically assembled piece of information, based on an asset.
> >             The document is generated by resolving references to other assets
>             and external resources. [1]
>
> page:: The aggregation of 1..n documents + presentation.
>
>
> [1] Using this definition, navigation widgets can be implemented as documents,
> based on an asset which references the resource which generates the navigation
> widget. This means, according to this definition, a page can contain navigation
> widgets as well.
>
>
> -- Andreas
>

I like Andreas' definitions, except I'm a little uneasy about saying
that a page consists of one or more documents.  That's the converse of
the real-world definition in which a "document" consists of one or
more pages (which in HTML terms would mean next page & prev page
links).

Also, let's please consider the possibility that a document could be
simply a PDF file (linked directly from the navigation).  I'm not sure
how to work that in, but it would seem to be an important need.


Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

a belated reply (i'm a weekend worker atm) :(

Andreas Hartmann wrote:
> Joern Nettingsmeier wrote:
> 
> [...]
> 
>> <quote>
>> - - - Document (an abstract container for all assets)
>>       contains:
>> - - - - Assets (text assets (in several languages), images, media files,
>> pdfs, etc)
>>
>> Assets have revision control information.
>>
>> Documents have a revision control information summary (the collection of
>> all current revisions of their assets). Whenever one asset changes, the
>> Document revision number is bumped.
>> </quote>
>>
>> one thing that's missing is that all assets can have a language attribute.
>>
>>> asset:: An atomic piece of information, handled as a single unit by the API.
>>>         An asset consists of multiple translations (language versions).
>>
>> "can consist"?
> 
> From a user's point of view, it certainly makes sense to have non-localized
> assets. Well, sand or water might look the same in all cultures (I don't
> want to start a philosophical discussion here).

i'd still say "can consist". i.e. the language field in the asset 
properties can be empty, in which case the asset is considered to be in 
the default language.

> From a developer's point of view, I try to avoid optional features unless
> they are absolutely necessary. It might be considered to return the same
> content for all translations. getTranslations() would return an array
> containing translations for all languages supported by the publication.
> 
> I just want to avoid code like this:
> 
> if (asset.isLocalized()) {
>     return asset.getTranslation(getLanguage());
> }
> else {
>     return asset.getUniversalTranslation();
> }

of course. but we have the concept of a "default language" for this, 
i.e. if a requested language version does not exist, just return the 
default one. the only modification necessary will be contained inside 
getTranslation() and will be invisible to users, which is how it should be.
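a minimal sketch of that fallback, assuming hypothetical names (FallbackAsset, putTranslation) that are not taken from the lenya code base. the point is only that the default-language lookup lives inside getTranslation(), so callers never branch on whether an asset is localized:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names are taken from the Lenya code base.
class FallbackAsset {
    private final String defaultLanguage;
    private final Map<String, String> translations = new HashMap<>();

    FallbackAsset(String defaultLanguage) {
        this.defaultLanguage = defaultLanguage;
    }

    void putTranslation(String language, String content) {
        translations.put(language, content);
    }

    // Returns the requested language version, or the default-language
    // version when the requested one does not exist.
    Optional<String> getTranslation(String language) {
        String content = translations.get(language);
        if (content == null) {
            content = translations.get(defaultLanguage);
        }
        return Optional.ofNullable(content);
    }
}

class FallbackAssetDemo {
    public static void main(String[] args) {
        FallbackAsset logo = new FallbackAsset("en");
        logo.putTranslation("en", "logo.png");
        // "de" does not exist, so the default ("en") version is returned.
        System.out.println(logo.getTranslation("de").orElse("(missing)"));
    }
}
```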

>>> document:: A dynamically assembled piece of information, based on an asset.
>>
>> ok, i realize that not all assets are created equal. one is the starting
>> point which references the others. but the starting asset should not be
>> special. the document should know which is the starting asset of a
>> particular language version.
> 
> You request an asset. The document comes into existence automatically
> by resolving the asset's references and including their content (or
> parts of their content, meta data, you name it).

ok. so it's just that some assets (namely the xml ones that solprovider 
does not want to call assets) can reference other assets. sounds nice to me.

>>>            The document is generated by resolving references to other assets
>>>            and external resources. [1]
>>>
>>> page:: The aggregation of 1..n documents + presentation.
>>>
>>>
>>> [1] Using this definition, navigation widgets can be implemented as documents,
>>> based on an asset which references the resource which generates the navigation
>>> widget. This means, according to this definition, a page can contain navigation
>>> widgets as well.
>>
>> hmm. for components such as navigation, how about "dynamic asset"?
> 
> Yes, that would describe it. But I don't see a need for a special handling.

no, of course not. but to reduce user confusion, it's nice to have this 
term in the docs. the handling should be unified.

> The asset would for instance just look like this:
> 
> <ci:include src="cocoon://modules/sitetree/tree.xml?view=list-titles"/>
> 
>> i'd
>> like the document to be the final piece of content that is created to
>> answer a user request.


-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Joern Nettingsmeier wrote:

[...]

> <quote>
> - - - Document (an abstract container for all assets)
>       contains:
> - - - - Assets (text assets (in several languages), images, media files,
> pdfs, etc)
> 
> Assets have revision control information.
> 
> Documents have a revision control information summary (the collection of
> all current revisions of their assets). Whenever one asset changes, the
> Document revision number is bumped.
> </quote>
> 
> one thing that's missing is that all assets can have a language attribute.
> 
>> asset:: An atomic piece of information, handled as a single unit by the
>> API.
>>         An asset consists of multiple translations (language versions).
> 
> "can consist"?

From a user's point of view, it certainly makes sense to have non-localized
assets. Well, sand or water might look the same in all cultures (I don't
want to start a philosophical discussion here).

From a developer's point of view, I try to avoid optional features unless
they are absolutely necessary. One option would be to return the same
content for all translations. getTranslations() would return an array
containing translations for all languages supported by the publication.

I just want to avoid code like this:

if (asset.isLocalized()) {
     return asset.getTranslation(getLanguage());
}
else {
     return asset.getUniversalTranslation();
}
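
One way to avoid that branch could look like the sketch below. It is only an illustration: UniformAsset, localize() and the shared-content constructor argument are hypothetical, not existing Lenya API. A non-localized asset answers with the same content for every language of the publication, so getTranslations() always has one entry per publication language.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class and method names are illustrative, not Lenya API.
class UniformAsset {
    private final List<String> publicationLanguages;
    private final String sharedContent; // used for languages without their own version
    private final Map<String, String> localized = new HashMap<>();

    UniformAsset(List<String> publicationLanguages, String sharedContent) {
        this.publicationLanguages = publicationLanguages;
        this.sharedContent = sharedContent;
    }

    void localize(String language, String content) {
        requireSupported(language);
        localized.put(language, content);
    }

    // Always answers for a supported language, so callers need no isLocalized() check.
    String getTranslation(String language) {
        requireSupported(language);
        return localized.getOrDefault(language, sharedContent);
    }

    // One entry per publication language, in publication order.
    String[] getTranslations() {
        String[] result = new String[publicationLanguages.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = getTranslation(publicationLanguages.get(i));
        }
        return result;
    }

    private void requireSupported(String language) {
        if (!publicationLanguages.contains(language)) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
    }
}

class UniformAssetDemo {
    public static void main(String[] args) {
        UniformAsset sand = new UniformAsset(Arrays.asList("en", "de"), "sand.jpg");
        System.out.println(Arrays.toString(sand.getTranslations()));
    }
}
```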


>> document:: A dynamically assembled piece of information, based on an asset.
> 
> ok, i realize that not all assets are created equal. one is the starting
> point which references the others. but the starting asset should not be
> special. the document should know which is the starting asset of a
> particular language version.

You request an asset. The document comes into existence automatically
by resolving the asset's references and including their content (or
parts of their content, meta data, you name it).


>>            The document is generated by resolving references to other assets
>>            and external resources. [1]
>>
>> page:: The aggregation of 1..n documents + presentation.
>>
>>
>> [1] Using this definition, navigation widgets can be implemented as documents,
>> based on an asset which references the resource which generates the navigation
>> widget. This means, according to this definition, a page can contain navigation
>> widgets as well.
> 
> hmm. for components such as navigation, how about "dynamic asset"?

Yes, that would describe it. But I don't see a need for a special handling.
The asset would for instance just look like this:

<ci:include src="cocoon://modules/sitetree/tree.xml?view=list-titles"/>

> i'd
> like the document to be the final piece of content that is created to
> answer a user request.

Probably this is more appropriate, since it won't be possible to expand
the references first and apply a style later on.


-- Andreas



-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Joern Nettingsmeier <po...@uni-duisburg.de>.

Andreas Hartmann wrote:
> Jörn Nettingsmeier wrote:
>> Andreas Hartmann wrote:
>>> Jörn Nettingsmeier wrote:
>>>
>>> [...]
>>>
>>>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
>>>>> really agree on what a Document is.  Jörn seems to be using Document
>>>>> as the aggregate/container term for both the editable text and the
>>>>> assets of a page.  But Solprovider seems to be suggesting the words
>>>>> Resource or Content for that purpose.
>>>>
>>>> now that you say it, you are right. i had read solprovider's remark
>>>> the way i wanted it to read :(
>>>
>>> A little off topic, but I'd like to state my opinion once more:
>>> The concept of "the editable text and the assets of a page" should be
>>> dropped. Textual documents and (binary) assets should be equally handled
>>> items in a flat storage.
>>
>> ++votes.
>>
>> how do you like my terminology proposal in the wiki?
>> i'm suggesting to use "document" as the thing that is composed of
>> assets, where assets are "text assets" (can be in different
>> languages), and image or other media assets (with the same properties).
> 
> 
> Actually I'm not quite pleased with that.
> IMO "set of =>language versions and other =>assets" is a mixture of concepts.

sorry, i was in a hurry. the "Glossary" reflects the current state as i
understood it. the proposal i mean is:
http://wiki.apache.org/lenya/GlossaryStructure

<quote>
- - - Document (an abstract container for all assets)
      contains:
- - - - Assets (text assets (in several languages), images, media files,
pdfs, etc)

Assets have revision control information.

Documents have a revision control information summary (the collection of
all current revisions of their assets). Whenever one asset changes, the
Document revision number is bumped.
</quote>
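
a rough sketch of the revision summary described in the quote, assuming hypothetical names (RevisionSummary, assetChanged) and plain integer revision numbers, neither of which comes from the wiki page:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the quoted proposal; not existing Lenya code.
class RevisionSummary {
    // asset id -> current revision of that asset
    private final Map<String, Integer> assetRevisions = new LinkedHashMap<>();
    private int documentRevision = 0;

    // Records a change to one asset: the asset revision and the
    // document revision both advance by one.
    void assetChanged(String assetId) {
        assetRevisions.merge(assetId, 1, Integer::sum);
        documentRevision++;
    }

    int getDocumentRevision() {
        return documentRevision;
    }

    // Current revision of an asset, or 0 if it was never stored.
    int getAssetRevision(String assetId) {
        return assetRevisions.getOrDefault(assetId, 0);
    }

    // Copy of the summary: the current revisions of all assets.
    Map<String, Integer> getSummary() {
        return new LinkedHashMap<>(assetRevisions);
    }
}

class RevisionSummaryDemo {
    public static void main(String[] args) {
        RevisionSummary summary = new RevisionSummary();
        summary.assetChanged("text-de");
        summary.assetChanged("text-de");
        summary.assetChanged("image");
        System.out.println(summary.getSummary() + " document revision " + summary.getDocumentRevision());
    }
}
```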

one thing that's missing is that all assets can have a language attribute.

> asset:: An atomic piece of information, handled as a single unit by the API.
>         An asset consists of multiple translations (language versions).

"can consist"?

> 
> document:: A dynamically assembled piece of information, based on an asset.

ok, i realize that not all assets are created equal. one is the starting
point which references the others. but the starting asset should not be
special. the document should know which is the starting asset of a
particular language version.

>            The document is generated by resolving references to other assets
>            and external resources. [1]
> 
> page:: The aggregation of 1..n documents + presentation.
> 
> 
> [1] Using this definition, navigation widgets can be implemented as documents,
> based on an asset which references the resource which generates the navigation
> widget. This means, according to this definition, a page can contain navigation
> widgets as well.

hmm. for components such as navigation, how about "dynamic asset"? i'd
like the document to be the final piece of content that is created to
answer a user request.

-- 
"Án nýrra verka, án nútimans, hættir fortíðin að vekja áhuga."
"Without new works, without the present the past will cease to be of
interest."
        - Ásmundur Sveinsson (1893-1982)

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736



Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Jörn Nettingsmeier wrote:
> Andreas Hartmann wrote:
>> Jörn Nettingsmeier wrote:
>>
>> [...]
>>
>>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
>>>> really agree on what a Document is.  Jörn seems to be using Document
>>>> as the aggregate/container term for both the editable text and the
>>>> assets of a page.  But Solprovider seems to be suggesting the words
>>>> Resource or Content for that purpose.
>>>
>>> now that you say it, you are right. i had read solprovider's remark 
>>> the way i wanted it to read :(
>>
>> A little off topic, but I'd like to state my opinion once more:
>> The concept of "the editable text and the assets of a page" should be
>> dropped. Textual documents and (binary) assets should be equally handled
>> items in a flat storage.
> 
> ++votes.
> 
> how do you like my terminology proposal in the wiki?
> i'm suggesting to use "document" as the thing that is composed of 
> assets, where assets are "text assets" (can be in different languages), 
> and image or other media assets (with the same properties).


Actually I'm not quite pleased with that.
IMO "set of =>language versions and other =>assets" is a mixture of concepts.
Assets consist of language versions (translations).


How about this:


asset:: An atomic piece of information, handled as a single unit by the API.
         An asset consists of multiple translations (language versions).

document:: A dynamically assembled piece of information, based on an asset.
            The document is generated by resolving references to other assets
            and external resources. [1]

page:: The aggregation of 1..n documents + presentation.


[1] Using this definition, navigation widgets can be implemented as documents,
based on an asset which references the resource which generates the navigation
widget. This means, according to this definition, a page can contain navigation
widgets as well.


-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Andreas Hartmann wrote:
> Jörn Nettingsmeier wrote:
> 
> [...]
> 
>>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
>>> really agree on what a Document is.  Jörn seems to be using Document
>>> as the aggregate/container term for both the editable text and the
>>> assets of a page.  But Solprovider seems to be suggesting the words
>>> Resource or Content for that purpose.
>>
>> now that you say it, you are right. i had read solprovider's remark 
>> the way i wanted it to read :(
> 
> A little off topic, but I'd like to state my opinion once more:
> The concept of "the editable text and the assets of a page" should be
> dropped. Textual documents and (binary) assets should be equally handled
> items in a flat storage.

++votes.

how do you like my terminology proposal in the wiki?
i'm suggesting to use "document" as the thing that is composed of 
assets, where assets are "text assets" (can be in different languages), 
and image or other media assets (with the same properties).





-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Jörn Nettingsmeier wrote:

[...]

>> One thing, though:  It isn't clear to me whether Jörn and Solprovider
>> really agree on what a Document is.  Jörn seems to be using Document
>> as the aggregate/container term for both the editable text and the
>> assets of a page.  But Solprovider seems to be suggesting the words
>> Resource or Content for that purpose.
> 
> now that you say it, you are right. i had read solprovider's remark the 
> way i wanted it to read :(

A little off topic, but I'd like to state my opinion once more:
The concept of "the editable text and the assets of a page" should be
dropped. Textual documents and (binary) assets should be handled equally,
as items in a flat store. The page is generated by resolving references
between these items.
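
A very rough sketch of what I mean, under loud assumptions: FlatStore is a hypothetical class, all items are simplified to strings, and the {{ref:name}} syntax is invented for this example only (a real implementation would more likely rely on XInclude or CInclude).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch. The {{ref:name}} syntax is invented for illustration.
class FlatStore {
    private static final Pattern REF = Pattern.compile("\\{\\{ref:([^}]+)\\}\\}");
    private static final int MAX_DEPTH = 16; // guards against circular references

    // Flat storage: every item is addressed by name only, with no hierarchy.
    private final Map<String, String> items = new HashMap<>();

    void put(String name, String content) {
        items.put(name, content);
    }

    // Generates the page for an item by expanding its references recursively.
    String resolve(String name) {
        return resolve(name, 0);
    }

    private String resolve(String name, int depth) {
        String content = items.get(name);
        if (content == null) {
            throw new IllegalArgumentException("No such item: " + name);
        }
        if (depth > MAX_DEPTH) {
            throw new IllegalStateException("Reference nesting too deep at: " + name);
        }
        Matcher matcher = REF.matcher(content);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String included = resolve(matcher.group(1), depth + 1);
            matcher.appendReplacement(out, Matcher.quoteReplacement(included));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}

class FlatStoreDemo {
    public static void main(String[] args) {
        FlatStore store = new FlatStore();
        store.put("footer", "(c) Example");
        store.put("home", "Welcome! {{ref:footer}}");
        System.out.println(store.resolve("home"));
    }
}
```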

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Bob Harner wrote:
> On 2/4/06, Michael Wechner <mi...@wyona.com> wrote:
>> Andreas Hartmann wrote:
>>
>>>
>>> WDYT?
>>
>> Great that you start this discussion :-). As proposed I think we should
>> use http://wiki.apache.org/lenya/Glossary for writing
>> down the findings and discuss each term on this list individually.
>>
>> Michi
>>
>>> -- Andreas
>>>
> 
> I very much like where this talk is going.  Terms are getting much
> more specific.
> 
> BTW, I think I was unclear earlier in what I meant by "Content Item" and
> its relationship to Document.  I meant that both a Document and an
> Asset are each Content Items.  I didn't mean for "Content Item" to
> mean the aggregate of a Document and its Assets.
> 
> One thing, though:  It isn't clear to me whether Jörn and Solprovider
> really agree on what a Document is.  Jörn seems to be using Document
> as the aggregate/container term for both the editable text and the
> assets of a page.  But Solprovider seems to be suggesting the words
> Resource or Content for that purpose.

now that you say it, you are right. i had read solprovider's remark the 
way i wanted it to read :(

> I wonder if we shouldn't promote "Asset" a little like this: A
> Document consists of Assets.  An Asset can be any of the following:  a
> German translation of the document, a French translation, a JPEG
> image, a PDF file, etc. All are assets.  Collectively, they are a
> Document.  This approach does away with the need for "Content Item"
> and helps us in treating assets just like other managed content, which
> several people have stated as a goal.

i like that. "asset" is a very generic term, and it's good to use it in 
such a generic way.

> +1 on using Revision instead of Version
> 
> Also, I agree completely in using the Glossary wiki page;  I didn't
> realize it existed.  I will rename CMSTerminology to
> CMSTerminologyComparison and will remove the now-redundant
> definitions.  I think Glossary should be used for both 1.2 and 1.4
> terms, since most aren't changing.  The differences between 1.2 and
> 1.4 definitions should be noted in their Glossary definitions.

agreed. i will add a note saying so.

-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/4/06, Michael Wechner <mi...@wyona.com> wrote:
> Andreas Hartmann wrote:
>
> >
> >
> > WDYT?
>
>
> Great that you start this discussion :-). As proposed I think we should
> use http://wiki.apache.org/lenya/Glossary for writing
> down the findings and discuss each term on this list individually.
>
> Michi
>
> >
> > -- Andreas
> >

I very much like where this talk is going.  Terms are getting much
more specific.

BTW, I think I was unclear earlier in what I meant by "Content Item" and
its relationship to Document.  I meant that both a Document and an
Asset are each Content Items.  I didn't mean for "Content Item" to
mean the aggregate of a Document and its Assets.

One thing, though:  It isn't clear to me whether Jörn and Solprovider
really agree on what a Document is.  Jörn seems to be using Document
as the aggregate/container term for both the editable text and the
assets of a page.  But Solprovider seems to be suggesting the words
Resource or Content for that purpose.

I wonder if we shouldn't promote "Asset" a little like this: A
Document consists of Assets.  An Asset can be any of the following:  a
German translation of the document, a French translation, a JPEG
image, a PDF file, etc. All are assets.  Collectively, they are a
Document.  This approach does away with the need for "Content Item"
and helps us in treating assets just like other managed content, which
several people have stated as a goal.

+1 on using Revision instead of Version

Also, I agree completely in using the Glossary wiki page;  I didn't
realize it existed.  I will rename CMSTerminology to
CMSTerminologyComparison and will remove the now-redundant
definitions.  I think Glossary should be used for both 1.2 and 1.4
terms, since most aren't changing.  The differences between 1.2 and
1.4 definitions should be noted in their Glossary definitions.


Re: [RFC] Terminology

Posted by Michael Wechner <mi...@wyona.com>.

Andreas Hartmann wrote:

>
>
> WDYT?

Great that you start this discussion :-). As proposed I think we should
use http://wiki.apache.org/lenya/Glossary for writing
down the findings and discuss each term on this list individually.

Michi

>
> -- Andreas
>
>

-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

Bob Harner wrote:

[...]

> I wonder if we shouldn't compile a table of terms that other CMS's
> (both OS & commercial) use, and try to use the terms that most have in
> common for the same concepts.  I suppose between us all we have had
> exposure to a large number of other CMS's.

That sounds very reasonable. Maybe you'd like to set up a Wiki page?

Thanks for your comments,

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
> solprovider@apache.org wrote:
> > On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
> >> That's not a vote, but it would be appreciated if you could state
> >> your opinion. I'd like to emphasize that this discussion is not
> >> about the API itself, but only about the terms we use. So please
> >> don't go and complain about the concept of areas or language versions.
> >> These issues can be discussed later on.
> >
> > Well, that eliminates some of my ideas.  The problem is the technology
> > follows the terminology.
>
> It's the chicken+egg problem. This thread is just to create a
> basis for discussion. Class names etc. can be decided later.
> But I see the problem, deciding on the terms can depend on how
> it's implemented.
>
> Thanks a lot for your comments, here's how I'd summarize
> your statements (comments below):
>
>
>  > ----
>  >
>  > A website in Lenya is called a
>  >
>  > - publication [+andreas] [+solprovider]
>  > - site
>  > - ...
>  >
>  > ----
>  >
>  > The resources of a website can exist at the same time in several
>  >
>  > - areas [+andreas]
>  > - ...
>  > - drop the concept [+solprovider]
>  >
>  > ----
>  >
>  > The entirety of plain information (without structuring) is called
>  >
>  > - content [+andreas]
>  > - resources [+solprovider]
>  > - ...
>  >
>  > ----
>  >
>  > The set of language versions of a piece of information is called
>  >
>  > - content node
>  > - content item [+andreas]
>  > - document
>  > - resource [+solprovider]
>  > - ...
>  >
>  > ----
>  >
>  > A specific language version is called
>  >
>  > - content item
>  > - language version [+andreas] [+solprovider]
>  > - document
>  > - ...
>  >
>  > ----
>  >
>  > A version in the history of a language version is called
>  >
>  > - (history) version [+andreas] [+solprovider]
>  > - ...
>  >
>  > ----
>  >
>  > The structuring information (there may be several of them) are
>  >
>  > - sites
>  > - structures [+andreas]
>  > - navigations
>  > - indexes [+solprovider]
>  >
>  > (maybe we have to make a distinction between "structure" and "navigation")
>  >
>  > ----
>  >
>  > A node in the structure is called
>  >
>  > - (structure/index) node [+andreas] [+solprovider]
>  > - site node
>  > - navigation item
>  > - ...
>  >
>  >
>
>
>
> >> ----
> >> A website in Lenya is called a
> >> - publication [+1]
> >> - site
> >
> > Publication: Collection of data and special processing instructions.
> >
> > Website: A collection of Publications and other material accessed
> > through a single Internet domain.
> >
> >> ----
> >> The resources of a website can exist at the same time in several
> >> - areas [+1]
> > Cannot comment since this terminology hurts the technology.
> >
> >> ----
> >> The entirety of plain information (without structuring) is called
> >> - content [+1]
> >> - resources
> >
> > Decision time.  Will "Assets" (images and other files) be stored with
> > the documents?
>
> IMO yes, all of them should be treated in the same way.
>
>
> > Historically, "Content" referred only to a collection of "Documents",
> > and "Resources" referred to a collection of "Assets".  If "Assets" are
> > not included, we should keep the term "Content".  Either term
> > ("Content" or "Resources") is good for the collection of Documents and
> > Assets, but changing it to "Resources" may reduce confusion with the
> > historical definition of "Content".
> >
> >
> >> ----
> >> The set of language versions of a piece of information is called
> >> - content node
> >> - content item [+1]
> >> - document
> >> - resource
> >
> > I prefer "Document" for an XML Document, because that is what XML and
> > Lenya1.2 call it.
>
> OK, but that would be a specialization of "content item" / "resource".
>
>
> > To refer to something that could be either a Document or an Asset, I
> > prefer "Resource".  "Content Node" uses two words for the same
> > concept.
>
> That would mean
>
> Resource
> - Document extends Resource (XML)
> - Asset extends Resource (binary, text, ...)
>
> [...]
>
> > For me, the following text sounds quite good:
> > In Lenya, a Website consists of one or more Publications.  A
> > Publication is a collection of Resources (Documents and Assets), and
> > any special functionality.   Documents (and soon Assets) are
> > maintained by Language, and each edit creates a new Version for
> > historical documentation and the ability to rollback to an older
> > Version.  Indexes provide selection of Documents for use by various
> > functionality, including Navigation Elements which provide easy access
> > to the Documents through the web interface.
>
> That sounds good to me as well. "Resource" and "content item" are
> both fine with me. The structure vs. index question needs IMO
> some more discussion. "Structure" sounds more like a static way to
> organize things, "index" rather like a dynamic way. But maybe that's
> only my personal perception.
>
> -- Andreas

I would advise against the continued use of the word "resource".  It
has almost no definite meaning in English, but if anything it means a
thing that supports something else, which is closer to what an asset
is than the union of assets and documents.  "Content Item" is far
better, IMHO, and in my experience is closer to an industry standard
term.

I prefer "site" over "publication".

I prefer "navigation tree" over "structure" or "index".

I wonder if we shouldn't compile a table of terms that other CMS's
(both OS & commercial) use, and try to use the terms that most have in
common for the same concepts.  I suppose between us all we have had
exposure to a large number of other CMS's.


Re: [RFC] Terminology

Posted by Doug Chestnut <dh...@virginia.edu>.


Doug Chestnut wrote:
> 
> 
> solprovider@apache.org wrote:
> 
>> On 2/3/06, Bob Harner <bo...@gmail.com> wrote:
>>
>>> On 2/3/06, solprovider@apache.org <so...@apache.org> wrote:
>>>
>>>> In Lenya1.2:
>>>> "Assets" is the GUI name for additional data files.
>>>>
>>>> "Resources" is the file system directory name for "Assets".  It was
>>>> only seen by developers, and was never defined well.  The directory
>>>> name should disappear in Lenya1.4 if the Assets are moved to the
>>>> repository.  I like Resource as the superclass for Documents and
>>>> Assets.
>>>>
>>>> "Content Item" implies an "Item" of "Content".  1.2's "Content"
>>>> contained "Documents", so "Content Item" is an alternate name for a
>>>> "Document".
>>>
>>>
>>> Did 1.2's "Content" really only contain "Documents"?  I guess I
>>> consider "Content" a more general, inclusive term than you do, so to
>>> me it fits the need very well.  Lenya is a "content management system"
>>> after all, not a "resource management system".
>>
>>
>>
>> In Lenya 1.2, "content" was the name of the directory containing the
>> documents, but the same argument that "resources" was only a directory
>> name and never seen by editors applies.
>>
>> /pub/resources/document
>> /pub/resources/asset
>> The superclass is "Resource".
>>
>> /pub/content/document
>> /pub/content/asset
>> The superclass is "Content Item"
>>
>> As much as I dislike using two words where one is sufficient, I agree
>> a CMS should use the word "content" somewhere.  Although our marketing
>> could use "Lenya is more than a Content Management System; it is also
>> a Resource Management System." That means nothing but could influence
>> business managers.
>>
>> Andreas prefers "Content Item".  Bob dislikes "Resource".  I will
>> accept either.  Anybody else have a preference?
> 
> I will accept either, just don't like both ;)
together, that is
> 
> --Doug
> 
>>
>> solprovider
>>


Re: [RFC] Terminology

Posted by Doug Chestnut <dh...@virginia.edu>.


solprovider@apache.org wrote:
> On 2/3/06, Bob Harner <bo...@gmail.com> wrote:
> 
>>On 2/3/06, solprovider@apache.org <so...@apache.org> wrote:
>>
>>>In Lenya1.2:
>>>"Assets" is the GUI name for additional data files.
>>>
>>>"Resources" is the file system directory name for "Assets".  It was
>>>only seen by developers, and was never defined well.  The directory
>>>name should disappear in Lenya1.4 if the Assets are moved to the
>>>repository.  I like Resource as the superclass for Documents and
>>>Assets.
>>>
>>>"Content Item" implies an "Item" of "Content".  1.2's "Content"
>>>contained "Documents", so "Content Item" is an alternate name for a
>>>"Document".
>>
>>Did 1.2's "Content" really only contain "Documents"?  I guess I
>>consider "Content" a more general, inclusive term than you do, so to
>>me it fits the need very well.  Lenya is a "content management system"
>>after all, not a "resource management system".
> 
> 
> In Lenya 1.2, "content" was the name of the directory containing the
> documents, but the same argument that "resources" was only a directory
> name and never seen by editors applies.
> 
> /pub/resources/document
> /pub/resources/asset
> The superclass is "Resource".
> 
> /pub/content/document
> /pub/content/asset
> The superclass is "Content Item"
> 
> As much as I dislike using two words where one is sufficient, I agree
> a CMS should use the word "content" somewhere.  Although our marketing
> could use "Lenya is more than a Content Management System; it is also
> a Resource Management System." That means nothing but could influence
> business managers.
> 
> Andreas prefers "Content Item".  Bob dislikes "Resource".  I will
> accept either.  Anybody else have a preference?
I will accept either, just don't like both ;)

--Doug

> 
> solprovider
> 


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:

[...]

> Summary:
> - Publication
> - - Content
> - - - Document (has Type, DocumentUNID, DocumentID properties)
> - - - - Translation (has Language and LiveRevision properties)
> - - - - - Revision (has CreationDate and Editor properties)
> - - - Asset (has Type, UNID, ID properties)
> [- - - - Translation (has Language and LiveRevision properties)]
> - - - - - Revision (has CreationDate and Editor properties)

+1, I really like it.

But IMO Document and Asset should have a generalization class / interface.
Since "Resource" and "Content Item" don't seem to be generally accepted,
what else can we use?


> Will Assets have Translations?

+1 (e.g., a Chrysler for the American version, a Mercedes-Benz
for the German one)

> Can it be optional?

It can be up to the publication if it allows multiple translations
of an asset.

> How do we handle PDFs in multiple languages?

> One Asset per language?

-1

> Or one Asset with multiple translations?

+1 (I don't see a need for special handling here)
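
A rough Java sketch of "one Asset with multiple translations", where the publication decides whether more than one translation is allowed. The class name `TranslatableAsset`, the boolean policy flag and the file names are assumptions for illustration, not existing Lenya code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical asset that keeps one file per language.
final class TranslatableAsset {
    // Publication-level policy (assumed to be a simple flag here).
    private final boolean multipleTranslationsAllowed;
    private final Map<String, String> fileByLanguage = new LinkedHashMap<>();

    TranslatableAsset(boolean multipleTranslationsAllowed) {
        this.multipleTranslationsAllowed = multipleTranslationsAllowed;
    }

    // Adds or replaces the file for a language; rejects a second language
    // if the publication does not allow multiple translations.
    void putTranslation(String language, String fileName) {
        boolean addsNewLanguage = !fileByLanguage.isEmpty() && !fileByLanguage.containsKey(language);
        if (!multipleTranslationsAllowed && addsNewLanguage) {
            throw new IllegalStateException("publication allows only one translation per asset");
        }
        fileByLanguage.put(language, fileName);
    }

    // Returns the file for the language, or null if there is no such translation.
    String fileFor(String language) {
        return fileByLanguage.get(language);
    }
}
```

With the flag enabled, the car picture from the example above would be a single asset with an "en" file and a "de" file; with the flag disabled, the same class behaves like a plain, untranslated asset.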


Thanks, Joern and solprovider!

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/4/06, Jörn Nettingsmeier <po...@uni-duisburg.de> wrote:
> * "document" as currently defined on
> http://wiki.apache.org/lenya/CMSTerminology feels awkward, because it
> subtly re-defines a term that everybody thinks they know. (plus it seems
> not to reflect the most recent state of discussion.)
>
> i think we should consider that what we now call "document" is basically
> a container of language versions and their assets (and possibly their
> revision history), and adjust our terminology accordingly.
> i don't like "content item", because to me it sounds more like a small
> snippet, such as a media asset file. and it makes me think of a leaf
> node rather than a container. it feels like a stopgap word that i'd hate
> to have around forever.

That sums up what I have been attempting to write.

> as a workaround, i have used the term "document id", but it's not much
> better.

DocumentID is a property of a Document: the primary URL.
DocumentUNID is a property of a Document: the flat storage key.
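
To make the difference concrete, here is a small Java sketch. The class `DocumentIndex` and its methods are hypothetical, and the idea that moving a document changes only its DocumentID is an assumption of this example, not something decided in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical index that keeps the two properties apart.
final class DocumentIndex {
    // DocumentUNID -> stored content (the flat storage)
    private final Map<String, String> contentByUnid = new HashMap<>();
    // DocumentID (primary URL) -> DocumentUNID
    private final Map<String, String> unidByDocumentId = new HashMap<>();

    void add(String unid, String documentId, String content) {
        contentByUnid.put(unid, content);
        unidByDocumentId.put(documentId, unid);
    }

    // Moving a document changes its URL only; the storage key stays the same (assumption).
    void move(String oldDocumentId, String newDocumentId) {
        String unid = unidByDocumentId.remove(oldDocumentId);
        if (unid == null) {
            throw new IllegalArgumentException("unknown DocumentID: " + oldDocumentId);
        }
        unidByDocumentId.put(newDocumentId, unid);
    }

    // Returns the content for a URL, or null if no document has that URL.
    String lookup(String documentId) {
        String unid = unidByDocumentId.get(documentId);
        return unid == null ? null : contentByUnid.get(unid);
    }
}
```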

> there is also "site tree node", which is somewhat analogous and
> my current favourite, unless we allow multiple hierarchies, when it
> suddenly becomes orthogonal and plain wrong.

I am pushing for dynamic Sitetrees that generate data similar to
1.2's sitetree.xml.  A Sitetree Node is a node as used in menu.xsl and
other Navigation Elements.

> maybe it should just be "document container"?
Please no.

> * "language version" does not stand well on its own (version of what?),
> it clashes with "version" and totally explodes when you talk about
> "language version versions". i suggest to call it a "document", since
> that's what most people will intuitively do. in the case where we need
> to make i18n issues explicit, we can talk about the "document" (in the
> default language) and "document translations" (the other languages). or
> maybe even "document versions" belonging to a "document container" (or
> better yet, to a "site node").

Thank you.  You found the word we needed.  Forget "Language Version"
and "Document Language Node".  I like "Translation" much better.  It
implies Language without requiring multiple words.

> * let's ditch "version" for old states of a page. instead we can use
> "revision" consistently throughout, which is a lot more precise.

"Version" was gaining acceptance, but I agree with you for very
different reasons.  "Version" is used by JCR for versioning, so
datastore Nodes have "versions", but "Documents" have "Revisions".

Our structure can be:
Document/Translation/Revision

> * i would prefer "document type" instead of the current "resource type",
> because it is already in common usage (everybody knows DTDs). it does
> not quite include the processing part, but that trade-off is worth it
> given that it will give most people a precise idea of what it is.
> we can introduce the additional term "document type handler" for all the
> involved xslt and usecases, and summarize it all as a "document type
> module".

Depending on whether we decide on "Resources" or "Content" for the
collection of "Documents" and "Assets", then either "Resource Type" or
"Content Type" will have two possible values: "Document" or "Asset". 
"Content" seems to have won.

"Document Type" refers to the XML DTD used for a "Document", and
should have values like "xhtml", "blog", etc.
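
The two kinds of "type" could be kept apart like this. This is only a sketch; `ContentType` and `TypedContent` are made-up names, and leaving the Document Type empty for Assets is an assumption.

```java
import java.util.Optional;

// "Content Type": which kind of content this is.
enum ContentType { DOCUMENT, ASSET }

// Hypothetical holder for both type properties.
final class TypedContent {
    final ContentType contentType;
    // "Document Type": names the XML DTD, e.g. "xhtml" or "blog"; empty for assets.
    final Optional<String> documentType;

    private TypedContent(ContentType contentType, Optional<String> documentType) {
        this.contentType = contentType;
        this.documentType = documentType;
    }

    static TypedContent document(String documentType) {
        return new TypedContent(ContentType.DOCUMENT, Optional.of(documentType));
    }

    static TypedContent asset() {
        return new TypedContent(ContentType.ASSET, Optional.empty());
    }
}
```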

> * we currently have problems with "resource" and "item", because they
> are so generic. let's avoid re-defining them for concrete concepts. doc
> writers need good and unpolluted terms for abstract concepts sometimes.

We are renaming Lenya 1.2's "Resource" to "Asset" to match the GUI. 
"Resource" was proposed as an alternative to "Content" for the
collection of "Documents" and "Assets", but most prefer "Content"
because it matches "CMS".

"Item" has been suggested in multiple word names for various
definitions, but I hope those are gone.  If "Item" is used, I want it
to mean a "Document Element".

Summary:
- Publication
- - Content
- - - Document (has Type, DocumentUNID, DocumentID properties)
- - - - Translation (has Language and LiveRevision properties)
- - - - - Revision (has CreationDate and Editor properties)
- - - Asset (has Type, UNID, ID properties)
[- - - - Translation (has Language and LiveRevision properties)]
- - - - - Revision (has CreationDate and Editor properties)
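
The Document branch of this summary could look roughly like the following in Java. This is a sketch under assumptions, not a proposal for the real API: the class shapes, the use of a list index for LiveRevision, and plain strings for dates and editors are all invented for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class Revision {
    final String creationDate; // kept as an ISO date string for simplicity
    final String editor;
    final String text;

    Revision(String creationDate, String editor, String text) {
        this.creationDate = creationDate;
        this.editor = editor;
        this.text = text;
    }
}

final class Translation {
    final String language;
    private final List<Revision> revisions = new ArrayList<>();
    private int liveRevision = -1; // index of the live revision, -1 if none (assumption)

    Translation(String language) {
        this.language = language;
    }

    // Each edit creates a new Revision; returns its index.
    int addRevision(Revision revision) {
        revisions.add(revision);
        return revisions.size() - 1;
    }

    // Marks one existing Revision as the live one.
    void publish(int index) {
        if (index < 0 || index >= revisions.size()) {
            throw new IndexOutOfBoundsException("no such revision: " + index);
        }
        liveRevision = index;
    }

    // Returns the live Revision, or null if nothing has been published.
    Revision live() {
        return liveRevision < 0 ? null : revisions.get(liveRevision);
    }
}

final class Document {
    final String unid; // DocumentUNID
    final String id;   // DocumentID
    final String type; // e.g. "xhtml"
    private final Map<String, Translation> translations = new LinkedHashMap<>();

    Document(String unid, String id, String type) {
        this.unid = unid;
        this.id = id;
        this.type = type;
    }

    // Returns the Translation for a language, creating it on first use.
    Translation translation(String language) {
        return translations.computeIfAbsent(language, Translation::new);
    }
}
```

An Asset could reuse `Translation` and `Revision` unchanged if Assets get translations too.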

Will Assets have Translations?  Can it be optional?  How do we handle
PDFs in multiple languages?  One Asset per language?  Or one Asset
with multiple translations?

solprovider


Re: [RFC] Terminology

Posted by Michael Wechner <mi...@wyona.com>.

Jörn Nettingsmeier wrote:

> Michael Wechner wrote:
>
>> Jörn Nettingsmeier wrote:
>>
>>> hi *!
>>>
>>>
>>> i've been following the terminology discussion with interest. i had 
>>> been working on a glossary of terms already, but it got lost when i 
>>> renamed it, so bob was probably not aware of it when he started the 
>>> CMSTerminology page. anyway, here it is:
>>>
>>> http://wiki.apache.org/lenya/Glossary
>>
>>
>>
>> I think we should add two navigations to it:
>>
>> - alphabetical (as it is)
>> - Top down (publication/site, publet/module, ...)
>>
>> whereas I think the second is more important for the current
>> discussion than the first.
>
>
> definitely.
>
> as soon as there is a consensus about the currently debated terms, 
> let's get it into the staging documentation asap and use internal 
> links and anchors for cross-referencing.


I would say it's work in progress and we can start right away

> i can find some time to prepare it next weekend. is there a 
> glossary-friendly doctype in the stuff you use to prepare the docs? if 
> so, please point me to a DTD or schema.


you might want to create a schema yourself (in case you don't find
one; well, I tried, but didn't), and I am sure somebody
will point you to one after you have done so ;-)

>
> probably there should be two separate pages: a glossary where we can 
> dump any term that we can think of (as it is now), and a more concise 
> structural overview of key terms following your "top down" suggestion.


well, one is the repo (and the repo navigation)
and the others are two navigations (or views): alphabetical and top down.

>
>
> i'll start one.


Thanks

Michi

>
>


-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org



Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

Michael Wechner wrote:
> Jörn Nettingsmeier wrote:
> 
>> hi *!
>>
>>
>> i've been following the terminology discussion with interest. i had 
>> been working on a glossary of terms already, but it got lost when i 
>> renamed it, so bob was probably not aware of it when he started the 
>> CMSTerminology page. anyway, here it is:
>>
>> http://wiki.apache.org/lenya/Glossary
> 
> 
> I think we should add two navigations to it:
> 
> - alphabetical (as it is)
> - Top down (publication/site, publet/module, ...)
> 
> whereas I think the second is more important for the current
> discussion than the first.

definitely.

as soon as there is a consensus about the currently debated terms, let's 
get it into the staging documentation asap and use internal links and 
anchors for cross-referencing. i can find some time to prepare it next 
weekend. is there a glossary-friendly doctype in the stuff you use to 
prepare the docs? if so, please point me to a DTD or schema.

probably there should be two separate pages: a glossary where we can 
dump any term that we can think of (as it is now), and a more concise 
structural overview of key terms following your "top down" suggestion.

i'll start one.

-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Michael Wechner <mi...@wyona.com>.

Jörn Nettingsmeier wrote:

> hi *!
>
>
> i've been following the terminology discussion with interest. i had 
> been working on a glossary of terms already, but it got lost when i 
> renamed it, so bob was probably not aware of it when he started the 
> CMSTerminology page. anyway, here it is:
>
> http://wiki.apache.org/lenya/Glossary


I think we should add two navigations to it:

- alphabetical (as it is)
- Top down (publication/site, publet/module, ...)

whereas I think the second is more important for the current
discussion than the first.

Michi

-- 
Michael Wechner
Wyona      -   Open Source Content Management   -    Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
michael.wechner@wyona.com                        michi@apache.org



Re: [RFC] Terminology

Posted by Jörn Nettingsmeier <po...@uni-duisburg.de>.

hi *!


i've been following the terminology discussion with interest. i had been 
working on a glossary of terms already, but it got lost when i renamed 
it, so bob was probably not aware of it when he started the 
CMSTerminology page. anyway, here it is:

http://wiki.apache.org/lenya/Glossary

all definitions on Bob's page have been copied over now, and i suggest 
to use this page for further documentation. there are notes about 
debated items that hopefully reflect the current state of discussion, if 
not, please correct it.

perhaps the CMSTerminology page could be used exclusively for the 
cross-reference table that bob has outlined there - it's definitely a 
very interesting thing for new users.

-.-

i came across some problems with the current terms when actually trying 
to write the docs. it's a good way to find out whether the new terms are 
nice or awkward to use:

* "document" as currently defined on 
http://wiki.apache.org/lenya/CMSTerminology feels awkward, because it 
subtly re-defines a term that everybody thinks they know. (plus it seems 
not to reflect the most recent state of discussion.)

i think we should consider that what we now call "document" is basically 
a container of language versions and their assets (and possibly their 
revision history), and adjust our terminology accordingly.
as a workaround, i have used the term "document id", but it's not much 
better. there is also "site tree node", which is somewhat analogous and 
my current favourite, unless we allow multiple hierarchies, when it 
suddenly becomes orthogonal and plain wrong. :(
maybe it should just be "document container"?

i don't like "content item", because to me it sounds more like a small 
snippet, such as a media asset file. and it makes me think of a leaf 
node rather than a container. it feels like a stopgap word that i'd hate 
to have around forever.

* "language version" does not stand well on its own (version of what?), 
it clashes with "version" and totally explodes when you talk about 
"language version versions". i suggest to call it a "document", since 
that's what most people will intuitively do. in the case where we need 
to make i18n issues explicit, we can talk about the "document" (in the 
default language) and "document translations" (the other languages). or 
maybe even "document versions" belonging to a "document container" (or 
better yet, to a "site node").

* let's ditch "version" for old states of a page. instead we can use 
"revision" consistently throughout, which is a lot more precise.

* i would prefer "document type" instead of the current "resource type", 
because it is already in common usage (everybody knows DTDs). it does 
not quite include the processing part, but that trade-off is worth it 
given that it will give most people a precise idea of what it is.
we can introduce the additional term "document type handler" for all the 
involved xslt and usecases, and summarize it all as a "document type 
module".

* we currently have problems with "resource" and "item", because they 
are so generic. let's avoid re-defining them for concrete concepts. doc 
writers need good and unpolluted terms for abstract concepts sometimes.


looking forward to your comments,


jörn





-- 
"Open source takes the bullshit out of software."
	- Charles Ferguson on TechnologyReview.com

--
Jörn Nettingsmeier, EDV-Administrator
Institut für Politikwissenschaft
Universität Duisburg-Essen, Standort Duisburg
Mail: pol-admin@uni-duisburg.de, Telefon: 0203/379-2736


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:

[...]

> Andreas prefers "Content Item".  Bob dislikes "Resource".  I will
> accept either.  Anybody else have a preference?

Actually both of them are fine with me. Maybe "resource" is really
a little more convenient when it comes to class and variable names.

-- Andreas

-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/3/06, Bob Harner <bo...@gmail.com> wrote:
> On 2/3/06, solprovider@apache.org <so...@apache.org> wrote:
> > In Lenya1.2:
> > "Assets" is the GUI name for additional data files.
> >
> > "Resources" is the file system directory name for "Assets".  It was
> > only seen by developers, and was never defined well.  The directory
> > name should disappear in Lenya1.4 if the Assets are moved to the
> > repository.  I like Resource as the superclass for Documents and
> > Assets.
> >
> > "Content Item" implies an "Item" of "Content".  1.2's "Content"
> > contained "Documents", so "Content Item" is an alternate name for a
> > "Document".
>
> Did 1.2's "Content" really only contain "Documents"?  I guess I
> consider "Content" a more general, inclusive term than you do, so to
> me it fits the need very well.  Lenya is a "content management system"
> after all, not a "resource management system".

In Lenya 1.2, "content" was the name of the directory containing the
documents, but the same argument that "resources" was only a directory
name and never seen by editors applies.

/pub/resources/document
/pub/resources/asset
The superclass is "Resource".

/pub/content/document
/pub/content/asset
The superclass is "Content Item"

As much as I dislike using two words where one is sufficient, I agree
a CMS should use the word "content" somewhere.  Our marketing could
even say "Lenya is more than a Content Management System; it is also
a Resource Management System."  That means nothing, but it could
influence business managers.

Andreas prefers "Content Item".  Bob dislikes "Resource".  I will
accept either.  Anybody else have a preference?

solprovider


Re: [RFC] Terminology

Posted by Bob Harner <bo...@gmail.com>.

On 2/3/06, solprovider@apache.org <so...@apache.org> wrote:
> You summarized my comments well.
>
> On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
> > The structure vs. index question needs IMO
> > some more discussion. "Structure" sounds more like a static way to
> > organize things, "index" rather like a dynamic way. But maybe that's
> > only my personal perception.
>
> "Indexes can have a flat or hierarchical structure."
> Agreed.  "Index" implies dynamic, whereas "structure" implies static.
> Since I want there to be multiple configurable methods of retrieving
> the relationships between Documents, Indexes is the better choice so
> the technology will follow.
>
> ===
> In Lenya1.2:
> "Assets" is the GUI name for additional data files.
>
> "Resources" is the file system directory name for "Assets".  It was
> only seen by developers, and was never defined well.  The directory
> name should disappear in Lenya1.4 if the Assets are moved to the
> repository.  I like Resource as the superclass for Documents and
> Assets.
>
> "Content Item" implies an "Item" of "Content".  1.2's "Content"
> contained "Documents", so "Content Item" is an alternate name for a
> "Document".

Did 1.2's "Content" really only contain "Documents"?  I guess I
consider "Content" a more general, inclusive term than you do, so to
me it fits the need very well.  Lenya is a "content management system"
after all, not a "resource management system".

> "Item" is used by other systems for a piece of data (a
> cell in a table in most databases), and XML uses Item as a superclass
> for Nodes, Elements, Properties/Attributes.

Agreed, so I wouldn't propose the use of the word "item" without
context, but only as part of the phrase "content item".

>
> solprovider
>


Re: [RFC] Terminology

Posted by so...@apache.org.

You summarized my comments well.

On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
> The structure vs. index question needs IMO
> some more discussion. "Structure" sounds more like a static way to
> organize things, "index" rather like a dynamic way. But maybe that's
> only my personal perception.

"Indexes can have a flat or hierarchical structure."
Agreed.  "Index" implies dynamic, whereas "structure" implies static.
Since I want there to be multiple configurable methods of retrieving
the relationships between Documents, Indexes is the better choice so
the technology will follow.

===
In Lenya1.2:
"Assets" is the GUI name for additional data files.

"Resources" is the file system directory name for "Assets".  It was
only seen by developers, and was never defined well.  The directory
name should disappear in Lenya1.4 if the Assets are moved to the
repository.  I like Resource as the superclass for Documents and
Assets.

"Content Item" implies an "Item" of "Content".  1.2's "Content"
contained "Documents", so "Content Item" is an alternate name for a
"Document".  "Item" is used by other systems for a piece of data (a
cell in a table in most databases), and XML uses Item as a superclass
for Nodes, Elements, Properties/Attributes.

solprovider


Re: [RFC] Terminology

Posted by Andreas Hartmann <an...@apache.org>.

solprovider@apache.org wrote:
> On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
>> That's not a vote, but it would be appreciated if you could state
>> your opinion. I'd like to emphasize that this discussion is not
>> about the API itself, but only about the terms we use. So please
>> don't go and complain about the concept of areas or language versions.
>> These issues can be discussed later on.
> 
> Well, that eliminates some of my ideas.  The problem is the technology
> follows the terminology.

It's the chicken+egg problem. This thread is just to create a
basis for discussion. Class names etc. can be decided later.
But I see the problem: deciding on the terms can depend on how
it's implemented.

Thanks a lot for your comments, here's how I'd summarize
your statements (comments below):


 > ----
 >
 > A website in Lenya is called a
 >
 > - publication [+andreas] [+solprovider]
 > - site
 > - ...
 >
 > ----
 >
 > The resources of a website can exist at the same time in several
 >
 > - areas [+andreas]
 > - ...
 > - drop the concept [+solprovider]
 >
 > ----
 >
 > The entirety of plain information (without structuring) is called
 >
 > - content [+andreas]
 > - resources [+solprovider]
 > - ...
 >
 > ----
 >
 > The set of language versions of a piece of information is called
 >
 > - content node
 > - content item [+andreas]
 > - document
 > - resource [+solprovider]
 > - ...
 >
 > ----
 >
 > A specific language version is called
 >
 > - content item
 > - language version [+andreas] [+solprovider]
 > - document
 > - ...
 >
 > ----
 >
 > A version in the history of a language version is called
 >
 > - (history) version [+andreas] [+solprovider]
 > - ...
 >
 > ----
 >
 > The structuring information (there may be several of them) are
 >
 > - sites
 > - structures [+andreas]
 > - navigations
 > - indexes [+solprovider]
 >
 > (maybe we have to make a distinction between "structure" and "navigation")
 >
 > ----
 >
 > A node in the structure is called
 >
 > - (structure/index) node [+andreas] [+solprovider]
 > - site node
 > - navigation item
 > - ...
 >
 >



>> ----
>> A website in Lenya is called a
>> - publication [+1]
>> - site
> 
> Publication: Collection of data and special processing instructions.
> 
> Website: A collection of Publications and other material accessed
> through a single Internet domain.
> 
>> ----
>> The resources of a website can exist at the same time in several
>> - areas [+1]
> Cannot comment since this terminology hurts the technology.
> 
>> ----
>> The entirety of plain information (without structuring) is called
>> - content [+1]
>> - resources
> 
> Decision time.  Will "Assets" (images and other files) be stored with
> the documents?

IMO yes, all of them should be treated in the same way.


> Historically, "Content" referred only to a collection of "Documents",
> and "Resources" referred to a collection of "Assets".  If "Assets" are
> not included, we should keep the term "Content".  Either term
> ("Content" or "Resources") is good for the collection of Documents and
> Assets, but changing it to "Resources" may reduce confusion with the
> historical definition of "Content".
> 
> 
>> ----
>> The set of language versions of a piece of information is called
>> - content node
>> - content item [+1]
>> - document
>> - resource
> 
> I prefer "Document" for an XML Document, because that is what XML and
> Lenya1.2 call it.

OK, but that would be a specialization of "content item" / "resource".


> To refer to something that could be either a Document or an Asset, I
> prefer "Resource".  "Content Node" uses two words for the same
> concept.

That would mean

Resource
- Document extends Resource (XML)
- Asset extends Resource (binary, text, ...)
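
As a rough illustration of that hierarchy, here is a minimal sketch in
Python. It is not Lenya's API: the class names follow the terms proposed
in this thread, and the fields (path, xml, mime_type) are hypothetical
placeholders chosen only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Common superclass for anything stored in a publication (hypothetical)."""
    path: str


@dataclass
class Document(Resource):
    """An XML resource."""
    xml: str = ""


@dataclass
class Asset(Resource):
    """A non-XML resource such as an image or a plain text file."""
    mime_type: str = "application/octet-stream"


items = [Document("/index", "<page/>"), Asset("/logo.png", "image/png")]

# Both kinds can be handled uniformly through the shared superclass.
paths = [item.path for item in items]
```

With a shared superclass, code that only needs the common properties
does not have to care whether it is given a Document or an Asset.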

[...]

> For me, the following text sounds quite good:
> In Lenya, a Website consists of one or more Publications.  A
> Publication is a collection of Resources (Documents and Assets), and
> any special functionality.   Documents (and soon Assets) are
> maintained by Language, and each edit creates a new Version for
> historical documentation and the ability to roll back to an older
> Version.  Indexes provide selection of Documents for use by various
> functionality, including Navigation Elements which provide easy access
> to the Documents through the web interface.

That sounds good to me as well. "Resource" and "content item" are
both fine with me. The structure vs. index question needs IMO
some more discussion. "Structure" sounds more like a static way to
organize things, "index" rather like a dynamic way. But maybe that's
only my personal perception.

-- Andreas


-- 
Andreas Hartmann
Wyona Inc.  -   Open Source Content Management   -   Apache Lenya
http://www.wyona.com                      http://lenya.apache.org
andreas.hartmann@wyona.com                     andreas@apache.org



Re: [RFC] Terminology

Posted by so...@apache.org.

On 2/3/06, Andreas Hartmann <an...@apache.org> wrote:
> That's not a vote, but it would be appreciated if you could state
> your opinion. I'd like to emphasize that this discussion is not
> about the API itself, but only about the terms we use. So please
> don't go and complain about the concept of areas or language versions.
> These issues can be discussed later on.

Well, that eliminates some of my ideas.  The problem is the technology
follows the terminology.

> ----
> A website in Lenya is called a
> - publication [+1]
> - site

Publication: Collection of data and special processing instructions.

Website: A collection of Publications and other material accessed
through a single Internet domain.

> ----
> The resources of a website can exist at the same time in several
> - areas [+1]
Cannot comment since this terminology hurts the technology.

> ----
> The entirety of plain information (without structuring) is called
> - content [+1]
> - resources

Decision time.  Will "Assets" (images and other files) be stored with
the documents?

Historically, "Content" referred only to a collection of "Documents",
and "Resources" referred to a collection of "Assets".  If "Assets" are
not included, we should keep the term "Content".  Either term
("Content" or "Resources") is good for the collection of Documents and
Assets, but changing it to "Resources" may reduce confusion with the
historical definition of "Content".

> ----
> The set of language versions of a piece of information is called
> - content node
> - content item [+1]
> - document
> - resource

I prefer "Document" for an XML Document, because that is what XML and
Lenya1.2 call it.

To refer to something that could be either a Document or an Asset, I
prefer "Resource".  "Content Node" uses two words for the same
concept.

"Content Item" sounds like a field or a section of a Document.

> ----
> A specific language version is called
> - content item
> - language version [+1]
> - document
> A version in the history of a language version is called
> - (history) version [+1]

Agreed.  "Documents" have "Language Versions" which have "Versions". 
"Language Version" sounds fine.  "Document Language Node" is more
accurate, but too long.

The nodes which track the historical changes for a specific language
must be "Versions".
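
The nesting described here (Documents have Language Versions, which have
Versions) could be sketched roughly as follows. This is only an
illustration under assumptions: the class and method names (edit,
current, language_version) are hypothetical and do not come from Lenya.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Version:
    """One entry in the edit history of a language version."""
    number: int
    body: str


@dataclass
class LanguageVersion:
    """The content of a document in one language, with its history."""
    language: str
    versions: List[Version] = field(default_factory=list)

    def edit(self, body: str) -> Version:
        # Each edit appends a new Version; older ones are kept,
        # so an earlier state can still be looked up.
        version = Version(len(self.versions) + 1, body)
        self.versions.append(version)
        return version

    def current(self) -> Version:
        return self.versions[-1]


@dataclass
class Document:
    """A document, holding one LanguageVersion per language code."""
    path: str
    languages: Dict[str, LanguageVersion] = field(default_factory=dict)

    def language_version(self, language: str) -> LanguageVersion:
        # Create the language version on first access.
        return self.languages.setdefault(language, LanguageVersion(language))


doc = Document("/index")
doc.language_version("en").edit("Hello")
doc.language_version("en").edit("Hello, world")
doc.language_version("de").edit("Hallo")
```

In this sketch the history belongs to the language version rather than
to the document, which matches the statement that the nodes tracking
changes for a specific language are the "Versions".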

> ----
> The structuring information (there may be several of them) are
> - sites
> - structures [+1]
> - navigations
> (maybe we have to make a distinction between "structure" and "navigation")

I think I refer to these concepts as "Navigation Elements" and "Indexes".
Navigation Element: A DIV used by XSL (or the code that creates one)
to provide access to Documents in a structured manner: menus,
breadcrumbs, website maps.

Indexes: Structural information for selecting, sorting, and retrieving
the hierarchical relations of a collection of Documents.

"Sites" is commonly used for "Websites".
"Structures" is fine, but "Indexes" should fit the technology better.
"Navigation" is better/shorter than "Navigation Element", but has the
issue that it is commonly used for the collection of all methods to
navigate a website (Navigation Elements and static links).

> ----
> A node in the structure is called
> - (structure) node [+1]
> - site node
> - navigation item

"Node" is the standard term for an element of a data structure.

---
For me, the following text sounds quite good:
In Lenya, a Website consists of one or more Publications.  A
Publication is a collection of Resources (Documents and Assets), and
any special functionality.   Documents (and soon Assets) are
maintained by Language, and each edit creates a new Version for
historical documentation and the ability to roll back to an older
Version.  Indexes provide selection of Documents for use by various
functionality, including Navigation Elements which provide easy access
to the Documents through the web interface.

solprovider
