Posted to dev@forrest.apache.org by Ross Gardler <rg...@apache.org> on 2004/01/26 14:59:18 UTC

docv12txt.xsl

I recently found the need to create a stylesheet to convert xdocs to
text. It's not quite fully complete, in particular tables are not too
well represented at present (I don't need them in my use case at present).

I've copied the output of http://127.0.0.1:8888/document-v12.txt using
this stylesheet below. Would this be of use in CVS?

Ross

Output of http://127.0.0.1:8888/document-v12.txt
NB: the mail client, not the stylesheet, will have
wrapped long lines.
-----------------------------------------------

*********************
The document-v1.2 DTD
*********************

____________________________________________________________
Notice:
------------------------------------------------------------
This document doesn't make any sense at all.
____________________________________________________________

A nonsense document using all possible elements in the current
document-v12.dtd .


Changes since document-v11
**************************

doc-v12 enhances doc-v11 by relaxing various restrictions that were
found to be unnecessary.

   * Links (link,jump,fork) and inline elements (br,img,icon,acronym)
are allowed inside title.
   * Paragraphs (p,source,note,warning,fixme), table and figure,anchor
are allowed inside li.
   * Paragraphs (p,source,note,warning,fixme), lists (ol,ul,dl), table,
figure,anchor are allowed inside definition lists (dd) and tables (td
and dh).
   * Inline content
(strong,em,code,sub,sup,br,img,icon,acronym,link,jump,fork) is allowed
in strong and em.

Sample Content
**************

Hint: See the xml source to see how the various elements are used and
see the DTD documentation.

This is a simple paragraph. Most documents contain a fair amount of
paragraphs. Paragraphs are called <p>.
With the <p xml:space="preserve"> attribute, you can declare that
whitespace should be preserved, without implying it is in any other way
special.
A number of in-line elements are available in the DTD, we will show them
inside an unordered list (<ul>):

   * Here is a simple list item ( <li> ).
   * Have you seen the use of the <code> element in the previous item?
   * Also, we have <sub> and <sup> elements to show
contentaboveorbelowthe text baseline.
   * There is a facility toemphasizecertain words using the <em>
<strong> elements.
   * We can use <icon> s, too.
   * Another possibility is the <img> element:, which offers the ability
to refer to an image map.
   * We have elements for hyperlinking:
          <link href="faq.html">
             Use this to link [See link target: faq.html] to another
document. As per normal, this will open the new document in the same
browser window.
          <link href="#section">
             Use this to link [See link target: #section] to the named
anchor in the current document.
          <link href="contrib.html#cvshowto">
             Use this to link [See link target: contrib.html#cvshowto]
to another document and go to the named anchor. This will open the new
document in the same browser window.
          <jump href="contrib.html">
             Use this to jump [See link target: contrib.html] to another
document and optionally go to a named anchor [See link target:
contrib.html#cvshowto] within that document. This will open the new
document in the same browser window. So what is the difference between
link and jump? The jump behaves differently, in that it will replace any
frames in the current window. This is the equivalent of <a ...
target="_top">
          <fork href="faq.html">
             Use this to fork [See link target: faq.html] your
webbrowser to another document. This will open the document in a new,
unnamed browser window. This is the equivalent of <a ... target="_blank">
   * Oh, by the way, a definition list <dl> was used inside the previous
list item. We could put another
     * unordered list
     * inside the list item                   A sample nested table

  __________________________________________________________
| Or even tables | inside lists | |
!__________________________________________________________!

So far for the in-line elements, let's look at some paragraph-level
elements.

____________________________________________________________
Fixme (SN)
------------------------------------------------------------
The <fixme> element is used for stuff which still needs work. Mind the
author attribute!
____________________________________________________________

____________________________________________________________
Note
------------------------------------------------------------
Use the <note> element to draw attention to something, e.g. ...The
<code> element is used when the author can't express himself clearly
using normal sentences ;-)
____________________________________________________________

____________________________________________________________
Warning
------------------------------------------------------------
Sleep deprivation can be the result of being involved in an open source
project. (a.k.a. the <warning> element).
____________________________________________________________

____________________________________________________________
Important
------------------------------------------------------------
If you want your own labels for notes and warnings, specify them using
the label attribute.
____________________________________________________________

Apart from unordered lists, we have ordered lists too, of course.

   1. Item 1
   2. Item 2
     2.1. Item 2.1
   3. This should be 3 if my math is still OK.

Using sections
==============

You can use sections to put some structure in your document. For some
strange historical reason, the section title is an attribute of the
<section> element.


Sections, the sequel
====================

Just some second section.


Section 2.1
-----------

Which contains a subsection (2.1).


Showing preformatted source code
================================

Enough about these sections. Let's have a look at more interesting
elements, <source> for instance:

  // This example is from the book _Java in a Nutshell_ by David Flanagan.
           // Written by David Flanagan.  Copyright (c) 1996 O'Reilly &
Associates.
           // You may study, use, modify, and distribute this example
for any purpose.
           // This example is provided WITHOUT WARRANTY either expressed
or implied.

           import java.applet.*;    // Don't forget these import statements!
           import java.awt.*;

           public class FirstApplet extends Applet {
           // This method displays the applet.
           // The Graphics class is how you do all drawing in Java.
           public void paint(Graphics g) {
           g.drawString("Hello World", 25, 50);
           }
           }

Please take care to still use a sensible line-length within your source
elements.


Using tables
============

And now for a table:
                        Table caption
  __________________________________________________________
| heading cell | heading cell | |
| data cell | data cell | |
| Tables can be nested |
   * and can include most other elements, like lists | |
!__________________________________________________________!

Not much of attributes with <table>, if you ask me.


Using figures
=============

And a <figure> to end all of this. Note that this can also be
implemented with an <img> element.
by Jeff Turner




Re: docv12txt.xsl

Posted by Rick Tessner <ri...@onnadayr.ca>.
On Sun, 2004-08-01 at 20:12, David Crossley wrote:

> At some stage we should add a sample to the "seed site".

Makes sense ... I'll look at how best to do that.

> I added a Jira Issue (FOR-240) for the todo notes provided above.

Thank you!  I guess I should have opened a Jira issue as a "New
Feature".  Will do that in future.

> (Wondering out loud, no time to try ...)
> Is it possible to use a sub-sitemap pod.xmap?
> The main sitemap is getting very cluttered.

Hmmm ... I'll see what it takes to do that.

> In future please don't use email attachments unless you
> really must. The Issue tracker is far better for many
> reasons:
> http://marc.theaimsgroup.com/?l=forrest-dev&m=109140623712387

Ewww ... that does come out ugly ... Will use Jira in future for New
Features as well as bugs. Sorry 'bout that.

-- 
Rick Tessner <ri...@onnadayr.ca>


Re: docv12txt.xsl

Posted by David Crossley <cr...@apache.org>.
Rick Tessner wrote:
> Attached is a copy of document2pod.xsl.  It handles lists, rudimentary
> tables, verbatim text.  It's rough, but seems to work okay.  It doesn't
> handle definition lists, notes and warnings (or nested tables, lists in
> tables, etc).

Thanks, that is committed now.

At some stage we should add a sample to the "seed site".

I added a Jira Issue (FOR-240) for the todo notes provided above.

(Wondering out loud, no time to try ...)
Is it possible to use a sub-sitemap pod.xmap?
The main sitemap is getting very cluttered.

In future please don't use email attachments unless you
really must. The Issue tracker is far better for many
reasons:
http://marc.theaimsgroup.com/?l=forrest-dev&m=109140623712387

-- 
David Crossley


Re: docv12txt.xsl

Posted by Rick Tessner <ri...@onnadayr.ca>.
On Sat, 2004-07-31 at 20:13, David Crossley wrote:
> Rick Tessner wrote:

> > If there's any interest, I'd contribute the document2pod XSL that I'm
> > working on as well.
> 
> Yes, definitely interest. I don't even know what its function is,
> and i am saying yes please, because anything that extends the use
> of Forrest in the documentation arena is within scope.

Hi all,

Attached is a copy of document2pod.xsl.  It handles lists, rudimentary
tables, verbatim text.  It's rough, but seems to work okay.  It doesn't
handle definition lists, notes and warnings (or nested tables, lists in
tables, etc).

<aside>
        POD, Plain Old Documentation, comes from the perl world.  It's
        basically text with simple directives to do headers, lists,
        verbatim text, and some minor in-line bits for bolding, italics
        etc.
        
        On a *nix machine with perl installed, you should be able to do
        a "man perlpod" to get an overview of what POD is.
</aside>

I'm hoping that we can have a common/xslt/text directory with a
document2txt.xsl and document2pod.xsl that will share some common XSLs
for doing text-based handling of tables, lists, string functions, etc.

Attached are three items:

      * src/core/context/skins/common/xslt/text/document2pod.xsl.  Along
        with the included patch to the sitemap, you should be able to
        get POD for any forrest document.
      * A patch to the sitemap.xmap that can podify a given forrest
        document by using the ".pod" suffix.
      * An example of a document transformed to POD (gzip'd):
        sitemap-ref.pod.  I used this one since it has lists and tables.
        If you're on a *nix machine that has perl installed, you should
        be able to do a "pod2man sitemap-ref.pod | nroff -man | less" to
        see it as a UNIX manpage.

-- 
Rick Tessner <ri...@onnadayr.ca>

Re: docv12txt.xsl

Posted by David Crossley <cr...@apache.org>.
Rick Tessner wrote:
> Ross Gardler wrote:
> > I recently found the need to create a stylesheet to convert xdocs to
> > text. It's not quite fully complete, in particular tables are not too
> > well represented at present (I don't need them in my use case at present).
> > 
> > I've copied the output of http://127.0.0.1:8888/document-v12.txt using
> > this stylesheet below. Would this be of use in CVS?
> 
> I'm finding that I have a similar need at the moment as well.  I'm
> working on a document2pod XSL.
> 
> It doesn't look like your docv12txt.xsl was committed to CVS (now SVN)
> at all.  Would it be possible for it to be committed?
> 
> If there's any interest, I'd contribute the document2pod XSL that I'm
> working on as well.

Yes, there is definitely interest. I don't even know what its function
is, and I am saying yes please, because anything that extends the use
of Forrest in the documentation arena is within scope.

-- 
David Crossley


Re: adding EXSLT capabilities

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> David Crossley wrote:
> > The next issue with using the EXSLT is how to
> > distribute it within Forrest and how to manage it
> > and keep our version up-to-date.
> > 
> > Should we store the total distribution (all-exslt.zip)
> > as lib/core/exslt-${UTCtimestamp}.zip for example,
> > and then have the forrest build system unpack it into
> > the core stylesheets area?
> 
> For the document2text I just need a single function within one of the 
> packages. However, there are some very useful functions in there.
> 
> The all-exslt.zip is 1.1 MB in size so this seems like overkill. The 
> str.zip is just 85 KB. Based on this I would suggest it would be best to 
> just include the str.zip at this stage and we remember that the other 
> facilities are there when we hit particular problems.
> 
> Comments?

We could add them as needed then. Still, it was the management
method and naming convention that I was looking for:
lib/core/exslt-str-${UTCtimestamp}.zip, unpacked on build.
Would that work?

-- 
David Crossley


Re: adding EXSLT capabilities

Posted by Ross Gardler <rg...@apache.org>.
David Crossley wrote:
> The next issue with using the EXSLT is how to
> distribute it within Forrest and how to manage it
> and keep our version up-to-date.
> 
> Should we store the total distribution (all-exslt.zip)
> as lib/core/exslt-${UTCtimestamp}.zip for example,
> and then have the forrest build system unpack it into
> the core stylesheets area?
> 

For the document2text I just need a single function within one of the 
packages. However, there are some very useful functions in there.

The all-exslt.zip is 1.1 MB in size so this seems like overkill. The 
str.zip is just 85 KB. Based on this I would suggest it would be best to 
just include the str.zip at this stage and we remember that the other 
facilities are there when we hit particular problems.

Comments?

Ross

adding EXSLT capabilities

Posted by David Crossley <cr...@apache.org>.
The next issue with using the EXSLT is how to
distribute it within Forrest and how to manage it
and keep our version up-to-date.

Should we store the total distribution (all-exslt.zip)
as lib/core/exslt-${UTCtimestamp}.zip for example,
and then have the forrest build system unpack it into
the core stylesheets area?

-- 
David Crossley


EXSLT license and license convention (Was: docv12txt.xsl)

Posted by David Crossley <cr...@apache.org>.
Ross Gardler wrote:
> Ross Gardler wrote:
> 
> > So, currently the code is regarded as public domain. What does this mean 
> > with regards to committing it to Forrest? - one day I'll understand all 
> > this legal stuff (yeah right!)

We are all in the same boat, but we have help in other
ASF forums whenever we need it.

> A little more digging in their archives turned up the following mail, 
> which was posted in June 2004, in response to my mail of January 2004. 
> Anyway, I think it contains our answer:

Good discovery Ross. All you need to do is create a file in
our /legal/ directory following the naming conventions [*],
add their jar, and go. Since they also ask for attribution,
you would add them to NOTICE.txt.

[*] naming convention for license files

If the supporting product is bundled as a jar file, and provides
a separate license file, then add that file with the same jar name
and the .txt extension.

Otherwise use your judgement as to the appropriate filename
and the contents of that license file. Some examples in this
category are the w3c-dtd and oreilly. This latter seems to cover
a specific case and also broadly all O'Reilly code examples
(together with attribution for each case).

-- 
David Crossley


Re: docv12txt.xsl

Posted by Ross Gardler <rg...@apache.org>.
Ross Gardler wrote:

>>> The problem is that it includes a small amount of code from another 
>>> source that does not publish its licensing terms. I have, via their 

<snip/>

> http://lists.fourthought.com/pipermail/exslt/2004-July/002096.html

<snip/>

> So, currently the code is regarded as public domain. What does this mean 
> with regards to committing it to Forrest? - one day I'll understand all 
> this legal stuff (yeah right!)

A little more digging in their archives turned up the following mail, 
which was posted in June 2004, in response to my mail of January 2004. 
Anyway, I think it contains our answer:


--- copied text from 
http://lists.fourthought.com/pipermail/exslt-manage/2004-June/001203.html
  ---

You're free and clear to use the templates, but you should credit and
link to exslt.org.  As for copyright, there is no legal entity for
EXSLT, but you can credit the original individual managers as follows:

EXSLT templates are Copyright 2001-2004 Jeni Tennison, David Pawson,
James Fuller and Uche Ogbuji

There is no "license" as such, although I started discussion for
choosing one (probably a creative commons license).  You're not required
to attribute or post notice, so you'd be fine within the Apache license,
but I'd encourage you to pass on this attribution request, anyway.

Thanks.


-- 
Uche Ogbuji                                    Fourthought, Inc.


--- end copied text ----

So does this mean we can simply put the relevant code into the XSL and 
place a comment to the effect "This template is Copyright...." in the 
code (as Uche says, this is not required but it is good manners)? Does 
the Apache side of things allow us to do this?

Of course, it may be easier for someone with more XSLT skill than I to 
simply write an original version of the template (create a string of a 
given character that is length x).
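
As a rough illustration of what such an original template might look
like, here is a minimal sketch in plain XSLT 1.0 with no extensions. The
template name "repeat-char" and its parameters "char" and "count" are
hypothetical names chosen for this example; this is not the EXSLT code
and not the code in the attached stylesheet. The usage template assumes
that section titles are held in a child title element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: emits $char repeated $count times.
       Recurses once per character, so it suits short strings
       such as heading underlines. -->
  <xsl:template name="repeat-char">
    <xsl:param name="char" select="'*'"/>
    <xsl:param name="count" select="0"/>
    <xsl:if test="$count &gt; 0">
      <xsl:value-of select="$char"/>
      <xsl:call-template name="repeat-char">
        <xsl:with-param name="char" select="$char"/>
        <xsl:with-param name="count" select="$count - 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Usage: print a section title, then underline it with '='
       characters of the same length.
       Assumption: the title is a child element of section. -->
  <xsl:template match="section/title">
    <xsl:variable name="text" select="normalize-space(.)"/>
    <xsl:value-of select="$text"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:call-template name="repeat-char">
      <xsl:with-param name="char" select="'='"/>
      <xsl:with-param name="count" select="string-length($text)"/>
    </xsl:call-template>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The recursion depth equals the requested length, which should be fine
for line-width strings but would be a poor fit for very long ones.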

Ross



Re: docv12txt.xsl

Posted by Ross Gardler <rg...@apache.org>.
Dave Brondsema wrote:

> Ross Gardler wrote:
> 
>> Rick Tessner wrote:
>>

<snip/>

>>> It doesn't look like your docv12txt.xsl was committed to CVS (now SVN)
>>> at all.  Would it be possible for it to be committed?
>>
>>
>>
>> The problem is that it includes a small amount of code from another 
>> source that does not publish its licensing terms. I have, via their 
>> mail lists, asked if I can use the code over here but they appear to 
>> be dead.
>>
> 
> If it's a small amount, we could probably re-code that section 
> ourselves.  Can you explain what's missing or link to it?

I've attached a modified version of my solution to the issue at 
http://issues.cocoondev.org/jira/secure/ViewIssue.jspa?key=FOR-125.

The comment describes what is missing; in short, we need a way of 
creating a string of x characters, but read on...

I've also had another look at the project I took the snippet of code 
from, and it seems it is not dead after all. In fact, last month's 
archives have the following:

http://lists.fourthought.com/pipermail/exslt/2004-July/002096.html

which says:

 > I have look in the archives and on the website and I can not find
 > any license or permission to use in a commercial product your XSLT
 > template implementation of the date:difference function.
 >
 > Where can I find such a notice or may I please have your permission
 > to use the above template?

Yes; all the EXSLT functions and templates are free for anyone to use.

---

So, currently the code is regarded as public domain. What does this mean 
with regards to committing it to Forrest? - one day I'll understand all 
this legal stuff (yeah right!)

Ross

Re: docv12txt.xsl

Posted by Dave Brondsema <da...@brondsema.net>.
Ross Gardler wrote:
> Rick Tessner wrote:
> 
>> On Mon, 2004-01-26 at 05:59, Ross Gardler wrote:
>>
>>> I recently found the need to create a stylesheet to convert xdocs to
>>> text. It's not quite fully complete, in particular tables are not too
>>> well represented at present (I don't need them in my use case at 
>>> present).
>>>
>>> I've copied the output of http://127.0.0.1:8888/document-v12.txt using
>>> this stylesheet below. Would this be of use in CVS?
>>
>>
>>
>> I'm finding that I have a similar need at the moment as well.  I'm
>> working on a document2pod XSL.
>>
>> It doesn't look like your docv12txt.xsl was committed to CVS (now SVN)
>> at all.  Would it be possible for it to be committed?
> 
> 
> The problem is that it includes a small amount of code from another 
> source that does not publish its licensing terms. I have, via their mail 
> lists, asked if I can use the code over here but they appear to be dead.
> 

If it's a small amount, we could probably re-code that section 
ourselves.  Can you explain what's missing or link to it?

> However, if you take a look at 
> http://issues.cocoondev.org/jira/secure/ViewIssue.jspa?key=FOR-125 you 
> will see an issue has been raised and Dave Brondsema has attached a 
> patch to use the FOPSerializer to output text. I've not looked at it 
> myself and the FOP folk say it needs improvement (see 
> http://xml.apache.org/fop/output.html#txt).
> 

Yes, it works but FOP is not the way to do it; rasterization and such 
cause bad spacing.

> At some point I will need to find a solution for this that I can share, 
> but I'm afraid it is not a major itch for me right now.
> 
> Ross
> 


-- 
Dave Brondsema : dave@brondsema.net
http://www.splike.com : programming
http://csx.calvin.edu : student org
http://www.brondsema.net : personal

Re: docv12txt.xsl

Posted by Ross Gardler <rg...@apache.org>.
Rick Tessner wrote:
> On Mon, 2004-01-26 at 05:59, Ross Gardler wrote:
> 
>>I recently found the need to create a stylesheet to convert xdocs to
>>text. It's not quite fully complete, in particular tables are not too
>>well represented at present (I don't need them in my use case at present).
>>
>>I've copied the output of http://127.0.0.1:8888/document-v12.txt using
>>this stylesheet below. Would this be of use in CVS?
> 
> 
> I'm finding that I have a similar need at the moment as well.  I'm
> working on a document2pod XSL.
> 
> It doesn't look like your docv12txt.xsl was committed to CVS (now SVN)
> at all.  Would it be possible for it to be committed?

The problem is that it includes a small amount of code from another 
source that does not publish its licensing terms. I have, via their mail 
lists, asked if I can use the code over here but they appear to be dead.

However, if you take a look at 
http://issues.cocoondev.org/jira/secure/ViewIssue.jspa?key=FOR-125 you 
will see an issue has been raised and Dave Brondsema has attached a 
patch to use the FOPSerializer to output text. I've not looked at it 
myself and the FOP folk say it needs improvement (see 
http://xml.apache.org/fop/output.html#txt).

At some point I will need to find a solution for this that I can share, 
but I'm afraid it is not a major itch for me right now.

Ross


Re: docv12txt.xsl

Posted by Rick Tessner <ri...@onnadayr.ca>.
On Mon, 2004-01-26 at 05:59, Ross Gardler wrote:
> I recently found the need to create a stylesheet to convert xdocs to
> text. It's not quite fully complete, in particular tables are not too
> well represented at present (I don't need them in my use case at present).
> 
> I've copied the output of http://127.0.0.1:8888/document-v12.txt using
> this stylesheet below. Would this be of use in CVS?

I'm finding that I have a similar need at the moment as well.  I'm
working on a document2pod XSL.

It doesn't look like your docv12txt.xsl was committed to CVS (now SVN)
at all.  Would it be possible for it to be committed?

If there's any interest, I'd contribute the document2pod XSL that I'm
working on as well.

-- 
Rick Tessner <ri...@onnadayr.ca>


Re: docv12txt.xsl

Posted by Nicola Ken Barozzi <ni...@apache.org>.
Ross Gardler wrote:
> I recently found the need to create a stylesheet to convert xdocs to
> text. It's not quite fully complete, in particular tables are not too
> well represented at present (I don't need them in my use case at present).
> 
> I've copied the output of http://127.0.0.1:8888/document-v12.txt using
> this stylesheet below. Would this be of use in CVS?

Sure :-)

Add the processing in the sitemap as I did with SVG, and add the 
stylesheet to the common skin as

    skins/common/xslt/txt/document2txt.xsl

Oh, and also adding it to the skinconf as a possible link on the page 
(like PDF, Print, xml) would be neat too (I didn't do it with SVG as 
it's just a stub that needs development on the xsl).

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)