Posted to java-dev@axis.apache.org by Berin Loritsch <bl...@apache.org> on 2001/10/08 17:43:10 UTC

[Proposal] Generated Docs

It has come to my attention that the docs in Axis are hand-coded or generated
with a tool like Dreamweaver.  In an open environment, this is less than
optimal for several reasons.  The first is that not everyone has access to
the tool (as good or bad as it may be), so their only recourse is to open up
a text editor and hand-modify the HTML to include their new text.  Past a
certain point, documentation generated by an open source tool like Cocoon,
Anakia, or Stylebook makes document writing easier.

Before I get too far into details, I would like to say that I appreciate
the presentation of the current documentation, and I would like the new
tool to maintain the look and feel.  All three of the tools I mentioned
in the prior paragraph can accomplish this goal.

When Hand Coding Is Enough
--------------------------
Let's face it, adding a separation between content and presentation does
trade one layer of complexity for another.  In my experience with web
application development, when you have one person creating all the
content for a small site you don't really need tools to generate the
HTML.  The overhead of setting up the environment is more than the effort
of simply creating the pages.  This is true up to a threshold.  I have
found that once you get above 6-8 pages, the effort of setting up an
environment drops below the effort involved in creating the pages by hand.

Open Source and Open Documentation
----------------------------------
The Open Source solutions also encourage an Open Documentation approach.
In fact, in many projects the documentation is not written by the developers.
User documentation is more effective when it is written by a user.  These
people who are contributing documentation are providing something that is
every bit as valuable as the code itself.  Anything we can do to ensure
that the content is all present and current is a step in the right
direction.  In most cases where there is good documentation--or at least
a lot of documentation--it has been written or edited by several people.

By separating content and presentation, you separate two concerns that
do not need to be intertwined.  You also create an environment where people
who want to add content don't have to wade through several layers of TABLE,
DIV, and FONT tags to figure out where to place the content.  You also
ensure that all your HTML is correct and has a consistent style throughout
the page.  Several times (prior to migrating to generated documentation) I
had to debug HTML to find out why a page didn't display its information,
only to find that someone else had placed the text between TR tags instead
of TD tags.
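
To make the separation concrete, here is a minimal sketch in Java using the
standard JAXP transformation API.  The content and the template below are
hypothetical examples made up for illustration; they are not the actual Axis
docs or any project's real stylesheet.  The point is only that the content
author writes plain paragraphs, while the TABLE/TR/TD layout lives in one
template.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ContentToHtml {

    // Hypothetical content file: only the text and its logical structure.
    static final String CONTENT =
        "<document>"
      + "<header><title>Hello World</title></header>"
      + "<body><p>This is a simple paragraph.</p></body>"
      + "</document>";

    // Hypothetical template: all layout markup lives here, in one place.
    static final String TEMPLATE =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/document'>"
      + "<html><head><title><xsl:value-of select='header/title'/></title></head>"
      + "<body><table><tr><td>"
      + "<xsl:apply-templates select='body/*'/>"
      + "</td></tr></table></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='p'><p><xsl:apply-templates/></p></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the XSLT template to the content and returns the generated markup.
    static String render(String contentXml, String templateXsl)
            throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(templateXsl)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(contentXml)),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(render(CONTENT, TEMPLATE));
    }
}
```

Because the TD wrapper is emitted by the template, a contributor cannot put
text in the wrong table cell; changing the layout means editing the template
string and nothing else.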

The Tools
---------
Currently there are three tools used across the Apache Java projects:
Cocoon, Stylebook, and Anakia.  All three tools have differing levels
of complexity and support.  Since I am not new to templating engines, I
will post what I know about each of the tools.

Anakia is used by several Jakarta projects and the main Jakarta site
itself.  It uses Velocity as the templating engine.  The Velocity templates
do not use standard XSLT, and instead use their own "markup".  I am not
that familiar with this tool, and since I would most likely be the one
setting up the environment, I wouldn't opt for it.  Several folks swear
by it, and I am not criticizing the tool.  I am merely saying it is a
learning curve I don't want to incur right now.

Stylebook is used by several XML (.apache.org) projects, and the main
xml.apache.org site.  It is a completely undocumented project, without
even visibility on the xml.apache.org site.  Several people have complained
about this particular templating engine--but it has some points that do
need to be considered: it uses standard XSLT, and provides a migration
path to Cocoon based generation.

Cocoon is more than just a templating engine; however, it is used to
generate its own docs as well as Avalon's docs.  Its downside is that
the environment is a little more complex to set up--however, the power
of the system is incredible.  Cocoon uses everything from standard XSLT
or Velocity to custom transformers that allow you to process XInclude
operations and more.  Another benefit is that it provides mechanisms to
generate PDF documentation alongside the HTML generation.

The Cost
--------
The first myth that I want to debunk right now is that templating engines
"cost" a lot in setup and environment time.  The truth is that once the
system is set up, it is rarely altered until there is a need.  Common needs
are additional output types (like adding PDF support).  But then, the
environment is still stable--you are just adding another transformation
type.  If the team decides to change the look of the docs, that can be
done by changing the template.  I can turn a plain HTML page into a template
in a couple of hours.  I can turn a plain HTML page into the
content-specific markup in about 10-15 minutes.

The Markup
----------
The point of content-specific markup is to simplify the documentation
writing process.  You can use DocBook (used by a portion of the Avalon
site--we are still converting), the standard Stylebook markup (used by
almost all the Stylebook-based sites), or a hybrid.  For personal sites,
I have opted for a hybrid mainly because it is relatively easy to use.

DocBook allows you to have a lot more metadata embedded--which is necessary
if you need to support searching on your site.  Stylebook is much
simpler, but could be made a little better.  An example Stylebook doc
is listed below:

<document>
  <header>
    <title>Hello World</title>
    <authors>
      <person id="BL" name="Berin Loritsch" email="bloritsch@apache.org"/>
    </authors>
  </header>
  <body>
    <s1 title="Hello World!">
      <p>
        This is a simple paragraph.  Much of the markup is similar to the
        XHTML equivalent.  <em>Especially</em>, when we are adding
        <em>emphasis</em> or saying something <strong>strongly</strong>.
      </p>
      <p>
        The <link href="#foo">links</link> are a bit more
        rich.  The above example was a standard link that is equivalent
        to a simple &lt;a/&gt; tag.  Sometimes you want to
        <fork href="http://xml.apache.org">fork a new page</fork> so that
        a new browser window is opened.  <anchor name="foo">And sometimes</anchor>
        you just want to mark a position in the page.
      </p>
      <s2 title="Another level">
        <p>
          The sections can be nested so that you can have your standard
          subheadings.
        </p>
        <ul>
          <li>Most documents have subheadings</li>
          <li>Most documents rarely go below four levels</li>
        </ul>
      </s2>
    </s1>
  </body>
</document>

This document structure is very simple, and leverages simple XHTML tags for
most constructs.  The only thing I don't like about it is that the titles
are attributes in the sections, but a full level element in the header.  My
hybrid approach takes everything else from Stylebook, but replaces the section
markup with DocBook's approach:

<section>
  <title>Hello World!</title>
  <section>
    <title>Another level</title>
  </section>
</section>

Also, by not specifically mentioning the nesting level, we can rearrange a
document and promote sections simply by moving the whole section tag to a
new place.  I have also extended the Stylebook markup to include form
elements, but I don't think that is necessary right now...
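
As a rough illustration of why unnumbered sections are easy to move, the
sketch below derives each heading level from how deeply its section tag is
nested.  It is written in Java with the standard DOM parser.  The class and
method names are hypothetical, and the mapping to h1..h6 is my own assumption
about how a template might render sections; it is not how Stylebook or
DocBook actually does it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class SectionHeadings {

    // Returns one HTML heading per <section>, with the heading level taken
    // from the section's nesting depth rather than from its tag name.
    static List<String> headings(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
        List<String> out = new ArrayList<>();
        walk(root, 0, out);
        return out;
    }

    private static void walk(Element el, int depth, List<String> out) {
        if (el.getTagName().equals("section")) {
            depth++;  // entering a section makes its headings one level deeper
        }
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element)) {
                continue;
            }
            Element child = (Element) n;
            if (depth > 0 && child.getTagName().equals("title")) {
                int level = Math.min(depth, 6);  // HTML only defines h1..h6
                out.add("<h" + level + ">" + child.getTextContent()
                        + "</h" + level + ">");
            } else {
                walk(child, depth, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String doc = "<section><title>Hello World!</title>"
                   + "<section><title>Another level</title></section>"
                   + "</section>";
        for (String heading : headings(doc)) {
            System.out.println(heading);
        }
    }
}
```

Feeding the inner section to headings() on its own yields an h1 instead of
an h2, which is the "promote by moving" behaviour: the markup itself never
has to be renumbered.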