Posted to docs@httpd.apache.org by Yoshiki Hayashi <yo...@xemacs.org> on 2002/07/30 13:55:14 UTC

XML validation and Ant

I have once again run into a bug in Ant.  With Ant's
xmlvalidate task, validating iso-2022-jp XML files always
fails because the XML parser assumes the file it is passed is
in UTF-8.

I fixed the bug as described in bug #11279 and re-uploaded the
jar files to
http://httpd.apache.org/dev/dist/jakarta-ant-1.5-modified-jar.tar.gz

Can we agree to use that jar instead of the current modified
1.4.1 jar, and to apply the xmlvalidate patch from Erik Abele
<er...@codefaktor.de> below?  It will give us the option of
validating the XML documents.

I don't think we should make the xslt target depend on the
validate target, because validate checks all XML files, not
just the updated ones.  We can validate them occasionally using
% sh build.sh validate

Index: build.xml
===================================================================
RCS file: /home/penny/cvsroot/httpd-2.0/docs/manual/style/build.xml,v
retrieving revision 1.6
diff -u -r1.6 build.xml
--- build.xml	29 Jul 2002 11:43:11 -0000	1.6
+++ build.xml	30 Jul 2002 11:27:20 -0000
@@ -48,6 +48,20 @@
       <mapper type="glob" from="*.xml.ja" to="*.html.ja.jis"/>
       <param name="relative-path" expression="."/>
     </style>
+  </target>
+  <target name="validate"
+    description="Validate the XML source files">
 
+    <!-- Validate the root directory of the manual -->
+    <xmlvalidate lenient="false" failonerror="false" warn="true">
+        <fileset dir="../"
+                 includes="*.xml *.xml.ja"/>
+    </xmlvalidate>
+
+    <!-- Validate the mod directory (.en + .ja) -->
+    <xmlvalidate lenient="false" failonerror="false" warn="true">
+        <fileset dir="../mod/"
+                 includes="*.xml *.xml.ja"/>
+    </xmlvalidate>
   </target>
 </project>


-- 
Yoshiki Hayashi

---------------------------------------------------------------------
To unsubscribe, e-mail: docs-unsubscribe@httpd.apache.org
For additional commands, e-mail: docs-help@httpd.apache.org


Re: XML validation and Ant

Posted by Joshua Slive <jo...@slive.ca>.
On 30 Jul 2002, Yoshiki Hayashi wrote:

> I don't think we should make the xslt target depend on the
> validate target, because validate checks all XML files, not
> just the updated ones.  We can validate them occasionally using
> % sh build.sh validate

+1

Joshua.




Re: XML validation and Ant

Posted by Yoshiki Hayashi <yo...@xemacs.org>.
"Vincent de Lau" <vi...@delau.nl> writes:

> For mod_rewrite.xml, here is a patch to fix the title-less sections. It has
> been submitted before, but that patch also contained another patch that has
> already been committed.

> [2 mod_rewrite.patch <application/octet-stream (quoted-printable)> ...]
> 
> [3 buildxml.patch <application/octet-stream (quoted-printable)> ...]

Committed.  Thanks.

-- 
Yoshiki Hayashi



Re: XML validation and Ant

Posted by Yoshiki Hayashi <yo...@xemacs.org>.
Joshua Slive <jo...@slive.ca> writes:

> The other thing that might be nice is to remove build.sh/build.cmd/etc
> from httpd-2.0 and instead place them, together with all the necessary
> libraries in a separate CVS repository where they can just be checked out
> into the right place.  I am a little uncomfortable with the fact that we
> are placing .cmd and .sh files into a directory that is accessible by
> default for everyone who installs apache.  It isn't executable by default,
> but with a slight configuration error...
> 
> What do people think about that?

I'm +1 on moving them to another CVS repository, say
httpd-docs-tools.

-- 
Yoshiki Hayashi



RE: XML validation and Ant

Posted by Vincent de Lau <vi...@delau.nl>.
> -----Original Message-----
> From: Vincent de Lau [mailto:vincent@delau.nl]
> Sent: Wednesday, July 31, 2002 3:50 AM
>
> > -----Original Message-----
> > From: Yoshiki Hayashi [mailto:yoshiki@xemacs.org]
> > Sent: Tuesday, July 30, 2002 1:55 PM
> >
> > I have once again run into a bug in Ant.  With Ant's
> > xmlvalidate task, validating iso-2022-jp XML files always
> > fails because the XML parser assumes the file it is passed is
> > in UTF-8.
> >
> > I fixed the bug as described in bug #11279 and re-uploaded the
> > jar files to
> > http://httpd.apache.org/dev/dist/jakarta-ant-1.5-modified-jar.tar.gz
> >
>
> I've noticed two documents that won't validate. I'll post a patch for them
> soon.

For mod_rewrite.xml, here is a patch to fix the title-less sections. It has
been submitted before, but that patch also contained another patch that has
already been committed.

Then there is allmodules.xml:

The validation target should exclude mod/allmodules.xml since this document
does not have a DTD assigned. One cannot be assigned either, since the file
is used as an imported entity. I've rewritten the proposed validate target to:

<target name="validate"
    description="Validate the XML source files">
    <!-- Validate almost all XML files in all languages -->
    <xmlvalidate lenient="false" failonerror="false" warn="true">
        <fileset dir="../"
                 includes="**/*.xml **/*.xml.*"
                 excludes="mod/allmodules.xml mod/allmodules.xml.* style/**"/>
    </xmlvalidate>
</target>

I'm making the assumption that all XML documents should be valid, except
where that is impossible; those exceptions can be excluded. I'm also
assuming that XML files have the extension .xml or .xml.lc (where lc can be
any valid language code).

I've also excluded style/ since this directory contains XML documents that
don't have a DTD assigned (build.xml and lc.xml).

With an eye to the future, the exclusion also covers the 'translated'
allmodules.xml files (allmodules.xml.*).

The patch also excludes allmodules.xml from the xslt target.
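
To illustrate why allmodules.xml cannot carry a DOCTYPE of its own: a file
that is pulled in as an external parsed entity may not contain a document type
declaration, so only the document that includes it can be validated. A minimal
sketch, assuming that this is how the file is imported; the root element
"modulelist", the DTD name "modulelist.dtd" and the entity name "allmodules"
are hypothetical and not taken from the manual sources:

```xml
<?xml version="1.0"?>
<!-- Hypothetical including document: it is the one that owns the DOCTYPE -->
<!DOCTYPE modulelist SYSTEM "modulelist.dtd" [
  <!-- Declare allmodules.xml as an external parsed entity -->
  <!ENTITY allmodules SYSTEM "allmodules.xml">
]>
<modulelist>
  <!-- The parser substitutes the content of allmodules.xml here -->
  &allmodules;
</modulelist>
```

Validating the including document covers the imported content as well, whereas
running xmlvalidate on the fragment alone has no DTD to check it against.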

Vincent de Lau
 vincent@delau.nl


Viewing XML files directly (was RE: XML validation and Ant)

Posted by Joshua Slive <jo...@slive.ca>.
On Thu, 1 Aug 2002, Vincent de Lau wrote:
> The XML files are a kind of source file as well. Although you can look at a
> single XML file with an 'XML client' (like a web browser), the generated HTML
> references other HTML files. In an installation, this would mean that the
> XML files are useless and you would need an XSLT processor to read them. To
> overcome this, there are two options (Option three being a bit more work
> ;) ).

A couple things to think about:

- With MultiViews on, you can request documents with no extension at all.
We could possibly change the way we are linking so that we include neither
xml nor html in the links.  I like this idea, because it is good to link
to "content" rather than to a specific representation of content.  But I
don't think we are quite ready for this yet, because there are browsers
out there (Mozilla, for one) that ask for XML in their Accept header, but
then barf when they get it.

- Some of the xml files are really not suited to real-time processing.
The major examples are mod/index.xml and mod/directives.xml.  Processing
these files requires the browser to read every single module file.
Obviously we can't have that happening on a regular basis.  One possible
solution is to process them into an intermediate xml representation, which
can in turn be processed real-time.

Joshua.




RE: XML validation and Ant

Posted by Vincent de Lau <vi...@delau.nl>.
> This looks very nice!  I like decreasing barriers to entry (no cygwin
> necessary)!

Doesn't Java mean portable? ;)

> I don't have time to test it right now, but if a Windows XP user can give
> it a go and tell us it works, I'll be happy to commit it.

I'm more concerned about Windows NT.

> The other thing that might be nice is to remove build.sh/build.cmd/etc
> from httpd-2.0 and instead place them, together with all the necessary
> libraries in a separate CVS repository where they can just be checked out
> into the right place.  I am a little uncomfortable with the fact that we
> are placing .cmd and .sh files into a directory that is accessible by
> default for everyone who installs apache.  It isn't executable by default,
> but with a slight configuration error...
>
> What do people think about that?

I've had some thoughts along those lines as well, but going even further.

The XML files are a kind of source file as well. Although you can look at a
single XML file with an 'XML client' (like a web browser), the generated HTML
references other HTML files. In an installation, this would mean that the
XML files are useless and you would need an XSLT processor to read them. To
overcome this, there are two options (Option three being a bit more work
;) ).

1. Move the XML files to the 'build tree' Joshua just suggested. Before
release, the HTML files will be generated and packed.

2. Hack the style sheets to use links to XML files by default and if a
parameter is set (for instance by Ant), links to HTML files.

  <xsl:param name="filetype" select="'xml'" />
  ....
  <a href="{name}.{$filetype}">

(3. 'mod_xslt' in default distro)
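
On the Ant side, option 2 could be wired up roughly as follows. This is only a
sketch: the target layout, the stylesheet name manual.en.xsl and the
directories are hypothetical, while the nested param element uses the same
name/expression form that build.xml already uses for relative-path. Ant
overrides the stylesheet's default of 'xml' with 'html', so the generated
pages link to HTML files:

```xml
<!-- Hypothetical xslt target: file names and paths are placeholders -->
<target name="xslt" description="Transform the XML sources to HTML">
  <style basedir="../" destdir="../" extension=".html"
         style="manual.en.xsl" includes="*.xml">
    <!-- Override the stylesheet default so that links point to HTML files -->
    <param name="filetype" expression="html"/>
  </style>
</target>
```

When the stylesheet is applied without Ant, for instance by a browser, the
parameter keeps its default and the links point to the XML files.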

I'll be willing to get option 2 implemented if that seems a good idea.

Vincent de Lau
 vincent@delau.nl




RE: XML validation and Ant

Posted by Joshua Slive <jo...@slive.ca>.
On Wed, 31 Jul 2002, Vincent de Lau wrote:
> I've tested it on Win2K Pro with Sun's JRE 1.4 and it works perfectly. I've
> 'ported' the build.sh script to a Windows .cmd script.

This looks very nice!  I like decreasing barriers to entry (no cygwin
necessary)!

I don't have time to test it right now, but if a Windows XP user can give
it a go and tell us it works, I'll be happy to commit it.

The other thing that might be nice is to remove build.sh/build.cmd/etc
from httpd-2.0 and instead place them, together with all the necessary
libraries in a separate CVS repository where they can just be checked out
into the right place.  I am a little uncomfortable with the fact that we
are placing .cmd and .sh files into a directory that is accessible by
default for everyone who installs apache.  It isn't executable by default,
but with a slight configuration error...

What do people think about that?


Joshua.




RE: XML validation and Ant

Posted by Vincent de Lau <vi...@delau.nl>.
> -----Original Message-----
> From: Yoshiki Hayashi [mailto:yoshiki@xemacs.org]
> Sent: Tuesday, July 30, 2002 1:55 PM
>
> I have once again run into a bug in Ant.  With Ant's
> xmlvalidate task, validating iso-2022-jp XML files always
> fails because the XML parser assumes the file it is passed is
> in UTF-8.
>
> I fixed the bug as described in bug #11279 and re-uploaded the
> jar files to
> http://httpd.apache.org/dev/dist/jakarta-ant-1.5-modified-jar.tar.gz
>

I've tested it on Win2K Pro with Sun's JRE 1.4 and it works perfectly. I've
'ported' the build.sh script to a Windows .cmd script.

It consists of two files: a wrapper that starts a new instance of CMD.EXE with
'delayed environment variable expansion' enabled, and build2.cmd, which the
wrapper then calls to do the real work. This does NOT work on Win9x/WinME (no,
replacing CMD.EXE with COMMAND.COM doesn't solve it!). It should work on WinNT
and WinXP, but I haven't tested it there. build2.cmd checks whether
'delayed environment variable expansion' is enabled.

COMMAND.COM < CMD.EXE < /bin/sh :(

I've noticed two documents that won't validate. I'll post a patch for them
soon.

Vincent de Lau
 vincent@delau.nl