Posted to users@cocoon.apache.org by Paul Derbyshire <de...@globalserve.net> on 2000/05/25 06:05:19 UTC

Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

I put an XML file in D:\Wol\wolsrc\kb\private\ that contained:

<?xml-stylesheet href="..\..\metadata\root.xsl" type="text/xsl"?>


which, according to the standard rules for resolving relative hrefs, means
the style sheet is to be found in D:\Wol\wolsrc\metadata\ where, in fact,
it is.

When I invoke my Xalan application, which uses a null second argument to
XSLTProcessor.process(), it does in fact accept the null second argument as
meaning it should look for a stylesheet PI in the first argument, and it
finds the stylesheet PI and parses it OK.

This is the good news.

The bad news is that Xalan then f*cks up royally. It searches for the style
sheet *relative to the directory the app is running in* and not *relative
to the file directory!* -- IOW, it looks in the wrong place. For example,
if I invoke the app in D:\Wol\wolsrc, I get:

XSL Error: SAX Exception
org.apache.xalan.xslt.XSLProcessorException: File
"file:D:/WOL/wolsrc/../../metadata/root.xsl" not found.
	at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1630)
	at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1594)
	at org.apache.xalan.xslt.XSLTEngineImpl.process(XSLTEngineImpl.java:655)
	at pgd.wol.compiler.Compiler.processXMLFile(compiler/Compiler.java:265)
	at pgd.wol.compiler.Compiler.process(compiler/Compiler.java:193)
	at pgd.wol.compiler.Compiler.process(compiler/Compiler.java:187)
	at pgd.wol.compiler.Compiler.main(compiler/Compiler.java:117)
Exception in thread "main" 

It's looking in D:\metadata\ and not in D:\Wol\wolsrc\metadata\ and
according to the W3C Recommendation on the stylesheet specifier (and on
hrefs in general) this is just plain WRONG.
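
To make the expected behaviour concrete, here is a minimal sketch that uses
the JDK's java.net.URI class (not Xalan itself) to resolve the same href
against two different bases. The class name HrefResolution and the document
name doc.xml are hypothetical; the post above does not name the XML file.

```java
import java.net.URI;

class HrefResolution {
    // Resolve a relative href against a base URI using generic URI rules.
    static String resolve(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        String href = "../../metadata/root.xsl";

        // Base = the XML document that contains the stylesheet PI
        // (doc.xml is a made-up file name for illustration).
        System.out.println(resolve("file:/D:/Wol/wolsrc/kb/private/doc.xml", href));
        // prints file:/D:/Wol/wolsrc/metadata/root.xsl

        // Base = the directory the application was started in.
        System.out.println(resolve("file:/D:/Wol/wolsrc/", href));
        // prints file:/D:/metadata/root.xsl
    }
}
```

The second result matches the location reported in the error message, which
is consistent with the href being resolved against the working directory
rather than against the document.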

It is URGENT that this bug in Xalan be fixed and the fix patch released
ASAP, because otherwise this project is at a complete halt. A work stoppage
of longer than 24 hours is not something that can be tolerated under the
current circumstances, by the way.

-- 
   .*.  "Clouds are not spheres, mountains are not cones, coastlines are not
-()  <  circles, and bark is not smooth, nor does lightning travel in a
   `*'  straight line."    -------------------------------------------------
        -- B. Mandelbrot  |http://surf.to/pgd.net derbyshire@globalserve.net
_____________________ ____|________                          Paul Derbyshire
Programmer & Humanist|ICQ: 10423848|

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 09:38 AM 5/26/00 +0200, you wrote:
>Paul Derbyshire wrote:
>> 
>> >You might want to read up on that. A leading / is exactly within the
>> >standard and means exactly the document root.
>> 
>> But it's not truly relative, as I explained. If I moved to another server
>> and the path to my user directory changed from /users/jrandom to /~jruser
>> I'd have to change every internal link in my site, and if the site is big
>> enough that's impractical to expect of me.
>
>No

Wrong.

If I move from a site where my directory on the Web server is
/users/jrandom to one where it is /~jruser and leave a link that formerly
worked as <a href=/users/jrandom/foo.html>, that link will start returning
a 404, because the address http://new.isp.com/users/jrandom doesn't exist,
unlike http://old.isp.com/users/jrandom.
The only way this kind of URL would never have to be changed would be
moving from a server where my site is rooted at the server document root to
another server where my site gets rooted at the server document root, and
the only time you can ever have your site at the server's document root is
when it's *your server*. I don't have one. Yet.
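
The difference between the two link styles can be sketched with the JDK's
java.net.URI class. The host names and directories come from the example
above; the page name index.html and the class name LinkStyles are
assumptions made for illustration.

```java
import java.net.URI;

class LinkStyles {
    // Resolve a link as a browser would, relative to the page containing it.
    static String resolve(String page, String href) {
        return URI.create(page).resolve(href).toString();
    }

    public static void main(String[] args) {
        // index.html is a hypothetical page name on each server.
        String oldPage = "http://old.isp.com/users/jrandom/index.html";
        String newPage = "http://new.isp.com/~jruser/index.html";

        // A link with a leading slash keeps the old directory after the move.
        System.out.println(resolve(oldPage, "/users/jrandom/foo.html"));
        // prints http://old.isp.com/users/jrandom/foo.html
        System.out.println(resolve(newPage, "/users/jrandom/foo.html"));
        // prints http://new.isp.com/users/jrandom/foo.html

        // A link without a leading slash follows the page to its new home.
        System.out.println(resolve(oldPage, "foo.html"));
        // prints http://old.isp.com/users/jrandom/foo.html
        System.out.println(resolve(newPage, "foo.html"));
        // prints http://new.isp.com/~jruser/foo.html
    }
}
```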

What could work would be to use a PURL for every link, even local ones; but
this has its own problems, such as the unneeded extra load put on the PURL
resolver and the dependence on a PURL resolver (which could prove to be
less than 100% reliable). It would also demand that the site file
hierarchy be uploaded to the server just to test it. I want to be able to
test the site locally without even having to make an Internet connection,
using just the batch-XML-to-HTML program and a Web browser.

>Why not write a script that runs cocoon repeatedly?

You're joking, right? Can you imagine how slow that would be, a full-blown
JVM startup for every single new or updated file? The JVM takes ages to
bootstrap, which is why I'm using a Java app to call a Xalan API, thus
meaning that there is only one JVM startup for every *batch* of possibly
hundreds of updated and new files.
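
The single-JVM batch approach described above might be sketched roughly as
follows. This is only an outline under stated assumptions: the class name
BatchRunner is made up, and the transform parameter is a stand-in for
whatever call actually performs the XML-to-HTML conversion (no Xalan API is
used here).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BatchRunner {
    // Collect every regular file ending in ".xml" below the given root.
    static List<Path> findXmlFiles(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> p.getFileName().toString().endsWith(".xml"))
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Run the (hypothetical) transform once per file inside this one JVM
    // and return how many files were handed to it.
    static int processAll(Path root, Consumer<Path> transform) {
        List<Path> files = findXmlFiles(root);
        files.forEach(transform);
        return files.size();
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Stand-in transform: just report each file that would be processed.
        int count = processAll(root, p -> System.out.println("would transform " + p));
        System.out.println(count + " file(s) in one JVM run");
    }
}
```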

>In any case, what I would do is install Apache and JServ and grab the HTML
>pages automatically with wget.

An ugly hack that requires learning a bunch of new pieces of software
before I'm particularly ready, and requires going to Linux for every test
and laboriously rebooting to Windows for every network access -- either to
upload the new and updated files to a server or to use email or surf the
Web or download things.

One step at a time. Get the XML to HTML working; get a starting library of
style sheets working; get some content up and tested; and then worry about
running my own Web server, figuring out how to do servlets, getting that
broadband connection, etc.



Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Ulrich Mayring <ul...@denic.de>.
Paul Derbyshire wrote:
> 
> >You might want to read up on that. A leading / is exactly within the
> >standard and means exactly the document root.
> 
> But it's not truly relative, as I explained. If I moved to another server
> and the path to my user directory changed from /users/jrandom to /~jruser
> I'd have to change every internal link in my site, and if the site is big
> enough that's impractical to expect of me.

No, you won't have to change a single thing if you use relative URIs,
with the exception of the document root of the webserver. If you use
fully-specified URIs then you'll have that problem; that's why they
invented relative URIs. As I said, there's a W3C statement out on
relative URIs; you really should read it. But I'm not going to search
out the URL for you, so if you can't be bothered to look... :)

> Cocoon doesn't seem to have an API to use it in a program like mine, which
> batch processes large numbers of XML files at a whack. It either operates
> on the command line one file at a painstaking time, or it operates as a
> servlet.

Why not write a script that runs cocoon repeatedly? In any case, what I
would do is install Apache and JServ and grab the HTML pages
automatically with wget.

Ulrich

-- 
Ulrich Mayring
DENIC eG, Systementwicklung

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 11:28 PM 5/25/00 +0200, you wrote:
>On Thu, 25 May 2000, Paul Derbyshire wrote:
>
>> I'm not trying to leave the document root of a web server. This was Xalan
>> running standalone, and running from D:\Wol\wolsrc, so if anything at all
>> could be considered a "document root" it was D:\Wol\wolsrc, and I wasn't
>> trying to escape that. More to the point, Xalan *did* escape that, hunting
>> in D:\metadata!
>
>You're not using Xalan, you're using a href attribute. Never mind what the
>application is, but if you use a href attribute, you'd better make sure that
>it contains a URI :)

It does, and it should work. The error is Xalan's; it resolves the URI
relative to the CWD and not to the directory containing the file that
contains the href attribute.

>You might want to read up on that. A leading / is exactly within the
>standard and means exactly the document root.

But it's not truly relative, as I explained. If I moved to another server
and the path to my user directory changed from /users/jrandom to /~jruser
I'd have to change every internal link in my site, and if the site is big
enough that's impractical to expect of me.

>> Xalan should correctly handle fully relative paths like
>> "../../metadata/root.xsl".
>
>cocoon does :)

Cocoon doesn't seem to have an API to use it in a program like mine, which
batch processes large numbers of XML files at a whack. It either operates
on the command line one file at a painstaking time, or it operates as a
servlet.


Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Uli Mayring <ul...@denic.de>.
On Thu, 25 May 2000, Paul Derbyshire wrote:

> I'm not trying to leave the document root of a web server. This was Xalan
> running standalone, and running from D:\Wol\wolsrc, so if anything at all
> could be considered a "document root" it was D:\Wol\wolsrc, and I wasn't
> trying to escape that. More to the point, Xalan *did* escape that, hunting
> in D:\metadata!

You're not using Xalan, you're using a href attribute. Never mind what the
application is, but if you use a href attribute, you'd better make sure that
it contains a URI :)

> Those are also breaking the standard. A URI with a / at the start should be
> absolute, although "absolute" is relative to the document root.

You might want to read up on that. A leading / is exactly within the
standard and means exactly the document root.

> Xalan should correctly handle fully relative paths like
> "../../metadata/root.xsl".

cocoon does :)

Ulrich

-- 
Ulrich Mayring
DENIC eG, Softwareentwicklung


Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 10:14 AM 5/25/00 +0200, you wrote:
>Paul Derbyshire wrote:
>> 
>> It's looking in D:\metadata\ and not in D:\Wol\wolsrc\metadata\ and
>> according to the W3C Recommendation on the stylesheet specifier (and on
>> hrefs in general) this is just plain WRONG.
>
>Actually anything containing a backslash is suspect when it is
>encountered in a href attribute.

The href contained only forward slashes. I only used backslashes when
referring to the paths, not when referring to the href attribute. The
attribute was "../../metadata/root.xsl".

>You
>should not be able to leave the document root of a web server and URI
>notation ensures that.

I'm not trying to leave the document root of a web server. This was Xalan
running standalone, and running from D:\Wol\wolsrc, so if anything at all
could be considered a "document root" it was D:\Wol\wolsrc, and I wasn't
trying to escape that. More to the point, Xalan *did* escape that, hunting
in D:\metadata!

>I'm not sure about Xalan stand-alone, but if you use cocoon with xalan
>you can now use relative URIs like "/std.xsl" or
>"/metadata/something.xsl". The "/" is mapped to the filesystem as
>specified by the document root and the rest of the path is taken from
>there. Pretty sensible behavior in my mind.

Those are also breaking the standard. A URI with a / at the start should be
absolute, although "absolute" is relative to the document root. But for a
site to be relocatable, all URIs in it should be truly relative, otherwise
if the site is moved to a new server every single internal link has to be
changed, which for a large site is completely out of the question. For
example, in moving the site from fooISP.com to barISP.com you might have to
change the absolute link http://www.fooISP.com/users/~jruser/index.html to
http://www.barISP.com/jrandom/index.html and the not-so-absolute link
/users/~jruser/index.html to /jrandom/index.html.

Xalan should correctly handle fully relative paths like
"../../metadata/root.xsl".

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Ulrich Mayring <ul...@denic.de>.
Paul Derbyshire wrote:
> 
> It's looking in D:\metadata\ and not in D:\Wol\wolsrc\metadata\ and
> according to the W3C Recommendation on the stylesheet specifier (and on
> hrefs in general) this is just plain WRONG.

Actually anything containing a backslash is suspect when it is
encountered in a href attribute. href means URI, not filepath. You
should not be able to leave the document root of a web server and URI
notation ensures that.

> It is URGENT that this bug in Xalan be fixed and the fix patch released
> ASAP, because otherwise this project is at a complete halt. A work stoppage
> of longer than 24 hours is not something that can be tolerated under the
> current circumstances, by the way.

I'm not sure about Xalan stand-alone, but if you use cocoon with xalan
you can now use relative URIs like "/std.xsl" or
"/metadata/something.xsl". The "/" is mapped to the filesystem as
specified by the document root and the rest of the path is taken from
there. Pretty sensible behavior in my mind.

Ulrich

-- 
Ulrich Mayring
DENIC eG, Systementwicklung

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Donald Ball <ba...@webslingerZ.com>.
On Thu, 25 May 2000, Paul Derbyshire wrote:

> At 11:19 PM 5/25/00 +0200, you wrote:
> >mail to xalan-dev-subscribe@xml.apache.org
> 
> I'm an end-user, not a developer.

It doesn't matter; there is no Xalan users list. If you have any Xalan
questions, that's the place right now. If you doubt Xalan's XSLT
conformance, you should test your examples with another XSLT transformer,
e.g. James Clark's XT processor:

http://www.jclark.com/xml/xt.html

before submitting a bug report. If you've found a bug in Xalan's XSLT
implementation, I'm sure the xalan developers would love to hear about it.
The cocoon users list is simply not the proper forum for it.

In general, you would do well to be less acerbic in your email. The XML
apache guys are in general quite friendly and knowledgeable, but aren't
likely to go out of their way to help someone as demanding and sure of
themselves as you seem to be.

- donald


Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 11:19 PM 5/25/00 +0200, you wrote:
>mail to xalan-dev-subscribe@xml.apache.org

I'm an end-user, not a developer.

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Uli Mayring <ul...@denic.de>.
On Thu, 25 May 2000, Paul Derbyshire wrote:

> There is no xalan-users list. Change that, and I will be happy to oblige.

I just changed it. To subscribe, mail to xalan-dev-subscribe@xml.apache.org

bye,

Ulrich

-- 
Ulrich Mayring
DENIC eG, Softwareentwicklung


Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 01:27 AM 5/25/00 -0500, you wrote:
>on 5/25/00 12:13 AM, Paul Derbyshire at derbyshire@globalserve.net wrote:
>> 
>> Please fix this bug quickly, regardless of the discovery of a workaround.
>This is not a xalan mailing list.  Please join the Xalan list and post your
>bugs there.

There is no xalan-users list. Change that, and I will be happy to oblige.

Re: Asking the right mailing list (was: Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.)

Posted by Sebastien Koechlin I-VISION <sk...@n-soft.com>.
Michael Scheuner wrote:
> 
> Hi!
> >
> > I really need to make Cocoon output decimal entities instead of
> > using a charset (I mean it should output '&#233;' instead of 'é')
> > but I don't know which part is in charge of this.
> > Cocoon ? Xalan ? Xerces-J ?
> 
> A quick test showed: &amp;#233; works. Don't know if that's enough for you.

Sorry, I did not explain very well:

I have a WAP phone (or maybe a gateway) that does not understand
charset encoding.
If I send an XML document using ISO-something encoding, or UTF-8,
I get garbage instead of my character when its value is over 127.

The only solution is to send those characters as decimal entities.

If I send
	<?xml version="1.0" encoding="iso-8859-1"?>
	<tags..>Déjeuner</tags..>

it displays something like
	D@~jeuner
(It looks as if something translated my XML file into UTF-8,
but it is displayed as if it were an 8-bit charset. As I
have no control over the client software, I cannot correct
this bug.)

I need Cocoon to output the string like:
	D&#233;jeuner

and then the displayed characters are OK.
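
As an illustration of the conversion being asked for, here is a minimal
sketch of a post-processing step that rewrites every character above 127 as
a decimal character reference. It is not a Cocoon, Xalan or Xerces-J
feature; the class and method names are made up, and it assumes the text has
already been serialized (so markup characters such as < and & are left
untouched).

```java
class NumericEntities {
    // Replace each code point above 127 with a decimal character reference
    // such as &#233; and copy plain ASCII through unchanged.
    static String toDecimalReferences(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is the e-acute from the example above.
        System.out.println(toDecimalReferences("D\u00e9jeuner"));
        // prints D&#233;jeuner
    }
}
```

Another route that may be worth trying, depending on the serializer in use,
is to request a pure-ASCII output encoding so that the serializer itself
emits character references for everything else.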

-- 
Sebastien Koechlin

RE: Asking the right mailing list (was: Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.)

Posted by Michael Scheuner <sc...@cos-data.de>.
Hi!
>
> I really need to make Cocoon output decimal entities instead of
> using a charset (I mean it should output '&#233;' instead of 'é')
> but I don't know which part is in charge of this.
> Cocoon ? Xalan ? Xerces-J ?
>
> --
> Sebastien Koechlin

A quick test showed: &amp;#233; works. Don't know if that's enough for you.

Michael


Asking the right mailing list (was: Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.)

Posted by Sebastien Koechlin I-VISION <sk...@n-soft.com>.
Ian Abbott wrote:
>
> Xalan: xalan-dev-subscribe@xml.apache.org
> Xerces-J: xerces-j-dev-subscribe@xml.apache.org
> FOP: fop-dev-subscribe@xml.apache.org
> 
> Cocoon is the system that brings these 3 together. It can't change their
> codebases, and we don't forward to xalan-dev. You should go there and
> nag them. Not Stefano and co...

I really need to make Cocoon output decimal entities instead of
using a charset (I mean it should output '&#233;' instead of 'é')
but I don't know which part is in charge of this.
Cocoon? Xalan? Xerces-J?

-- 
Sebastien Koechlin

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 10:01 AM 5/25/00 +0100, you wrote:
>As you clearly can't be bothered to find out the correct mailing list
>names...

I found out the correct mailing list names. If I use *your* definition,
there *is* no correct mailing list, since there is no xalan-*users* list.
But I have to ask these questions somewhere. This is the only -users list
in sight, and so the only appropriate list for an end-user like myself to use.

>...xalan-dev. You should go there...

I'm an end-user, not a Xalan developer.

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Ian Abbott <ia...@cinesite.co.uk>.
Paul Derbyshire wrote:
> 
> Please fix this bug quickly, regardless of the discovery of a workaround.

Paul, you're emailing the wrong list. 'cocoon-users' and 'cocoon-dev'
are for the Cocoon project, an XML publishing framework built upon the
*separate* development of the Xalan, Xerces and FOP systems.

You're starting to piss a few people off by nagging the list to fix bugs
that they cannot.

As you clearly can't be bothered to find out the correct mailing list
names, here's the list from xml.apache.org/mail.html:-

Xalan: xalan-dev-subscribe@xml.apache.org
Xerces-J: xerces-j-dev-subscribe@xml.apache.org
FOP: fop-dev-subscribe@xml.apache.org

Cocoon is the system that brings these 3 together. It can't change their
codebases, and we don't forward to xalan-dev. You should go there and
nag them. Not Stefano and co...

Cheers
Ian

Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Mike Engelhart <me...@earthtrip.com>.
on 5/25/00 12:13 AM, Paul Derbyshire at derbyshire@globalserve.net wrote:
> 
> Please fix this bug quickly, regardless of the discovery of a workaround.

This is not a Xalan mailing list.  Please join the Xalan list and post your
bugs there.

Mike


Re: Serious error in Xalan -- parsing stylesheet PIs -- URGENT.

Posted by Paul Derbyshire <de...@globalserve.net>.
At 12:05 AM 5/25/00 -0400, you wrote:
>It's looking in D:\metadata\ and not in D:\Wol\wolsrc\metadata\ and
>according to the W3C Recommendation on the stylesheet specifier (and on
>hrefs in general) this is just plain WRONG.

Not so urgent now. I've discovered a workaround, although it's an
astonishingly ugly hack. For the enlightenment of anyone else who has
encountered this problem, or might do so in the future, the hack is as
follows: before every Xalan call on a file (or before every file in a
different directory from the previous one), call
System.setProperty("user.dir", dir), where dir is the directory containing
the file. This tricks the erroneous search in Xalan into finding the
correct file after all.
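
A minimal sketch of the workaround, for anyone who wants to try it. The
class name, the withUserDir helper and the transform parameter are all
hypothetical; transform stands in for the actual Xalan call, which is not
shown. Whether changing "user.dir" at run time affects file lookups at all
depends on the JVM, so treat this as an illustration of the idea rather
than a guaranteed fix.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

class UserDirWorkaround {
    // Point "user.dir" at the directory containing xmlFile, run the
    // (hypothetical) transform, then restore the previous value.
    // synchronized: the property is process-wide, so calls must not overlap.
    static synchronized void withUserDir(Path xmlFile, Consumer<Path> transform) {
        String previous = System.getProperty("user.dir");
        Path dir = xmlFile.toAbsolutePath().getParent();
        System.setProperty("user.dir", dir.toString());
        try {
            transform.accept(xmlFile);
        } finally {
            System.setProperty("user.dir", previous);
        }
    }

    public static void main(String[] args) {
        // doc.xml is a made-up file name; the path need not exist for this demo.
        Path xml = Paths.get("kb", "private", "doc.xml");
        withUserDir(xml, p ->
            System.out.println("transforming " + p + " with user.dir="
                    + System.getProperty("user.dir")));
        System.out.println("restored user.dir=" + System.getProperty("user.dir"));
    }
}
```

Because the property is shared by the whole process, the sketch serializes
calls with synchronized and restores the old value in a finally block.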

Ugly hacks usually have deleterious consequences or annoying tradeoffs; one
of the nastier consequences of this particular ugly hack is that files in
different directories can't be processed concurrently. Since I have a
single-CPU machine, this doesn't bother me too much, but users with a
dual-processor Linux box (or a Connection Machine, hehe) might find this a
pain if they want to process a batch of files some of which are alone in
their directories.

Ugly hacks should not be necessary, and my software shouldn't have to set,
depend on, or muck with "user.dir". I can thank my lucky stars my Java
implementation doesn't make this system property read-only, or WOL would be
dead in the water.

Please fix this bug quickly, regardless of the discovery of a workaround.