Posted to fop-dev@xmlgraphics.apache.org by "Peter B. West" <pb...@powerup.com.au> on 2002/10/01 04:23:36 UTC

Re: <character>

Arved Sandstrom wrote:
>>-----Original Message-----
>>From: Tony Graham [mailto:Tony.Graham@Sun.COM]


>>Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
>> > Tony Graham wrote:
>> > > jaccoud@petrobras.com.br wrote at 27 Sep 2002 16:44:32 -0300:
>>...
>> > >  > That means  "-", "#12235" , etc are characters, while
>>"'1'" is not.
>> > >
>> > > &#12235; is a character reference.  '#12235' is how you talk about a
>> > > character's code point, although the hexadecimal representation is
>> > > usually preferable.
>> > >
>> > > In XSL terms, "'1'" is a one-character string literal, but while you
>> > > could claim that it is one character, there's no XSL
>>conversion from a
>> > > string to a character, so <fo:character character="'1'"/>
>>should fail.
>> >
>> > Tony,
>> >
>> > I don't think this gets us out of difficulty.  A casual inspection
>>
>>Forgive me, but I wasn't trying to get anybody out of any difficulty,
>>I was just trying to keep the terminology accurate.
>>
>>...
>> > So how do I represent a character?
>> >
>> > To me, the cleanest, least ambiguous way is to represent a <character>
>> > attribute assignment value with "'<character>'" - a string literal of
>> > length 1.
>>
>>Except that you know that that's not specified among the allowed
>>conversions.
>>
>>The interesting thing is that 'character' doesn't appear in the
>>productions in Section 5.9, Expressions, of the XSL Recommendation.
>>Now there's a question for xsl-editors@w3.org!
>>
>>I think that you represent a character as a single character, e.g.,
>>character="c", or as a numeric character reference, e.g.,
>>character="&#xA;".
> 
> 
> I agree with this last, after having digested everything.
> 
> Point is well taken that we have some points to nitpick with xsl-editors,
> mostly about disambiguating some of the language.

Arved,

Help me here. I must be missing something.  What is it that you agree 
with?  That the spec, as worded, leaves us with
  character="c"
or
  character="&#x63;"
which amounts to the same thing?

If so, fair enough.  Do you also agree that "c" is an NCName?  And that
  character="-"
is a parsing error?
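
The claim that character="c" and character="&#x63;" amount to the same thing can be checked at the XML level. The following is only a minimal sketch using Python's standard xml.etree.ElementTree; the helper name attribute_value is a hypothetical name chosen for illustration and is not part of FOP. It also shows that character="-" is unproblematic for the XML parser itself, so any parsing error would have to arise later, in the property expression parser.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"

def attribute_value(attribute_markup, name="character"):
    """Return the attribute value as the XML parser hands it on (hypothetical helper)."""
    xml_text = f'<fo:character xmlns:fo="{FO_NS}" {attribute_markup}/>'
    return ET.fromstring(xml_text).get(name)

literal = attribute_value('character="c"')
reference = attribute_value('character="&#x63;"')
hyphen = attribute_value('character="-"')

# The character reference is resolved before anything FO-specific sees it.
print(literal == reference)
print(repr(hyphen))
```

Whatever the FO layer decides to do with the value, it cannot distinguish the two spellings, because the reference is already expanded when the attribute value is delivered.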

As far as I can see, the only immediate ways forward are to descend into 
the mire of context dependent parsing (which the editors have recently 
formally decided that we must do in respect of "format") or apply our 
own disambiguating condition.  How are you intending to implement 
<character>?

Peter
-- 
Peter B. West  pbwest@powerup.com.au  http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Peter B. West wrote:
> This is the critical point.  The namespace not only restricts the 
> elements and attributes, but imposes itself on the contents of the 
> attribute values passed in by the XML parser.

Umm, the namespace does not impose anything. It's the XSLFO spec which
defines the semantics of some elements and XML attribute values. That
said elements happen to be in a certain namespace is not really relevant
for getting something formatted.

>  I need to think about 
> this a bit more, but it seems to me that the recent ruling on <string> 
> with respect to the "format" attribute, which makes my flesh creep every 
> time I think about it, disguises an attempt to smuggle part of the 
> Transform namespace's constraints into the Format namespace.  They are 
> completely different expression environments, which is why it doesn't 
> work.  Has anyone else given this any thought?

Where does XSLT come into the picture? The whole thing is specified
in the XSLFO spec, section 5. The expressions which make up property
values in the end come from 5.9ff. The expression language used by
XSLT, XPath, is an entirely different beast (I don't think this is
much of an advantage).

J.Pietschmann




Re: <character>

Posted by "Peter B. West" <pb...@powerup.com.au>.
Joerg and Arved,

Thanks for sorting this out while I was asleep.  I talk about these 
things in terms of the parser, in spite of the offence it might give to 
specification purists, because that is where I have spent a lot of my 
time lately.

J.Pietschmann wrote:
> Don't look at XML AttValue, look at the XSLFO property expression language.
> Somehow it is implicit that all attributes in an XSLFO document are parsed
> as expressions which are defined in 5.9 "Expressions".

This is the critical point.  The namespace not only restricts the 
elements and attributes, but imposes itself on the contents of the 
attribute values passed in by the XML parser.  I need to think about 
this a bit more, but it seems to me that the recent ruling on <string> 
with respect to the "format" attribute, which makes my flesh creep every 
time I think about it, disguises an attempt to smuggle part of the 
Transform namespace's constraints into the Format namespace.  They are 
completely different expression environments, which is why it doesn't 
work.  Has anyone else given this any thought?

Peter
-- 
Peter B. West  pbwest@powerup.com.au  http://www.powerup.com.au/~pbwest/
"Lord, to whom shall we go?"




Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Arved Sandstrom wrote:
> I think they screwed up the grammar.
Me too. However, I think it would be really hard to press
something which is "intuitive", consistent as well as easy to
parse into a single grammar for all XSLFO properties. It seems
they fell into the same trap as the C preprocessor guys did, which
is intuitive and easy to implement for the most part, but had
this abominable 0xe-12 problem as well as the rather unintuitive
"argument prescanning" hidden in its dark corners.

> As I stated before, I find it ludicrous
> that character="-" would not be OK.
That's ok, but it would require some extensions to the whole
property handling.


J.Pietschmann





RE: <character>

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
> -----Original Message-----
> From: J.Pietschmann [mailto:j3322ptm@yahoo.de]
> Sent: October 6, 2002 2:15 PM
> To: fop-dev@xml.apache.org
> Subject: Re: <character>
>
>
> Arved Sandstrom wrote:
> > And unless _I_ am missing something, "-" precisely matches that
> production.
> >
> > You are looking at
> >
> > "'" [^']* "'"
> >
> > but I am looking at
> >
> > '"' [^"]* '"'
> >
> > According to the latter I can absolutely do "-".
>
> Well, in
>    hyphenation-char="-"
> the hyphen is the expression, not "the hyphen surrounded by double
> quotes". As I said, unless I'm missing something, the FO property
> expression is the value of the XML attribute, which in turn is the
> hyphen, because the double quotes are part of the XML syntax and
> are stripped by the XML parser. The XSLFO property expression parser
> gets the hyphen, without any quotes, double, or single. And without
> the quotes, it does not match either of the two productions for literal.
> This is the problem here.
>
> Perhaps I should have written that
>    hyphenation-char="'-'"
> and
>    hyphenation-char='"-"'
> as well as
>      hyphenation-char='&quot;-&quot;'
> are legal, while neither
>      hyphenation-char='-'
> nor
>      hyphenation-char="-"
> are ok.

Yes, I see your point.

I think they screwed up the grammar. As I stated before, I find it ludicrous
that character="-" would not be OK.

Arved




Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Arved Sandstrom wrote:
> And unless _I_ am missing something, "-" precisely matches that production.
> 
> You are looking at
> 
> "'" [^']* "'"
> 
> but I am looking at
> 
> '"' [^"]* '"'
> 
> According to the latter I can absolutely do "-".

Well, in
   hyphenation-char="-"
the hyphen is the expression, not "the hyphen surrounded by double
quotes". As I said, unless I'm missing something, the FO property
expression is the value of the XML attribute, which in turn is the
hyphen, because the double quotes are part of the XML syntax and
are stripped by the XML parser. The XSLFO property expression parser
gets the hyphen, without any quotes, double, or single. And without
the quotes, it does not match either of the two productions for literal.
This is the problem here.

Perhaps I should have written that
   hyphenation-char="'-'"
and
   hyphenation-char='"-"'
as well as
     hyphenation-char='&quot;-&quot;'
are legal, while neither
     hyphenation-char='-'
nor
     hyphenation-char="-"
are ok.
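
The two-stage view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression transcribes the Literal production as quoted in this thread, the XML parsing uses the standard xml.etree.ElementTree module, and the names CASES, expression_seen_by_fo_parser and is_literal are hypothetical, not taken from FOP or the Recommendation.

```python
import re
import xml.etree.ElementTree as ET

# Literal ::= '"' [^"]* '"' | "'" [^']* "'"   (production as quoted in this thread)
LITERAL = re.compile("\"[^\"]*\"|'[^']*'")

# The five attribute spellings discussed above, as they appear in the FO source.
CASES = [
    "hyphenation-char=\"'-'\"",
    "hyphenation-char='\"-\"'",
    "hyphenation-char='&quot;-&quot;'",
    "hyphenation-char='-'",
    "hyphenation-char=\"-\"",
]

def expression_seen_by_fo_parser(attribute_markup):
    """Stage 1: let the XML parser strip the attribute quotes and resolve references."""
    element = ET.fromstring(f"<e {attribute_markup}/>")
    return element.get("hyphenation-char")

def is_literal(expression):
    """Stage 2: does the remaining text match the Literal production?"""
    return LITERAL.fullmatch(expression) is not None

for markup in CASES:
    expression = expression_seen_by_fo_parser(markup)
    print(markup, "->", repr(expression), "literal:", is_literal(expression))
```

Under these assumptions the first three spellings reach the expression parser with their inner quotes intact and match Literal, while the last two arrive as a bare hyphen and do not.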

J.Pietschmann




RE: <character>

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
> -----Original Message-----
> From: J.Pietschmann [mailto:j3322ptm@yahoo.de]
> Sent: October 6, 2002 1:29 PM
> To: fop-dev@xml.apache.org
> Subject: Re: <character>
> 
> 
> Arved Sandstrom wrote:
> > An Expr can be a Literal, the production for which is
> > 
> > '"' [^"]* '"'
> > | "'" [^']* "'"
> > 
> > If I look at the first alternative,
> > 
> > '"' [^"]* '"'
> > 
> > it seems to me that I have pretty considerable leeway, and "-" 
> isn't ruled
> > out at all.
> 
> Erm, the expression is supposed to be inside the XML attribute quotes,
> for example hyphenation-char="'-'" would be ok (literal, second
> production), but hyphenation-char="-" does not match the literal
> production, nor any other (except "operator"). Unless I missed
> something, of course.

And unless _I_ am missing something, "-" precisely matches that production.

You are looking at

"'" [^']* "'"

but I am looking at

'"' [^"]* '"'

According to the latter I can absolutely do "-".

Arved




Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Arved Sandstrom wrote:
> An Expr can be a Literal, the production for which is
> 
> '"' [^"]* '"'
> | "'" [^']* "'"
> 
> If I look at the first alternative,
> 
> '"' [^"]* '"'
> 
> it seems to me that I have pretty considerable leeway, and "-" isn't ruled
> out at all.

Erm, the expression is supposed to be inside the XML attribute quotes,
for example hyphenation-char="'-'" would be ok (literal, second
production), but hyphenation-char="-" does not match the literal
production, nor any other (except "operator"). Unless I missed
something, of course.

J.Pietschmann




RE: <character>

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
> -----Original Message-----
> From: J.Pietschmann [mailto:j3322ptm@yahoo.de]
> Sent: October 6, 2002 12:39 PM
> To: fop-dev@xml.apache.org
> Subject: Re: <character>
>
> Arved Sandstrom wrote:
> > Can you cite the specific productions that lead to this
> conclusion? I am not
> > saying that you are wrong but I can't find it.
> >
> > I must be tired. ;-) I just looked at the XML 1.1 production
> for AttValue
> > which is
>
> Don't look at XML AttValue, look at the XSLFO property expression
> language.
> Somehow it is implicit that all attributes in an XSLFO document are parsed
> as expressions which are defined in 5.9 "Expressions". Refer specifically
> to 5.9.3 "Basics". A single hyphen is not a valid expression according to
> the XSLFO expression grammar.
> Maybe some fallbacks are implicit somewhere, I don't know.

An Expr can be a Literal, the production for which is

'"' [^"]* '"'
| "'" [^']* "'"

If I look at the first alternative,

'"' [^"]* '"'

it seems to me that I have pretty considerable leeway, and "-" isn't ruled
out at all.

Arved




Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Arved Sandstrom wrote:
> Can you cite the specific productions that lead to this conclusion? I am not
> saying that you are wrong but I can't find it.
> 
> I must be tired. ;-) I just looked at the XML 1.1 production for AttValue
> which is

Don't look at XML AttValue, look at the XSLFO property expression language.
Somehow it is implicit that all attributes in an XSLFO document are parsed
as expressions which are defined in 5.9 "Expressions". Refer specifically
to 5.9.3 "Basics". A single hyphen is not a valid expression according to
the XSLFO expression grammar.
Maybe some fallbacks are implicit somewhere, I don't know.

J.Pietschmann




RE: <character>

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
> -----Original Message-----
> From: J.Pietschmann [mailto:j3322ptm@yahoo.de]
> Sent: October 6, 2002 12:00 PM
> To: fop-dev@xml.apache.org
> Subject: Re: <character>
>
>
> Arved Sandstrom wrote:
> > Why is character="-" a parsing error? The XML Recommendation
> has at least
> > one example of an attribute value that contains a hyphen.
>
> This comes from assuming that every unquoted sequence of characters which
> is not a number, measurement or a color has to be interpreted as an NCName,
> as the grammar suggests, and IIRC an NCName must not start with a hyphen.
> This means
>   hyphenation-char="-"
> can't parse as number, can't parse as string, can't parse as color, can't
> parse as NCName  -> parsing error.

Hi Joerg

Can you cite the specific productions that lead to this conclusion? I am not
saying that you are wrong but I can't find it.

I must be tired. ;-) I just looked at the XML 1.1 production for AttValue
which is

AttValue    ::=    '"' ([^<&"] | Reference)* '"'
   |  "'" ([^<&'] | Reference)* "'"

and I see a prohibition here on using a literal '<' or '&' in the attribute
value, anywhere. But I see nothing about '-'.

If the grammar of the recommendations leads to the conclusion that

character="-"

is not OK, then this just simply offends my common sense.

Arved




Re: <character>

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Arved Sandstrom wrote:
> Why is character="-" a parsing error? The XML Recommendation has at least
> one example of an attribute value that contains a hyphen.

This comes from assuming that every unquoted sequence of characters which
is not a number, measurement or a color has to be interpreted as an NCName,
as the grammar suggests, and IIRC an NCName must not start with a hyphen.
This means
  hyphenation-char="-"
can't parse as number, can't parse as string, can't parse as color, can't
parse as NCName  -> parsing error.
Interestingly,
  hyphenation-char="-1"
would parse, but certainly can't be converted to a char.
Some other niceties:
  hyphenation-char="1*4"
Would this make the hyphenation character be "4"?
Can
  hyphenation-char="1 div 4"
be converted to &#x00BC;? <bg> I know this becomes silly.
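
The "can't parse as number, string, color or NCName" reasoning can be sketched as a small classifier. This is a rough illustration only: the patterns are simplified, ASCII-only approximations I am assuming for the example (a leading minus is folded into the number pattern rather than treated as a unary operator), multi-token expressions such as 1*4 or 1 div 4 are deliberately out of scope, and the names TOKEN_CLASSES and classify are hypothetical.

```python
import re

# Simplified stand-ins for the token classes discussed above (assumptions, ASCII only).
TOKEN_CLASSES = [
    ("number", re.compile(r"-?(?:\d+(?:\.\d*)?|\.\d+)")),
    ("literal", re.compile("\"[^\"]*\"|'[^']*'")),
    ("color", re.compile(r"#[0-9A-Za-z]+")),
    ("ncname", re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")),
]

def classify(expression):
    """Return the first token class whose pattern matches the whole value, else None."""
    for name, pattern in TOKEN_CLASSES:
        if pattern.fullmatch(expression):
            return name
    return None

for value in ("-", "c", "-1", "'-'"):
    print(repr(value), "->", classify(value))
```

In this sketch a None result for a single-token value corresponds to the parsing error described above. A bare hyphen ends up there because no class accepts it, whereas "c" is taken as an NCName and "-1" as a number.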

>>How are you intending to implement
>><character>?
> 
> By storing it as a Unicode value according to the XML Rec production

Functions complicate matters, and something like
   hyphenation-char="from-table-column('hyphenation-char')"
might even make some sense.

J.Pietschmann




RE: <character>

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
> -----Original Message-----
> From: Peter B. West [mailto:pbwest@powerup.com.au]
> Sent: September 30, 2002 11:24 PM
> To: fop-dev@xml.apache.org
> Subject: Re: <character>
>
> Arved Sandstrom wrote:
> >>-----Original Message-----
> >>From: Tony Graham [mailto:Tony.Graham@Sun.COM]
>
> >>Peter B. West wrote at 30 Sep 2002 13:28:18 +1000:
> >> > Tony Graham wrote:
> >> > > jaccoud@petrobras.com.br wrote at 27 Sep 2002 16:44:32 -0300:
> >>...
> >> > >  > That means  "-", "#12235" , etc are characters, while
> >>"'1'" is not.
> >> > >
> >> > > &#12235; is a character reference.  '#12235' is how you
> talk about a
> >> > > character's code point, although the hexadecimal representation is
> >> > > usually preferable.
> >> > >
> >> > > In XSL terms, "'1'" is a one-character string literal, but
> while you
> >> > > could claim that it is one character, there's no XSL
> >>conversion from a
> >> > > string to a character, so <fo:character character="'1'"/>
> >>should fail.
> >> >
> >> > Tony,
> >> >
> >> > I don't think this gets us out of difficulty.  A casual inspection
> >>
> >>Forgive me, but I wasn't trying to get anybody out of any difficulty,
> >>I was just trying to keep the terminology accurate.
> >>
> >>...
> >> > So how do I represent a character?
> >> >
> >> > To me, the cleanest, least ambiguous way is to represent a
> <character>
> >> > attribute assignment value with "'<character>'" - a string literal of
> >> > length 1.
> >>
> >>Except that you know that that's not specified among the allowed
> >>conversions.
> >>
> >>The interesting thing is that 'character' doesn't appear in the
> >>productions in Section 5.9, Expressions, of the XSL Recommendation.
> >>Now there's a question for xsl-editors@w3.org!
> >>
> >>I think that you represent a character as a single character, e.g.,
> >>character="c", or as a numeric character reference, e.g.,
> >>character="&#xA;".
> >
> >
> > I agree with this last, after having digested everything.
> >
> > Point is well taken that we have some points to nitpick with
> xsl-editors,
> > mostly about disambiguating some of the language.
>
> Arved,
>
> Help me here. I must be missing something.  What is it that you agree
> with?  That the spec, as worded, leaves us with
>   character="c"
> or
>   character="&#x63;"
> which amounts to the same thing?

Yes, this is what I agree with.

> If so, fair enough.  Do you also agree that "c" is an NCName?  And that
>   character="-"
> is a parsing error?

Well, the production for NCName doesn't live in isolation, with reference to
http://www.w3.org/TR/REC-xml-names/#ns-decl. Yes, "c" fits the production,
but it's really an NCName when you have also declared the namespace.

Why is character="-" a parsing error? The XML Recommendation has at least
one example of an attribute value that contains a hyphen.

Maybe _I_ am missing something here. ;-)

> As far as I can see, the only immediate ways forward are to descend into
> the mire of context dependent parsing (which the editors have recently
> formally decided that we must do in respect of "format") or apply our
> own disambiguating condition.  How are you intending to implement
> <character>?

By storing it as a Unicode value according to the XML Rec production

Char    ::=    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]

It will depend on the implementation library. ICU for example has UChar and
UChar32 types.
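
As a rough illustration of storing the value as a Unicode code point checked against the Char production above, here is a minimal Python sketch. It is not how FOP does it; the names XML_CHAR_RANGES, is_xml_char and character_property are hypothetical, and the sketch assumes the attribute value has already been delivered by the XML parser as a one-character string.

```python
# Ranges copied from the XML Char production quoted above.
XML_CHAR_RANGES = (
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
)

def is_xml_char(code_point):
    """True if the code point is allowed by the XML Char production."""
    return any(low <= code_point <= high for low, high in XML_CHAR_RANGES)

def character_property(value):
    """Store a one-character attribute value as its Unicode code point."""
    if len(value) != 1:
        raise ValueError(f"expected exactly one character, got {value!r}")
    code_point = ord(value)
    if not is_xml_char(code_point):
        raise ValueError(f"U+{code_point:04X} is not an XML Char")
    return code_point

print(hex(character_property("-")))
print(hex(character_property("c")))
```

Python strings are sequences of code points, so a character outside the Basic Multilingual Plane still has length 1 here. A library with 16-bit code units would need the wider type, as with UChar32 in ICU.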

Regards,
Arved

