Posted to fop-dev@xmlgraphics.apache.org by kl...@club-internet.fr on 2002/02/17 23:24:04 UTC

REDESIGN: where I have been hiding

Hi all,

As you may have noticed I've been quite silent for a while now. It's not
for lack of interest but as usual for want of time. I've started a
project at work which is going to be eating most of my time and energy
for at least a couple of months more.

I'm happy to see that we seem to be getting some new blood in the group
and I applaud Keiron's "Understanding ... initiative". Perhaps you'd
like me to write a bit about the current property handling in that
series?

I've actually (finally!) done a bit of work on FOP so here's an update.

I've just committed some stuff into the main branch. I put a class
called CTM (coordinate transformation matrix as in PostScript/PDF) into
the area package. It's currently set up when the page master is making
regions. The idea is that it will transform writing-mode relative
coordinates into media-relative coordinates. For now media means
standard 1st quadrant coordinates as used in the default PDF or
PostScript coordinate system, where origin is at the lower left of the
page.

The CTM accounts for both reference-orientation and writing mode on
reference areas. There is a CTM at the page-reference area level which
is used to transform writing-mode relative region coordinates into media
coordinates. Similarly the CTM at the region level should transform
writing-mode relative coordinates for its child areas into media
coordinates. The layout managers should then generate Area objects whose
position and size are expressed in writing-mode relative values. So if
"x" = start, "y" = before, "width" = ipd and "height" = bpd, the CTM
should turn that into actual x, y coordinates on the page.
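
A minimal sketch of that mapping for the lr-tb and rl-tb cases, to make the idea concrete. It is not the CTM class committed to FOP: the name SketchCtm, its factory methods and the use of plain points as units are assumptions made for illustration, and reference-orientation is left out entirely.

```java
// Hypothetical illustration only; not the actual FOP CTM class.
final class SketchCtm {
    // PDF-style matrix [a b c d e f]:
    //   x' = a*x + c*y + e
    //   y' = b*x + d*y + f
    final double a, b, c, d, e, f;

    SketchCtm(double a, double b, double c, double d, double e, double f) {
        this.a = a; this.b = b; this.c = c;
        this.d = d; this.e = e; this.f = f;
    }

    /** lr-tb: "start" runs left to right, "before" runs top to bottom. */
    static SketchCtm forLrTb(double pageHeight) {
        // Media origin is lower left, so only the y axis is flipped.
        return new SketchCtm(1, 0, 0, -1, 0, pageHeight);
    }

    /** rl-tb: "start" runs right to left, "before" still runs top to bottom. */
    static SketchCtm forRlTb(double pageWidth, double pageHeight) {
        // Both axes are flipped relative to the media coordinate system.
        return new SketchCtm(-1, 0, 0, -1, pageWidth, pageHeight);
    }

    /** Maps a writing-mode relative point (start, before) to media (x, y). */
    double[] transform(double start, double before) {
        return new double[] {
            a * start + c * before + e,
            b * start + d * before + f
        };
    }
}
```

With a page height of 842, SketchCtm.forLrTb(842).transform(72, 100) places a point 72 along "start" and 100 along "before" at media coordinates (72, 742).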

The CTM class itself just does the basic math functions. I've put most
of the logic of setting up the CTM into the PropertyManager which may
not be the right place, but at least it's central. The method
"getCTMandRelDims" is called both from SimplePageMaster and Region (in
fo/pagination). The RelDims business is sort of a hack, but I wanted to
set inline and block-progression dimensions and I already had the info
from the CTM calculations... I'm sure one of you will have a better
idea!

The logic is certainly incomplete and the CTM currently has no effect on
the rendering logic. It may well be buggy too, but it compiles and runs
the "hello world" test. I will try to write some test cases, but I'm not
promising any dates.

I'm not sure what the best way to hook this in with rendering is. It may
well depend on the renderer. In PDF or PostScript the obvious easy
solution is just to set the CTM when entering the reference area, but
for other renderers this will probably not be possible.

I'm happy to answer any questions on this (if I can).

The other thing I've worked on is in the actual LayoutManager logic.
I've got this concept of a "BreakPosition" and have some code written at
the inline level (text, inline, line). The idea is that instead of an LM
calling generateAreas on its child LM, it repeatedly calls
"getNextBreakPosition". It then uses the returned BreakPosition
information to decide on the best break. Only then does it ask the child
LM to actually make the Areas necessary to break at the point. My goal
is to try to get this stuff ASAP into a state where it will at least
compile and can be put into the current code base.

Sorry for not being more active,
Karen

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: REDESIGN: where I have been hiding

Posted by "Peter B. West" <pb...@powerup.com.au>.
Karen,

It's good to hear from you.  In answer to your question, "Yes please," 
personally speaking.  I would like to hear, inter alia, about the timing 
of property resolution.

Peter

klease@club-internet.fr wrote:

>I'm happy to see that we seem to be getting some new blood in the group
>and I applaud Keiron's "Understanding ... initiative". Perhaps you'd
>like me to write a bit about the current property handling in that
>series?
>





Re: REDESIGN: where I have been hiding

Posted by kl...@club-internet.fr.
A couple of remarks below:

Keiron Liddle wrote:
> 
> Hi Karen,
> 
> Welcome back.
> Yes it would be great if you could write about the property handling stuff
> if you have the time.

Done.
 
> As for the CTM stuff.
> For PDF there is a class PDFState which keeps track of pdf graphic state
> information. Maybe this could be useful. I think the only way to deal with
> any transform is to start a new state (q, Q). Currently it is used by the
> PDFGraphics2D for SVG drawing. It helps to not need to put every drawing
> instruction into a new graphics state and colour etc. is only changed when
> needed.

PDFState currently is not used at all in the PDFRenderer. Maybe we could
integrate it in there too? For now, however, I've put some minimal code
in the renderer to handle the CTM. Basically I've bracketed the region
viewport code with start and end methods. The start method takes the
CTM, so the renderer can set it up. In PDF, I just save the current
context (q) and concatenate the CTM which seems to work. Then when I end
the reference area, I restore the previous state with 'Q'.
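
Roughly, the bracketing looks like the sketch below. This is not the PDFRenderer code: the class and method names are made up for the example, and the fixed three-decimal number format is an assumption. Only the PDF operators themselves (q, cm, Q) are the real thing.

```java
import java.util.Locale;

// Hypothetical illustration of bracketing a region viewport in a PDF
// content stream; not the actual FOP PDFRenderer.
final class PdfRegionBracket {
    private final StringBuilder stream = new StringBuilder();

    /** Saves the graphics state (q) and concatenates the CTM (cm). */
    void startRegionViewport(double[] ctm) {
        stream.append("q\n");
        stream.append(String.format(Locale.ROOT,
                "%.3f %.3f %.3f %.3f %.3f %.3f cm\n",
                ctm[0], ctm[1], ctm[2], ctm[3], ctm[4], ctm[5]));
    }

    /** Appends drawing operators emitted while inside the region. */
    void write(String operators) {
        stream.append(operators);
    }

    /** Restores the graphics state saved by startRegionViewport (Q). */
    void endRegionViewport() {
        stream.append("Q\n");
    }

    String content() {
        return stream.toString();
    }
}
```

Because q and Q nest, a region inside a page-level bracket picks up the page CTM and its own without either having to undo the other.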

All coordinates in the reference area should be relative to that
reference area origin where the "x" coordinate is the position along the
"start" axis, whatever that is, and the "y" coordinate is the position
along the "before" axis. Note however, that the viewport rectangle which
I'm storing on the RegionViewport is actually in absolute Page
coordinates (ie, origin at top-left of media rectangle).

I set a CTM at the page level in PDF which inverts the y coordinate and
subtracts it from the page-height. Because of that I also had to invert
the y coordinate in the text matrix (Tm) otherwise my letters were
upside down... But it actually made a PDF file :-)

Also just realized that CTM is the same thing (more or less) as
java.awt.geom.AffineTransform (except the way they write their
matrices). Maybe I'll see if I can substitute that in.
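
A small sketch of what that substitution might look like, using the page-level flip and the text-matrix flip described above. The class and method names are invented for the example. The one fact relied on is that the six-argument AffineTransform constructor takes its values in the same order as PDF's [a b c d e f].

```java
import java.awt.geom.AffineTransform;

// Hypothetical illustration; not code from FOP.
final class PageFlipSketch {
    /** Page-level transform: y' = pageHeight - y (top-left to bottom-left origin). */
    static AffineTransform pageCtm(double pageHeight) {
        // Argument order (m00, m10, m01, m11, m02, m12) lines up with [a b c d e f].
        return new AffineTransform(1, 0, 0, -1, 0, pageHeight);
    }

    /** Text matrix at (x, y) with a second y flip so the glyphs are upright again. */
    static AffineTransform textMatrix(double x, double y) {
        return new AffineTransform(1, 0, 0, -1, x, y);
    }

    /** Net transform seen by the glyphs: the text matrix first, then the page CTM. */
    static AffineTransform effective(double pageHeight, double x, double y) {
        AffineTransform t = pageCtm(pageHeight);
        t.concatenate(textMatrix(x, y)); // t becomes page(text(p))
        return t;
    }
}
```

The two flips cancel: for a page height of 842 and text at (72, 100), the effective transform has a y scale of +1 and a y translation of 742, which is why the letters come out the right way up.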
 
> I was also wondering if instead of the call "LayoutManager
> getLayoutManager()" it might be better to use "void
> addLayoutManager(LayoutManager parent)". So in cases, for example wrapper,
> where the element does not have a layout manager but there could be
> multiple children that have a layout manager it will be easier to handle.
> It does change how the parent layout manager handles its children.

Yes that sounds good to me. Note also that I have some special handling
at the Block/Line level to make the LineLayoutManager.

>  From the brief description of the "getNextBreakPosition" it looks like it
> might be a good idea.

> Keiron.


Re: REDESIGN: where I have been hiding

Posted by Keiron Liddle <ke...@aftexsw.com>.
Hi Karen,

Welcome back.
Yes it would be great if you could write about the property handling stuff 
if you have the time.

As for the CTM stuff.
For PDF there is a class PDFState which keeps track of pdf graphic state 
information. Maybe this could be useful. I think the only way to deal with 
any transform is to start a new state (q, Q). Currently it is used by the 
PDFGraphics2D for SVG drawing. It helps to not need to put every drawing 
instruction into a new graphics state and colour etc. is only changed when 
needed.

I was also wondering if instead of the call "LayoutManager 
getLayoutManager()" it might be better to use "void 
addLayoutManager(LayoutManager parent)". So in cases, for example wrapper, 
where the element does not have a layout manager but there could be 
multiple children that have a layout manager it will be easier to handle. 
It does change how the parent layout manager handles its children.

 From the brief description of the "getNextBreakPosition" it looks like it 
might be a good idea.

Keiron.



Re: REDESIGN: where I have been hiding

Posted by kl...@club-internet.fr.
"Peter B. West" wrote:

I'll try to explain the algorithm a bit for line-building; maybe that
will help to clarify what I meant.

The TextLayoutManager generates a BreakPosition for each possible line
break (not including hyphenation at first). This means breakable spaces
or other possible line end characters like hard hyphens (maybe some
UserAgent list of what constitutes a reasonable linebreak??). The parent
of the TextLM is either an InlineLM or a LineLM. If it's an InlineLM, it
just "wraps" the BreakPosition from the Text by adding any extra space
at the inline level (space-start, space-end, padding...). At the LineLM
level, the manager knows how much space it has available for the
LineArea. It looks at each BreakPosition to see if it still fits in that
space. When it sees one that doesn't fit (ie, the BreakPosition is
beyond the end of its available inline-progression-dimension), it will
then go into hyphenation mode to try to find a break between the
previous BP (which still fit) and the new one. This is where it may
decide on various options, such as more or less stretch in the
white-space, vs hyphens in succeeding lines vs keeps etc.
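
A rough sketch of that line-level loop, under heavy simplifying assumptions: every character has the same width, the only break opportunities are spaces, and the hyphenation step is reduced to a comment. The class names, fields and method signatures below are made up for the example and are not the code in the tree.

```java
// Hypothetical illustration of the BreakPosition idea; not FOP code.
final class BreakPosition {
    final int endIndex; // text index just past the last character kept on the line
    final int ipd;      // inline-progression-dimension consumed up to endIndex

    BreakPosition(int endIndex, int ipd) {
        this.endIndex = endIndex;
        this.ipd = ipd;
    }
}

final class TextLM {
    private final String text;
    private final int charWidth; // assumption: fixed-width glyphs
    private int cursor = 0;

    TextLM(String text, int charWidth) {
        this.text = text;
        this.charWidth = charWidth;
    }

    /** Returns the next possible break (at a space or at end of text), or null. */
    BreakPosition getNextBreakPosition() {
        if (cursor >= text.length()) {
            return null;
        }
        int space = text.indexOf(' ', cursor);
        int end = (space < 0) ? text.length() : space;
        cursor = end + 1; // continue after the space next time
        return new BreakPosition(end, end * charWidth);
    }
}

final class LineLM {
    /** Keeps asking the child for breaks and returns the last one that fits. */
    static BreakPosition chooseBreak(TextLM child, int availableIpd) {
        BreakPosition best = null;
        BreakPosition bp;
        while ((bp = child.getNextBreakPosition()) != null) {
            if (bp.ipd > availableIpd) {
                // First break that does not fit: hyphenation between
                // 'best' and 'bp' would be attempted here.
                break;
            }
            best = bp;
        }
        return best; // null if not even the first break fits
    }
}
```

Only after chooseBreak has returned would the parent ask the child to make areas up to the chosen position. An InlineLM in between would wrap each BreakPosition and add its own space to the ipd before passing it up.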

At the block level, the analogous logic is in the Flow LayoutManager. It
will look at various BreakPositions which express how many Lines or
other block-stacked Areas can fit in the current Flow Area. It makes its
decision based on keep conditions and white-space stretch.

Regards,
Karen

> klease@club-internet.fr wrote:
> 
> >The other thing I've worked on is in the actual LayoutManager logic.
> >I've got this concept of a "BreakPosition" and have some code written at
> >the inline level (text, inline, line). The idea is that instead of an LM
> >calling generateAreas on its child LM, it repeatedly calls
> >"getNextBreakPosition". It then uses the returned BreakPosition
> >information to decide on the best break. Only then does it ask the child
> >LM to actually make the Areas necessary to break at the point.
> >
> Karen,
> 
> I like the idea of BreakPosition being constantly updated (see my notes
> on co-routines).  What do you mean by "decide on the best break"?  What
> sort of things do you see going into that decision that cannot, in
> essence, be decided by the child?  Are you thinking about the resolution
> of ambiguous situations which require knowledge of partial results from
> a number of parallel area subtrees?
> 
> Peter


Re: REDESIGN: where I have been hiding

Posted by "Peter B. West" <pb...@powerup.com.au>.
klease@club-internet.fr wrote:

>The other thing I've worked on is in the actual LayoutManager logic.
>I've got this concept of a "BreakPosition" and have some code written at
>the inline level (text, inline, line). The idea is that instead of an LM
>calling generateAreas on its child LM, it repeatedly calls
>"getNextBreakPosition". It then uses the returned BreakPosition
>information to decide on the best break. Only then does it ask the child
>LM to actually make the Areas necessary to break at the point.
>
Karen,

I like the idea of BreakPosition being constantly updated (see my notes 
on co-routines).  What do you mean by "decide on the best break"?  What 
sort of things do you see going into that decision that cannot, in 
essence, be decided by the child?  Are you thinking about the resolution 
of ambiguous situations which require knowledge of partial results from 
a number of parallel area subtrees?

Peter




RE: [PROPOSAL] linebreak

Posted by ewitness - Ben Fowler <bf...@ewitness.co.uk>.
>Comments below.
>
>[ snip ]
>
>3. Final discussion comment: XSL formatters _do_ ignore the presence of
>linefeeds (in one of several different interpretations of "ignore") by
>default. By choosing 'preserve' for linefeed-treatment you _are_ basically
>doing a <PRE> operation, with respect to linefeeds. So I don't see much of a
>difference or any grounds for objection.
>
>But I do see an argument for a semantic linebreak in the source. Relying on
>linefeeds or the lack thereof in source XML is a bit problematic.

Thank you. I don't exactly have a problem with the mechanism itself,
more that it is too complicated for most people to understand without
a tutor (as I found). This can be countered by an argument (which I
accept) that .fo files are usually machine produced, and are not
pretty printed or edited. Against that, (1) the fragments that make
up an .fo file most certainly are, and (2) there is no bar to creating
a .fo file directly or by some mechanism other than XSLT.

>4. normalize-space(): The XPath function takes tabs, spaces, carriage
>returns and linefeeds and does what you say. I think that the existing
>string functions in XPath/XSLT are not sufficiently powerful to easily do
>what you wish; OTOH the activities of the XSL people to come out with a new
>XSLT and XPath include regular expressions (see
>http://www.w3.org/TR/xquery-operators/) so this is one way in which you
>could do what you want.

Thank you. I am simply not that familiar with XPath. The issue arose,
as you might have guessed by an editor assuming that it could add any
amount of white space at the beginning or end of an element (quite
reasonable in the XML world), and I had assumed that there would be
a matching function that would remove it. Maybe I am making a mountain
out of a molehill. I feel that this is a result of trying to
standardise too early, id est without sufficient experience, or
experience of sufficient duration.

Ben.



RE: [PROPOSAL] linebreak

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
Comments below.

-----Original Message-----
From: ewitness - Ben Fowler [mailto:bfowler@ewitness.co.uk]
Sent: February 25, 2002 9:41 AM
To: fop-dev@xml.apache.org
Subject: RE: [PROPOSAL] linebreak


> >
>>I guess the reason nobody thought <fo:br/> or <fo:newline/> would be
> >required is because a U+000A will do the trick.
>
> [ snip ]
>
>In any case, a linefeed (LF) must be honoured, and result in a linebreak.
>_If_ the conditions are right. What that means is, the initial value for
>"linefeed-treatment" is "treat-as-space", which _does_ do a conversion of
>U+000A to U+0020 (space). So you would want to specify
>"linefeed-treatment='preserve'" on an ancestor flow object (possibly
>fo:root) and allow it to propagate to the FOs of interest, as it is
>inheritable. The "whitespace-*" properties do not affect the linefeed, and
>suppress-at-line-break can also be left as it is.
>
>But essentially the LF is there to accomplish what you want to do. The
>initial setting of "linefeed-treatment" acts to give us LaTeX-like
>behaviour, but unlike LaTeX we can switch to something different in this
>regard, rather than use new markup.
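
The effect of the values named in this thread can be modelled in a few lines. This is only a sketch of the behaviour as described here, not FOP's implementation and not a full reading of the spec: the class and enum names are invented, and only 'ignore', 'preserve' and 'treat-as-space' are covered.

```java
// Hypothetical illustration of linefeed-treatment; not FOP code.
final class LinefeedSketch {
    enum Treatment { IGNORE, PRESERVE, TREAT_AS_SPACE }

    /** Applies the treatment to every U+000A in the text. */
    static String apply(String text, Treatment treatment) {
        switch (treatment) {
            case IGNORE:
                return text.replace("\n", "");
            case TREAT_AS_SPACE:
                return text.replace('\n', ' ');
            default: // PRESERVE: the linefeed survives and forces a break
                return text;
        }
    }

    /** Splits the treated text at whatever linefeeds remain. */
    static String[] lines(String text, Treatment treatment) {
        return apply(text, treatment).split("\n", -1);
    }
}
```

With the initial value, "a\nb" ends up as the single line "a b"; with 'preserve' it becomes two lines.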

The answer that you gave is also to be found a few lines down
from the first URL I gave you

	4.	 Forced line-breaks are respected. Specifically, if A
	is the glyph-area generated by a fo:character whose Unicode
	character is U+000A, then A must be the last area in its
	containing subset Si.

I don't mind admitting that as an outsider to the XML standard, this
looks like a bad, even a really bad, idea.

My reading of your commentary is "Whitespace is sometimes respected,
and only a language lawyer can tell you when".

How should this be interpreted?

Do you think that HTML would be improved if the <BR> element was
replaced with a feature that said "You can get the effect of a
forced linebreak by setting 'linefeed-treatment' to 'preserve'
in the <body> of the page (or other container as required), which
causes all unix line feeds to be rendered" instead of the <br /> element
which is what was done?

From my POV this has an inhibiting effect on all editors and pretty
printing utilities, which must also respect existing white space
(as XSL processors do) and never introduce line feeds, in case this
setting was ever turned on. From my POV, a formatter should always
ignore the formatting of the source, unless notified that it is
preformatted as in the case of <PRE> and CDATA, exempli gratia
<URL: http://archive.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/sperberg-mcqueen/sperberg.html >,
(1994) about half way down.

Do you happen to know whether this was ever discussed (id est objections
sought and answered) or whether this was one person's idea that
was incorporated as is.

I have a related 'issue' which is that the normalize-string( ) function
in XSL does two things. It trims leading and trailing newlines
and other whitespace, and it normalises internal white space.
I have a need for an operation that does the former, but not the latter.

(In fact I have an implementation which appears to be buggy
and replaces 'Miss A Burgrave' with 'Miss ABurgrave', but handles
'Miss A  Burgrave' correctly.)
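
For illustration, the two operations can be kept apart as below. This is a plain Java sketch, not XSLT and not the buggy implementation mentioned above. The class and method names are made up, and whitespace is taken to mean space, tab, carriage return and linefeed.

```java
// Hypothetical illustration: trimming only versus full normalisation.
final class WhitespaceSketch {
    // One or more of: space, tab, carriage return, linefeed.
    private static final String WS = "[ \\t\\r\\n]+";

    /** Removes leading and trailing whitespace, leaving internal runs alone. */
    static String trimOnly(String s) {
        return s.replaceAll("\\A" + WS + "|" + WS + "\\z", "");
    }

    /** Trims, then collapses each internal run of whitespace to one space. */
    static String normalizeSpace(String s) {
        return trimOnly(s).replaceAll(WS, " ");
    }
}
```

Given "  Miss A  Burgrave \n", trimOnly keeps the double space inside the name, while normalizeSpace reduces it to a single space. Neither ever removes the last space between two words.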

In short, XML processors including ones that produce XML-FO files
should pass through all whitespace, and processors such as fop
which are also XML processors, but adjusted so that they do not
produce XML, should (at least in general) normalise whitespace.
Where the output file format respects whitespace then it should
be supplied as <fo:text> or as some break (as my original suggestion).
The present situation is that the latter type of processor may not
normalise whitespace, because some newlines are significant.

Incidentally, you have not made (or reported) a case against my suggestion:
unless it is harmful (or confusing) there is no real reason why both
styles of indicating significant breaks could not be used, is there?

[ SNIP example ]

line-feed treatment was reported as not working in June last
year, <URL: http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1998 >,
I don't know whether it is now in.

I now have a linux installation (but not yet CVS), and so I am in a position
to start some development work on FOP. Where should I start? Is there a list
of outstanding tasks?

I wrote that a few days ago, but delayed sending it until I could
see what bugzilla could tell me. In the meantime, bugzilla has sent
me an e-mail giving no fewer than 195 issues. My search on bugzilla
reveals 18 high priority bugs.
<URL: http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&priority=High&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=& >

Nonetheless, my query remains: is there a list of issues which
people can start working on now, that won't need to be re-done
once the redesign is in place?

Ben

********************************************************

My Comments:

1. Bear in mind that 'linefeed-treatment' need not be a global. You can
leave it to the initial value ('treat-as-space'), or change it to 'ignore'
if you like (whatever suits) and so the document as a whole will have LaTeX
or HTML-like behaviour when it comes to linefeeds in text.

But for specific blocks you can explicitly set the value to 'preserve', and
then you know that linefeeds in that block will be acted upon.

2. However, I gather you don't like that much. Even if FOP worked in this
regard you still want an explicit linebreak. OK, let's operate on that
premise. And let's assume that we use the spec as it is. In this case one
option for your stylesheet is to implement the above (in comment 1):

<xsl:if test="br">
	<xsl:attribute name="linefeed-treatment">preserve</xsl:attribute>
</xsl:if>

If you have this inside each template of interest:

<xsl:template match="para">
<fo:block>
    <xsl:if test="br">
        <xsl:attribute name="linefeed-treatment">preserve</xsl:attribute>
    </xsl:if>
    <xsl:apply-templates/>
</fo:block>
</xsl:template>

then the presence of a <br/> will throw the switch for that block.

You'd want to finetune this test, so as to reset the switch for descendant
blocks that do _not_ contain <br/>, but you get the idea.

I have no idea how expensive this approach is in terms of XSLT processing
but my gut feeling is it's probably not too bad.

3. Final discussion comment: XSL formatters _do_ ignore the presence of
linefeeds (in one of several different interpretations of "ignore") by
default. By choosing 'preserve' for linefeed-treatment you _are_ basically
doing a <PRE> operation, with respect to linefeeds. So I don't see much of a
difference or any grounds for objection.

But I do see an argument for a semantic linebreak in the source. Relying on
linefeeds or the lack thereof in source XML is a bit problematic.

4. normalize-space(): The XPath function takes tabs, spaces, carriage
returns and linefeeds and does what you say. I think that the existing
string functions in XPath/XSLT are not sufficiently powerful to easily do
what you wish; OTOH the activities of the XSL people to come out with a new
XSLT and XPath include regular expressions (see
http://www.w3.org/TR/xquery-operators/) so this is one way in which you
could do what you want.

Regards,
Arved Sandstrom




RE: [PROPOSAL] linebreak

Posted by ewitness - Ben Fowler <bf...@ewitness.co.uk>.
> >
>>I guess the reason nobody thought <fo:br/> or <fo:newline/> would be
> >required is because a U+000A will do the trick.
>
> [ snip ]
>
>In any case, a linefeed (LF) must be honoured, and result in a linebreak.
>_If_ the conditions are right. What that means is, the initial value for
>"linefeed-treatment" is "treat-as-space", which _does_ do a conversion of
>U+000A to U+0020 (space). So you would want to specify
>"linefeed-treatment='preserve'" on an ancestor flow object (possibly
>fo:root) and allow it to propagate to the FOs of interest, as it is
>inheritable. The "whitespace-*" properties do not affect the linefeed, and
>suppress-at-line-break can also be left as it is.
>
>But essentially the LF is there to accomplish what you want to do. The
>initial setting of "linefeed-treatment" acts to give us LaTeX-like
>behaviour, but unlike LaTeX we can switch to something different in this
>regard, rather than use new markup.

The answer that you gave is also to be found a few lines down
from the first URL I gave you

	4.	 Forced line-breaks are respected. Specifically, if A
	is the glyph-area generated by a fo:character whose Unicode
	character is U+000A, then A must be the last area in its
	containing subset Si.

I don't mind admitting that as an outsider to the XML standard, this
looks like a bad, even a really bad, idea.

My reading of your commentary is "Whitespace is sometimes respected,
and only a language lawyer can tell you when".

How should this be interpreted?

Do you think that HTML would be improved if the <BR> element was
replaced with a feature that said "You can get the effect of a
forced linebreak by setting 'linefeed-treatment' to 'preserve'
in the <body> of the page (or other container as required), which
causes all unix line feeds to be rendered" instead of the <br /> element
which is what was done?

From my POV this has an inhibiting effect on all editors and pretty
printing utilities, which must also respect existing white space
(as XSL processors do) and never introduce line feeds, in case this
setting was ever turned on. From my POV, a formatter should always
ignore the formatting of the source, unless notified that it is
preformatted as in the case of <PRE> and CDATA, exempli gratia
<URL: http://archive.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/sperberg-mcqueen/sperberg.html >,
(1994) about half way down.

Do you happen to know whether this was ever discussed (id est objections
sought and answered) or whether this was one person's idea that
was incorporated as is?

I have a related 'issue' which is that the normalize-space( ) function
in XSL does two things. It trims leading and trailing newlines
and other whitespace, and it normalises internal white space.
I have a need for an operation that does the former, but not the latter.

(In fact I have an implementation which appears to be buggy
and replaces 'Miss A Burgrave' with 'Miss ABurgrave', but handles
'Miss A  Burgrave' correctly.)
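
To make the distinction concrete, here is a minimal sketch in Python
(not XSLT) of the two operations. It assumes the XML definition of
whitespace (space, tab, CR, LF); the function names normalize_space
and trim_only are made up for this illustration, the first one
mimicking what XPath's normalize-space() does.

```python
import re

# Whitespace as defined by XML: space, tab, carriage return, line feed.
XML_WS = " \t\r\n"


def normalize_space(s):
    """Mimic XPath normalize-space(): trim both ends AND collapse
    every internal run of whitespace to a single space."""
    return re.sub(r"[ \t\r\n]+", " ", s).strip(" ")


def trim_only(s):
    """Hypothetical helper: trim both ends, leave internal whitespace alone."""
    return s.strip(XML_WS)


sample = "  Miss A  Burgrave \n"
print(repr(normalize_space(sample)))  # internal double space collapsed
print(repr(trim_only(sample)))        # internal double space kept
```

Both functions leave 'Miss A Burgrave' unchanged, so neither should
ever join the 'A' to the surname.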

In short, XML processors including ones that produce XML-FO files
should pass through all whitespace, and processors such as fop
which are also XML processors, but adjusted so that they do not
produce XML, should (at least in general) normalise whitespace.
Where the output file format respects whitespace then it should
be supplied as <fo:text> or as some break (as in my original suggestion).
The present situation is that the latter type of processor may not
normalise whitespace, because some newlines are significant.

Incidentally, you have not made (or reported) a case against my suggestion:
unless it is harmful (or confusing) there is no real reason why both
styles of indicating significant breaks could not be used, is there?

Using FOP derived from version 0.14, I got this report when I tried
the following .fo:

	WARNING: property 'linefeed-treatment' ignored
	WARNING: property 'linefeed-treatment' ignored
	setting up fonts
	formatting FOs into areas
	[1]
	rendering areas to PDF

(source)

	<?xml version="1.0" encoding="UTF-8"?>
	<fo:root
			xmlns:fo="http://www.w3.org/1999/XSL/Format"
			text-align="justified" font-size="12pt" font-family="serif"
			linefeed-treatment='preserve' >
		<fo:layout-master-set>
			<fo:simple-page-master
					margin-right="50pt" margin-left="100pt"
					margin-bottom="25pt" margin-top="75pt" master-name="all">
				<fo:region-body margin-bottom="50pt" />
				<fo:region-after extent="25pt" />
			</fo:simple-page-master>
		</fo:layout-master-set>
		<fo:page-sequence id="" hyphenate="true" master-name="all" language="en">
			<fo:flow flow-name="xsl-region-body">
				<fo:block linefeed-treatment='preserve'>
					Bilbo Baggins,
					Bag End,
					Underhill,
					Hobbiton,
					Westfarthing of the Shire.
				</fo:block>
			</fo:flow>
		</fo:page-sequence>
	</fo:root>

Linefeed treatment was reported as not working in June last
year, <URL: http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1998 >;
I don't know whether it is now in.

I now have a linux installation (but not yet CVS), and so I am in a position to
start some development work on FOP. Where should I start? Is there a list
of outstanding tasks?

I wrote that a few days ago, but delayed sending it until I could
see what bugzilla could tell me. In the meantime, bugzilla has sent
me an e-mail giving no fewer than 195 issues. My search on bugzilla
reveals 18 high priority bugs.
<URL: http://nagoya.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&priority=High&email1=&emailtype1=substring&emailassigned_to1=1&email2=&emailtype2=substring&emailreporter2=1&bugidtype=include&bug_id=&changedin=&votes=& >

Nonetheless, my query remains: is there a list of issues which
people can start working on now, that won't need to be re-done
once the redesign is in place?


Ben.







---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


RE: [PROPOSAL] linebreak was Re: REDESIGN: where I have been hiding

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
-----Original Message-----
From: ewitness - Ben Fowler [mailto:bfowler@ewitness.co.uk]
Sent: February 18, 2002 9:36 PM
To: fop-dev@xml.apache.org
Subject: RE: [PROPOSAL] linebreak was Re: REDESIGN: where I have been
hiding

>This would be useful in writing addresses exempli gratia:
>
><?xml version="1.0" encoding="UTF-8"?>
><fo:root text-align="justified" font-size="12pt" font-family="serif">
>	<fo:block>
>		Bilbo Baggins,<fo:newline />
>		Bag End,<fo:newline />
>		Underhill,<fo:newline />
>		Hobbiton,<fo:newline />
>		Westfarthing of the Shire.
>	</fo:block>
></fo:root>
>
>At present, I can get the effect I want with tables.
>
>Ben.
>-----end of Original Message-----
>
>I guess the reason nobody thought <fo:br/> or <fo:newline/> would be
>required is because a U+000A will do the trick.

Thank you. I had assumed that that character would count as white
space, and would be normalised away.

I will try it.

Ben.

---------------------------------------------------------------------

My answer was so terse that maybe it sounded snippy, which was not my
intention.

I also can't say that FOP is up to spec with whitespace handling. I'm
thinking that it's not, but I'll have to check myself. So my comments are
related to the spec only.

In any case, a linefeed (LF) must be honoured, and result in a linebreak.
_If_ the conditions are right. What that means is, the initial value for
"linefeed-treatment" is "treat-as-space", which _does_ do a conversion of
U+000A to U+0020 (space). So you would want to specify
"linefeed-treatment='preserve'" on an ancestor flow object (possibly
fo:root) and allow it to propagate to the FOs of interest, as it is
inheritable. The "whitespace-*" properties do not affect the linefeed, and
suppress-at-line-break can also be left as it is.

But essentially the LF is there to accomplish what you want to do. The
initial setting of "linefeed-treatment" acts to give us LaTeX-like
behaviour, but unlike LaTeX we can switch to something different in this
regard, rather than use new markup.
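
As a rough illustration of the spec behaviour described above (and not
of what FOP currently does), the four values of "linefeed-treatment"
can be modelled in a few lines of Python. The function names are made
up for this sketch, and later refinement steps such as whitespace
collapsing are deliberately left out.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def apply_linefeed_treatment(text, treatment="treat-as-space"):
    """Apply one linefeed-treatment value to every U+000A in text.

    The default mirrors the property's initial value, "treat-as-space".
    """
    if treatment == "preserve":
        return text                      # LF survives and can force a break
    if treatment == "treat-as-space":
        return text.replace("\n", " ")   # U+000A becomes U+0020
    if treatment == "treat-as-zero-width-space":
        return text.replace("\n", ZWSP)  # U+000A becomes U+200B
    if treatment == "ignore":
        return text.replace("\n", "")    # LF is discarded
    raise ValueError(f"unknown linefeed-treatment: {treatment!r}")


def forced_lines(text, treatment="treat-as-space"):
    """Hypothetical helper: split at whatever linefeeds are still present."""
    return apply_linefeed_treatment(text, treatment).split("\n")


address = ("Bilbo Baggins,\nBag End,\nUnderhill,\n"
           "Hobbiton,\nWestfarthing of the Shire.")
print(forced_lines(address, "preserve"))  # one entry per address line
print(forced_lines(address))              # a single run-together line
```

With "preserve" the address comes out as five separate lines; with the
initial value it is one line in which each linefeed has become a space.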

Regards,
AHS




RE: [PROPOSAL] linebreak was Re: REDESIGN: where I have been hiding

Posted by ewitness - Ben Fowler <bf...@ewitness.co.uk>.
>This would be useful in writing addresses exempli gratia:
>
><?xml version="1.0" encoding="UTF-8"?>
><fo:root text-align="justified" font-size="12pt" font-family="serif">
>	<fo:block>
>		Bilbo Baggins,<fo:newline />
>		Bag End,<fo:newline />
>		Underhill,<fo:newline />
>		Hobbiton,<fo:newline />
>		Westfarthing of the Shire.
>	</fo:block>
></fo:root>
>
>At present, I can get the effect I want with tables.
>
>Ben.
>-----end of Original Message-----
>
>I guess the reason nobody thought <fo:br/> or <fo:newline/> would be
>required is because a U+000A will do the trick.

Thank you. I had assumed that that character would count as white
space, and would be normalised away.

I will try it.

Ben.



RE: [PROPOSAL] linebreak was Re: REDESIGN: where I have been hiding

Posted by Arved Sandstrom <Ar...@chebucto.ns.ca>.
-----Original Message-----
From: ewitness - Ben Fowler [mailto:bfowler@ewitness.co.uk]
Sent: February 18, 2002 11:29 AM
To: fop-dev@xml.apache.org
Subject: [PROPOSAL] linebreak was Re: REDESIGN: where I have been hiding

[ SNIP ]

Could we have a line break feature?

The layoutmanager should fill a block by stacking lines based
on the current line height. It ends lines based on word length
and hyphenation. I would like to force a line break with an inline
such as <fo:br /> or <fo:newline />.

Since this is such an obvious thing to want, I guess that there
must be a reason why it is not part of the specification.

Does anyone know why not?

This is simply the difference expressed in TeX terms between

	\newline, and
	\par

This would be useful in writing addresses exempli gratia:

<?xml version="1.0" encoding="UTF-8"?>
<fo:root text-align="justified" font-size="12pt" font-family="serif">
	<fo:block>
		Bilbo Baggins,<fo:newline />
		Bag End,<fo:newline />
		Underhill,<fo:newline />
		Hobbiton,<fo:newline />
		Westfarthing of the Shire.
	</fo:block>
</fo:root>

At present, I can get the effect I want with tables.

Ben.
-----end of Original Message-----

I guess the reason nobody thought <fo:br/> or <fo:newline/> would be
required is because a U+000A will do the trick.

Regards,
Arved Sandstrom




[PROPOSAL] linebreak was Re: REDESIGN: where I have been hiding

Posted by ewitness - Ben Fowler <bf...@ewitness.co.uk>.
>The other thing I've worked on is in the actual LayoutManager logic.
>I've got this concept of a "BreakPosition" and have some code written at
>the inline level (text, inline, line). The idea is that instead of an LM
>calling generateAreas on its child LM, it repeatedly calls
>"getNextBreakPosition". It then uses the returned BreakPosition
>information to decide on the best break. Only then does it ask the child
>LM to actually make the Areas necessary to break at the point. My goal
>is to try to get this stuff ASAP into a state where it will at least
>compile and can be put into the current code base.
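
A minimal sketch of the protocol described in the quoted paragraph,
written in Python rather than FOP's Java. Only the names BreakPosition
and getNextBreakPosition come from the message. The index field, the
make_areas method, the one-unit-per-character widths and the greedy
"last break that fits" choice are all assumptions made for
illustration; the real code may decide on the best break differently.

```python
from dataclasses import dataclass


@dataclass
class BreakPosition:
    index: int  # hypothetical field: text offset at which a break may occur


class TextLM:
    """Toy child layout manager that offers a break after every word."""

    def __init__(self, text):
        self.text = text
        self._cursor = 0

    def get_next_break_position(self):
        """Return the next possible break, or None when the text is used up."""
        if self._cursor >= len(self.text):
            return None
        space = self.text.find(" ", self._cursor)
        end = len(self.text) if space == -1 else space
        self._cursor = end + 1
        return BreakPosition(index=end)

    def make_areas(self, start, bp):
        """Only now is the 'area' (here just a string) actually built."""
        return self.text[start:bp.index]


def build_lines(child, line_width):
    """Parent LM: collect break positions, pick one, then request areas."""
    lines, start, best = [], 0, None
    while True:
        bp = child.get_next_break_position()
        if bp is None:
            break
        # Assumed policy: break at the last position that still fits.
        if best is not None and bp.index - start > line_width:
            lines.append(child.make_areas(start, best))
            start = best.index + 1  # skip the space at the break
        best = bp
    if best is not None:
        lines.append(child.make_areas(start, best))
    return lines


print(build_lines(TextLM("Bag End, Underhill, Hobbiton"), 12))
```

The point of the design is visible in build_lines: the parent never
asks the child for areas until it has settled on a break.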

It might be possible to rule this out at once, on the grounds
of its being outside the FO standard, see
<URL: http://www.w3.org/TR/2001/REC-xsl-20011015/slice4.html#area-linebuild >
and <URL: http://www.idealliance.org/papers/xml2001/papers/html/03-05-06.html >
but I have a request based on some experience with writing FO files
by hand.

Could we have a line break feature?

The layoutmanager should fill a block by stacking lines based
on the current line height. It ends lines based on word length
and hyphenation. I would like to force a line break with an inline
such as <fo:br /> or <fo:newline />.
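
The request above, sketched in Python under simplifying assumptions:
lines are filled greedily by character count, and a marker in the
token stream ends the current line early. BR is a hypothetical token
standing in for the proposed <fo:br /> or <fo:newline />, neither of
which exists in the specification. Hyphenation and line height are
not modelled.

```python
BR = object()  # hypothetical forced-break token (stand-in for <fo:newline />)


def fill_block(tokens, max_chars):
    """Stack lines from words, ending a line when the next word does not
    fit or when a BR token is met."""
    lines, current = [], []
    for tok in tokens:
        if tok is BR:
            # Forced break: close the line regardless of remaining room.
            lines.append(" ".join(current))
            current = []
            continue
        candidate = " ".join(current + [tok])
        if current and len(candidate) > max_chars:
            # Natural break: the word would overflow, so start a new line.
            lines.append(" ".join(current))
            current = [tok]
        else:
            current.append(tok)
    if current:
        lines.append(" ".join(current))
    return lines


tokens = ["Bilbo", "Baggins,", BR, "Bag", "End,", BR, "Underhill,"]
print(fill_block(tokens, 40))
```

Without the BR tokens the same words would run together on one line
whenever the measure is wide enough, which is the behaviour the
proposal wants to be able to override.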

Since this is such an obvious thing to want, I guess that there
must be a reason why it is not part of the specification.

Does anyone know why not?

This is simply the difference expressed in TeX terms between

	\newline, and
	\par

This would be useful in writing addresses exempli gratia:

<?xml version="1.0" encoding="UTF-8"?>
<fo:root text-align="justified" font-size="12pt" font-family="serif">
	<fo:block>
		Bilbo Baggins,<fo:newline />
		Bag End,<fo:newline />
		Underhill,<fo:newline />
		Hobbiton,<fo:newline />
		Westfarthing of the Shire.
	</fo:block>
</fo:root>

At present, I can get the effect I want with tables.

Ben.


