Posted to fop-dev@xmlgraphics.apache.org by Victor Mote <vi...@outfitr.com> on 2002/07/31 21:23:10 UTC

forward references & related design issues

FOP Developers:

Seeing all of the posts about performance issues, esp. related to forward
references, has brought to mind some design musings I had on a related topic
several months ago, a subset of a bigger issue that is important for work
that one of our clients does. They publish several related documents in a
collection, with cross-references between them -- for example, document A
might say "See page x in document B". Now if document A references 20 such
documents, one could conceivably end up with 21 documents in memory at the
same time, each being processed, and lots of fun dealing with dependencies
between them. I realize that FOP doesn't handle this scenario, but thinking
through the solution to this problem might provide a general solution for
both issues.

FrameMaker handles this issue very nicely. You actually have to have all 21
documents open at the same time (although I doubt that the entire document
is loaded into memory). However, FrameMaker lives in a state where its
layout is already computed and stored, so all it has to do is look up the
page object related to a target. It seems that what is needed for FOP (and
probably XSL-FO in general) is an XML document that contains this target
information (and perhaps other formatting-specific information as well). FOP
would have to run once to compute the targets & write this XML target file,
then run again to use it. This XML document can be serialized and persisted,
so that it can be used in the rendering of other documents that might point
to its parent. The user would be responsible for making sure that the
XML document containing the targets is properly updated before being used.
(FOP would need a mode that, instead of rendering the document, merely
recomputes the target file -- thus a series of documents could be recomputed
before any of them is rendered).
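
To make the target-file idea concrete, here is a minimal sketch in
present-day Java of writing such a file after one run and reading it back
in a later run. The class name TargetFile and the element and attribute
names (targets, target, id, page) are assumptions made for illustration
only; none of this is existing FOP code or a proposed file format.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: persists an "id -> page number" map as XML.
final class TargetFile {
    private TargetFile() {}

    // First run: write the targets computed by layout to a file.
    static void write(Map<String, Integer> targets, Path file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("targets");   // assumed element name
        doc.appendChild(root);
        for (Map.Entry<String, Integer> e : targets.entrySet()) {
            Element t = doc.createElement("target");   // assumed element name
            t.setAttribute("id", e.getKey());
            t.setAttribute("page", Integer.toString(e.getValue()));
            root.appendChild(t);
        }
        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.INDENT, "yes");
        try (OutputStream out = Files.newOutputStream(file)) {
            tf.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    // Later run: load the targets so references can be resolved by lookup.
    static Map<String, Integer> read(Path file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc;
        try (InputStream in = Files.newInputStream(file)) {
            doc = builder.parse(in);
        }
        NodeList nodes = doc.getElementsByTagName("target");
        Map<String, Integer> targets = new LinkedHashMap<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element t = (Element) nodes.item(i);
            targets.put(t.getAttribute("id"), Integer.parseInt(t.getAttribute("page")));
        }
        return targets;
    }
}
```

A run that only recomputes targets would call TargetFile.write(...) and
skip rendering; a rendering run for another document in the collection
would call TargetFile.read(...) on each file it depends on.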

Obvious issue #1: The content of the target can change the layout of the
document, and thereby change the location of the target. The layout manager
must explicitly or implicitly estimate the size of the text so that it can
continue with the layout work on subsequent pages. This issue exists whether
the information is stored in memory or written to a file, so this is not
really an issue with the proposed solution.

Obvious issue #2: Some (perhaps most) documents will not benefit from a
2-pass solution and persistent layout information. This could be handled with
a configuration option (for defaults) and a command-line option (for
overrides).

Obvious issue #3: Development time. I am not up to speed on the FOP
implementation yet, so I don't have a good feel for this issue. However, I
see that performance is one of the goals of the redesign, and this might
help with that in some cases.

Obvious issue #4: Standards compliance / extensions. This or something
similar that addresses cross-document references is probably needed as part
of the XSL-FO standard, but I do not know whether or when it will be
addressed at that level, or how far ahead of that process FOP wants to get.
As far as the current standard is concerned, I think it could be considered
an implementation detail.

I apologize if this has already been discussed / coded, etc. I haven't
stumbled across it yet.

Victor Mote (mailto:vic@outfitr.com)
Enterprise Outfitters (www.outfitr.com)
2025 Eddington Way
Colorado Springs, Colorado 80916
Voice 719-622-0650, Fax 720-293-0044


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


RE: forward references & related design issues

Posted by "Matthew L. Avizinis" <ml...@gleim.com>.
This issue is at least partially addressed by the code I posted earlier
today for my FOP extension element fop:WritePageNumber, which I currently use
to create a table of contents and index for PDF books.
The books are processed through FOP one "section" (e.g. Introduction,
Preface, Chapter 1) at a time.  In each section numerous WritePageNumber
elements are inserted with associated ids.  When FOP encounters these
elements, an XML file is created (or appended to) that records each
element, its id, and the page number it occurs on.

After all sections are processed, I can then generate the Table of Contents
and Index from information in the XML file just created.
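
As a rough illustration of that last step, the sketch below builds
plain-text table-of-contents lines from two lookups: id to title and id to
page number. The class TocBuilder, its parameters, and the dot-leader
layout are all assumptions for the example; the real page-number file and
the way it is turned into a TOC are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: joins titles with recorded page numbers.
final class TocBuilder {
    private TocBuilder() {}

    // titlesById should iterate in document order (e.g. a LinkedHashMap).
    // width is the target line length in characters.
    static List<String> build(Map<String, String> titlesById,
                              Map<String, Integer> pagesById,
                              int width) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : titlesById.entrySet()) {
            Integer page = pagesById.get(e.getKey());
            if (page == null) {
                continue; // id was never recorded; leave it out of the TOC
            }
            String title = e.getValue();
            String number = page.toString();
            // Fill the gap with dots, but always emit at least three.
            int dots = Math.max(3, width - title.length() - number.length() - 2);
            StringBuilder line = new StringBuilder(title).append(' ');
            for (int i = 0; i < dots; i++) {
                line.append('.');
            }
            line.append(' ').append(number);
            lines.add(line.toString());
        }
        return lines;
    }
}
```

An index could be produced the same way by sorting the titles
alphabetically before building the lines.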

For other references across sections, a two-pass system would work: when
"see page x in Document B" is encountered, a dummy value is inserted to
keep the formatting relatively stable during the first pass.  Then, after all
sections are processed, a second pass could grab the page number values from
the XML file with the page numbers.  This process could be made better in
the case where Document B has been processed "before" the current reference
to it, if I could figure out how to create a valid XML file from the start
and merely insert the child elements into it.  You could then check to see
whether the id already exists and, if so, insert the page number; if not, wait
until the second pass.
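
The check described above might look something like the following sketch.
CrossRefResolver and the "000" placeholder are hypothetical names and
values chosen for the example; the idea is only that a known id yields its
page number right away, while an unknown id yields a dummy value and is
remembered for the second pass.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: resolves cross-references against known page numbers.
final class CrossRefResolver {
    // Dummy value that reserves roughly the right width in the first pass.
    static final String PLACEHOLDER = "000";

    private final Map<String, Integer> knownPages;
    private final Set<String> unresolved = new LinkedHashSet<>();

    CrossRefResolver(Map<String, Integer> knownPages) {
        this.knownPages = knownPages;
    }

    // Returns the page number if the id is already recorded; otherwise
    // returns the placeholder and notes the id for the second pass.
    String pageText(String id) {
        Integer page = knownPages.get(id);
        if (page != null) {
            return page.toString();
        }
        unresolved.add(id);
        return PLACEHOLDER;
    }

    // True if at least one reference had to fall back to the placeholder.
    boolean needsSecondPass() {
        return !unresolved.isEmpty();
    }

    Set<String> unresolvedIds() {
        return Collections.unmodifiableSet(unresolved);
    }
}
```

If needsSecondPass() is false after a section is processed, that section
would not have to be run again for its cross-references.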

This process works for TOCs and indexes, so I don't see why it couldn't be
generalized some more.
Anyhow, that's my 2¢ for what it's worth,
   Matthew L. Avizinis <ma...@gleim.com>
Gleim Publications, Inc.
   4201 NW 95th Blvd.
 Gainesville, FL 32606
(352)-375-0772
      www.gleim.com <http://www.gleim.com>




