Posted to fop-dev@xmlgraphics.apache.org by Keiron Liddle <ke...@aftexsw.com> on 2002/02/07 08:51:57 UTC

Understanding FOP [1]

Welcome to the understanding series following Peter's suggestion. This
will be a series of notes for developers to understand how FOP works.
Questions should be asked. We (yes others too) will attempt to clarify the
processes involved to go from xml(fo) to pdf or other formats. Some areas
will get more complicated as we proceed.

Introduction
------------

FOP takes an xml file, does its magic, and then writes a document to a
stream.
xml -> [FOP] -> document
The document could be pdf, ps etc. or directed to a printer or the screen.
The principle remains the same.

The xml document must be in the XSL-FO format. For convenience we provide
a mechanism to handle XML+XSL as input.
The xml document is always handled internally as SAX. The SAX events are
used to read the elements, attributes and text data of the FO document.
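
To make the SAX side concrete, here is a minimal sketch of a handler that
receives the elements, attributes and text of an FO document as events. It
uses only the standard JAXP/SAX classes. FoEventLogger and its parse() helper
are hypothetical names made up for this note, not FOP code; the real handler
would build FO Tree nodes rather than log strings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration only; not part of FOP.
class FoEventLogger extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // characters() may be called several times for one text run, so buffer it
    // and emit a single "text" event at the next element boundary.
    private void flushText() {
        String t = text.toString().trim();
        if (!t.isEmpty()) {
            events.add("text " + t);
        }
        text.setLength(0);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        StringBuilder sb = new StringBuilder("start ").append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
        }
        events.add(sb.toString());
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end " + qName);
    }

    // Parses an FO document held in a string with the default
    // (non-namespace-aware) parser and returns the recorded events.
    static List<String> parse(String xml) {
        try {
            FoEventLogger logger = new FoEventLogger();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), logger);
            return logger.events;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse FO input", e);
        }
    }
}
```

Calling FoEventLogger.parse() on a single fo:block containing "Hello" gives a
start event carrying the attributes, one text event, and an end event, in
that order.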

After the manipulation of the data the renderer writes out the pages in
the appropriate format. It may write as it goes, a page at a time or the
whole document at once. Once finished the document should contain all the
data in the chosen format ready for whatever use.

Stages
------

The fo data goes through a few stages. Each piece of data will generally
go through the process in the same way but some information may be used a
number of times or in a different order. To reduce memory use, one stage
will start before the previous one has completed.
SAX Handler -> FO Tree -> Layout Managers -> Area Tree -> Render -> 
document

In the case of rtf, mif etc.
SAX Handler -> FO Tree -> Structure Listener -> document

The FO Tree is constructed from the xml document. It is an internal 
representation of the xml document and it is like a DOM with some 
differences.
The Layout Managers use the FO Tree to do their layout and create an 
Area Tree. The Area Tree is a representation of the final result. It is a 
representation of a set of pages containing the text and other graphics.
The Area Tree is then given to a Renderer. The Renderer can read the Area 
Tree and convert the information into the render format. For example the 
PDF Renderer creates a PDF Document. For each page in the Area Tree the 
renderer creates a PDF Page and places the contents of the page into the 
PDF Page. Once a PDF Page is complete then it can be written to the output 
stream.
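
The page-at-a-time idea can be sketched roughly as below. All the names
(PageArea, PageRenderer, TextPageRenderer, renderAll) are hypothetical
stand-ins invented for this note, and plain text is used in place of PDF to
keep it short. The point is only that each page is written to the output
stream as soon as it is complete, so the renderer never has to hold the
whole document.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical stand-in for one page of the Area Tree.
final class PageArea {
    final int number;
    final List<String> lines;

    PageArea(int number, List<String> lines) {
        this.number = number;
        this.lines = lines;
    }
}

// Hypothetical renderer contract: one call per completed page.
interface PageRenderer {
    void renderPage(PageArea page) throws IOException;
}

// Writes each page to the stream immediately instead of collecting pages.
final class TextPageRenderer implements PageRenderer {
    private final OutputStream out;

    TextPageRenderer(OutputStream out) {
        this.out = out;
    }

    @Override
    public void renderPage(PageArea page) throws IOException {
        StringBuilder sb = new StringBuilder("== page " + page.number + " ==\n");
        for (String line : page.lines) {
            sb.append(line).append('\n');
        }
        out.write(sb.toString().getBytes(StandardCharsets.UTF_8));
        out.flush(); // the page is complete, so it can leave memory now
    }

    // Convenience for trying the sketch: renders pages into a string.
    static String renderAll(List<PageArea> pages) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            PageRenderer renderer = new TextPageRenderer(buffer);
            for (PageArea page : pages) {
                renderer.renderPage(page);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A real PDF renderer would also have to deal with cross-references between
pages, which this sketch ignores.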

For the structure documents the Structure Listener will read directly from 
the FO Tree and create the document. These documents do not need the 
layout process or the Area Tree.

Associated Tasks
----------------

Verify Structure Listener concept.


Further Topics
--------------
XML parsing
FO Tree
Properties
Layout Managers
Layout Process
Handling Attributes
Area Tree
Renderers
Images
PDF Library
SVG
...


-----------------------------

Questions are welcome. Please stick to the topic and leave the details for 
later; this is an introduction.



---------------------------------------------------------------------
To unsubscribe, e-mail: fop-dev-unsubscribe@xml.apache.org
For additional commands, email: fop-dev-help@xml.apache.org


Re: Understanding FOP [1] (StructureRenderer)

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.
On Tuesday 12 February 2002 09:21, Keiron Liddle wrote:
>. . .
> I would say the StructureListener and the LayoutManagers are
> equivalent in terms of using the FO Tree and performing tasks with
> certain parts of the FO Tree. Maybe Structure Renderer is a better
> term.

I agree that StructureRenderer sounds good for objects that process the 
FO Tree.

Might mean renaming (at least in discussions of concepts) the current 
"renderers" to PrintRenderer or PageRenderer?

- Bertrand



Re: Understanding FOP [1]

Posted by Keiron Liddle <ke...@aftexsw.com>.
On 2002.02.11 16:44 Bertrand Delacretaz wrote:
> On Monday 11 February 2002 16:04, Peter B. West wrote:
> > . . .
> > A good way to build pipelines is with a series of pipes with
> > nice orderly linear data flows.  Then snapping off a LayoutManager
> > and clicking in a StructureListener is trivial.  In any case, surely
> > the output from the FO Tree builder should be identical, whatever is
> > downstream?
> 
> I think so too. Maybe introducing an extra step in Keiron's diagram
> would help, from:
> 
> > > SAX Handler -> FO Tree -> Layout Managers -> Area Tree -> Render ->
> > > document
> 
> to
> SAX Handler -> FO Tree -> StructureListener -> Layout Managers -> etc.
> 
> Saying that LayoutManager "is a" StructureListener (whatever we call
> it) would make things clearer IMHO.  Could maybe just be a matter of
> renaming or commenting a few things in LayoutManager to make it more
> general (or make it clearer that it's already general enough)?
> 
> - Bertrand

For the steps that I put in, all the data passes through those steps. The 
Structure Listener probably only handles some parts of the data.
I would say the StructureListener and the LayoutManagers are equivalent in 
terms of using the FO Tree and performing tasks with certain parts of the 
FO Tree. Maybe Structure Renderer is a better term.



Re: Understanding FOP [1]

Posted by Bertrand Delacretaz <bd...@codeconsult.ch>.
On Monday 11 February 2002 16:04, Peter B. West wrote:
> . . .
> A good way to build pipelines is with a series of pipes with
> nice orderly linear data flows.  Then snapping off a LayoutManager
> and clicking in a StructureListener is trivial.  In any case, surely
> the output from the FO Tree builder should be identical, whatever is
> downstream?

I think so too. Maybe introducing an extra step in Keiron's diagram 
would help, from:

> > SAX Handler -> FO Tree -> Layout Managers -> Area Tree -> Render ->
> > document

to
SAX Handler -> FO Tree -> StructureListener -> Layout Managers -> etc.

Saying that LayoutManager "is a" StructureListener (whatever we call 
it) would make things clearer IMHO.  Could maybe just be a matter of 
renaming or commenting a few things in LayoutManager to make it more 
general (or make it clearer that it's already general enough)?

- Bertrand



Re: Understanding FOP [1]

Posted by Keiron Liddle <ke...@aftexsw.com>.
On 2002.02.11 16:04 Peter B. West wrote:
> This is probably for later, too, but I would like to see the plumbing 
> discussed, as a whole, in more detail.  For example, in terms of the 
> existing version, a diagram of nested method calls illustrating the way 
> in which these pipes `->' are realised in practice.  There is probably 
> some clean crisp way to do this in UML, with which I am unfamiliar.

It is something for later. There is a UML diagram that shows method calls 
that may be useful; the problem is that in practice the calls can become 
convoluted.



Re: Understanding FOP [1]

Posted by "Peter B. West" <pb...@powerup.com.au>.
Keiron,

Keiron Liddle wrote:
...

> The xml document must be in the XSL:FO format. For convenience we provide
> a mechanism to handle XML+XSL as input.
> The xml document is always handled internally as SAX. The SAX events are
> used to read the elements, attributes and text data of the FO document.

...

>
> Stages
> ------
>
> The fo data goes through a few stages. Each piece of data will generally
> go through the process in the same way but some information may be used a
> number of times or in a different order. To reduce memory one stage will
> start before the previous is completed.
> SAX Handler -> FO Tree -> Layout Managers -> Area Tree -> Render -> 
> document
>
> In the case of rtf, mif etc.
> SAX Handler -> FO Tree -> Structure Listener -> document

This is probably for later, too, but I would like to see the plumbing 
discussed, as a whole, in more detail.  For example, in terms of the 
existing version, a diagram of nested method calls illustrating the way 
in which these pipes `->' are realised in practice.  There is probably 
some clean crisp way to do this in UML, with which I am unfamiliar.

>
> The FO Tree is constructed from the xml document. It is an internal 
> representation of the xml document and it is like a DOM with some 
> differences.
> The Layout Managers use the FO Tree do their layout stuff and create 
> an Area Tree. The Area Tree is a representation of the final result. 
> It is a representation of a set of pages containing the text and other 
> graphics.
> The Area Tree is then given to a Renderer. The Renderer can read the 
> Area Tree and convert the information into the render format. For 
> example the PDF Renderer creates a PDF Document. For each page in the 
> Area Tree the renderer creates a PDF Page and places the contents of 
> the page into the PDF Page. Once a PDF Page is complete then it can be 
> written to the output stream.
>
> For the structure documents the Structure listener will read directly 
> from the FO Tree and create the document. These documents do not need 
> the layout process or the Area Tree. 


I've noticed your discussion with Bertrand about StructureListeners. 
Forgive me if I continue to nag about this, but it seems to me that the 
parallelism in the pipeline militates against "reading directly from the 
FO Tree."  The LayoutManagers and the StructureListener may need to be 
able to access the FO Tree directly, but the downstream processes in a 
pipeline do need to receive notifications of upstream events.  A good 
way to build pipelines is with a series of pipes with nice orderly 
linear data flows.  Then snapping off a LayoutManager and clicking in a 
StructureListener is trivial.  In any case, surely the output from the 
FO Tree builder should be identical, whatever is downstream?
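
As a rough sketch of what "snapping off" one consumer and "clicking in"
another might look like, assume the FO Tree builder pushes notifications to
a single downstream interface. Every name here (FoTreeListener,
FoTreeBuilderSketch, StructureSketch, LayoutSketch) is hypothetical and made
up for the sake of argument; this is not how the existing code is organised.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical notification interface emitted by the FO Tree builder.
interface FoTreeListener {
    void startBlock();
    void text(String chars);
    void endBlock();
}

// Hypothetical builder: sends the same events whatever is downstream.
final class FoTreeBuilderSketch {
    private final FoTreeListener downstream;

    FoTreeBuilderSketch(FoTreeListener downstream) {
        this.downstream = downstream;
    }

    // Each paragraph stands in for one fo:block in the input.
    void build(String... paragraphs) {
        for (String p : paragraphs) {
            downstream.startBlock();
            downstream.text(p);
            downstream.endBlock();
        }
    }
}

// Structure-style consumer: writes the document directly, no layout.
final class StructureSketch implements FoTreeListener {
    final StringBuilder out = new StringBuilder();

    @Override public void startBlock() { out.append("<p>"); }
    @Override public void text(String chars) { out.append(chars); }
    @Override public void endBlock() { out.append("</p>"); }
}

// Layout-style consumer: turns each block into an area for later rendering.
final class LayoutSketch implements FoTreeListener {
    final List<String> areas = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    @Override public void startBlock() { current.setLength(0); }
    @Override public void text(String chars) { current.append(chars); }
    @Override public void endBlock() { areas.add("block-area[" + current + "]"); }
}
```

Swapping new FoTreeBuilderSketch(new StructureSketch()) for
new FoTreeBuilderSketch(new LayoutSketch()) changes nothing upstream, which
is the property I am arguing for.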


Peter

