Posted to fop-commits@xmlgraphics.apache.org by vm...@apache.org on 2003/04/24 00:33:04 UTC
cvs commit: xml-fop/src/documentation/content/xdocs/design/understanding book.xml images.xml pdf_library.xml svg.xml xml_parsing.xml
vmote 2003/04/23 15:33:02
Modified: src/documentation/content/xdocs/design book.xml
Added: src/documentation/content/xdocs/design images.xml
parsing.xml pdf-library.xml svg.xml
Removed: src/documentation/content/xdocs/design/understanding
book.xml images.xml pdf_library.xml svg.xml
xml_parsing.xml
Log:
Move remaining design/understanding content to design.
Revision Changes Path
1.18 +4 -9 xml-fop/src/documentation/content/xdocs/design/book.xml
Index: book.xml
===================================================================
RCS file: /home/cvs/xml-fop/src/documentation/content/xdocs/design/book.xml,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -r1.17 -r1.18
--- book.xml 22 Apr 2003 06:12:51 -0000 1.17
+++ book.xml 23 Apr 2003 22:33:02 -0000 1.18
@@ -2,11 +2,6 @@
<!DOCTYPE book PUBLIC "-//APACHE//DTD Cocoon Documentation Book V1.0//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-cocoon2/src/documentation/xdocs/dtd/book-cocoon-v10.dtd">
-<!--
-IF YOU MAKE CHANGES TO THIS FILE, PLEASE MAKE CORRESPONDING CHANGES TO
-understanding/book.xml, WHICH SEE FOR AN EXPLANATION.
--->
-
<book software="FOP"
title="FOP Design"
copyright="@year@ The Apache Software Foundation"
@@ -16,7 +11,7 @@
<menu-item label="Introduction" href="index.html"/>
</menu>
<menu label="Compliance">
- <menu-item label="XML Parsing" href="understanding/xml_parsing.html"/>
+ <menu-item label="XML Parsing" href="parsing.html"/>
<menu-item label="FO Tree" href="fotree.html"/>
<menu-item label="Properties" href="properties.html"/>
<menu-item label="Layout" href="layout.html"/>
@@ -24,9 +19,9 @@
<menu-item label="Renderers" href="renderers.html"/>
</menu>
<menu label="Extras">
- <menu-item label="Images" href="understanding/images.html"/>
- <menu-item label="PDF Library" href="understanding/pdf_library.html"/>
- <menu-item label="SVG" href="understanding/svg.html"/>
+ <menu-item label="Images" href="images.html"/>
+ <menu-item label="PDF Library" href="pdf-library.html"/>
+ <menu-item label="SVG" href="svg.html"/>
</menu>
<menu label="Miscellaneous">
<menu-item label="Embedding" href="embedding.html"/>
1.1 xml-fop/src/documentation/content/xdocs/design/images.xml
Index: images.xml
===================================================================
<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
<header>
<title>Images</title>
</header>
<body>
<section>
<title>Images in FOP</title>
<p>An image only needs to be loaded when it is rendered to the
output or when its dimensions are needed.<br/>
An image URL may be invalid, and finding this out can be costly, so we need to
keep a list of invalid image URLs.</p>
<p>A number of different caching schemes are possible.</p>
<p>All images are referred to using the URL given in the XSL:FO after
removing the "url('')" wrapping. The URL is used as given, without
any sort of resolving such as relative -> absolute. The
external graphic in the FO Tree and the image area in the Area Tree only
have the URL as a reference.
The images are handled through a static interface in ImageFactory.</p>
<section>
<title>Threading</title>
<p>In the single-threaded case with one document, the image should be released
as soon as the renderer caches it. If there are multiple documents, then
the images could be held in a weak cache in case another document needs to
load the same image.</p>
<p>In the multi-threaded case, many threads could be attempting to get the same
image. We need to make sure that an image is only loaded once at a
particular time. Once a particular document is finished, we can move
all of its images to a common weak cache.</p>
</section>
<section>
<title>Caches</title>
<section>
<title>LRU</title>
<p>All images are in a common cache regardless of context. To limit the size
of the cache, the least recently used (LRU) image is removed to keep the amount of memory used
low. Each image can supply the amount of data held in memory.</p>
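<p>The LRU scheme above could be sketched as follows. This is a minimal illustration, not FOP's implementation: the class name <code>LruImageCacheSketch</code>, the byte arrays standing in for image data and the byte budget are all assumptions made for the example.</p>
<source><![CDATA[
```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an LRU image cache bounded by a byte budget. */
public class LruImageCacheSketch {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> images =
            new LinkedHashMap<>(16, 0.75f, true);

    public LruImageCacheSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Adds image data under its URL, then evicts LRU entries while over budget. */
    public synchronized void put(String url, byte[] data) {
        byte[] previous = images.put(url, data);
        if (previous != null) {
            usedBytes -= previous.length;
        }
        usedBytes += data.length;
        Iterator<Map.Entry<String, byte[]>> it = images.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    /** Returns the cached data (marking it as recently used), or null. */
    public synchronized byte[] get(String url) {
        return images.get(url);
    }

    public synchronized boolean contains(String url) {
        return images.containsKey(url);
    }

    public synchronized long usedBytes() {
        return usedBytes;
    }

    public static void main(String[] args) {
        LruImageCacheSketch cache = new LruImageCacheSketch(10);
        cache.put("a.png", new byte[4]);
        cache.put("b.png", new byte[4]);
        cache.get("a.png");              // a.png is now more recently used than b.png
        cache.put("c.png", new byte[4]); // 12 bytes exceed the budget, so b.png goes
        System.out.println("b.png cached: " + cache.contains("b.png"));
    }
}
```
]]></source>
<p>Tracking bytes rather than entry counts follows the remark that each image can supply the amount of data it holds; a count-based limit would be simpler but would treat a small gif and a large tiff alike.</p>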
</section>
<section>
<title>Context</title>
<p>Images are cached according to the context, using the FOUserAgent as a key.
Once the context is finished, the images are added to a common weak hashmap
so that other contexts can load these images, or the data can be garbage
collected if required.</p>
<p>If images are to be used commonly, then we cannot dispose of data in the
FopImage when cached by the renderer. Also, if different contexts have
different base directories for resolving relative URLs, then the loading
and caching must be separate. We can have a cache that shares images among
all contexts or one that only loads an image for a single context.</p>
</section>
<p>The cache uses an image loader so that it can synchronize the image
loading on an image-by-image basis. Finding and adding an image loader to
the cache is also synchronized to prevent threading problems.</p>
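<p>One way to picture the loader arrangement above is the following sketch. It is an assumption-laden illustration rather than FOP code: <code>LoadOnceSketch</code>, its nested <code>ImageLoader</code> and the <code>source</code> function are hypothetical names, and a concurrent map is used here for the synchronized find-or-add step.</p>
<source><![CDATA[
```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: each image is loaded at most once, with locking per image. */
public class LoadOnceSketch {
    /** One loader per URL; its monitor guards the loading of that image only. */
    private static final class ImageLoader {
        private final String url;
        private byte[] data;
        private boolean attempted;

        ImageLoader(String url) {
            this.url = url;
        }

        synchronized byte[] get(Function<String, byte[]> source) {
            if (!attempted) {
                attempted = true;
                data = source.apply(url);
            }
            return data;
        }
    }

    private final ConcurrentHashMap<String, ImageLoader> loaders = new ConcurrentHashMap<>();
    private final Function<String, byte[]> source;

    public LoadOnceSketch(Function<String, byte[]> source) {
        this.source = source;
    }

    public byte[] getImage(String url) {
        // Finding or adding the loader is atomic for a given URL.
        ImageLoader loader = loaders.computeIfAbsent(url, ImageLoader::new);
        // Loading blocks only the callers that ask for this same URL.
        return loader.get(source);
    }

    /** Demo helper: requests one URL from several threads and counts real loads. */
    public static int countLoads(int threads) {
        AtomicInteger loads = new AtomicInteger();
        LoadOnceSketch cache = new LoadOnceSketch(url -> {
            loads.incrementAndGet();
            return new byte[] {1, 2, 3};
        });
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> cache.getImage("logo.png"));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println("loads: " + countLoads(8));
    }
}
```
]]></source>
<p>Because the lock belongs to the individual loader, a slow image only delays the threads waiting for that image; requests for other URLs proceed independently.</p>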
</section>
<section>
<title>Invalid Images</title>
<p>
If an image cannot be loaded for some reason, for example because the URL is
invalid or the image data is corrupt or of an unknown type, then loading should
only be attempted once. All other attempts to get the image
should return null so that the failure can be easily handled.<br/>
This will prevent any extra processing or waiting.</p>
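<p>The behaviour described above might look like the following sketch. The class <code>InvalidImageListSketch</code> and its <code>loader</code> function are hypothetical stand-ins; the real handling lives behind ImageFactory and may differ.</p>
<source><![CDATA[
```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of an invalid-image list; not FOP's ImageFactory. */
public class InvalidImageListSketch {
    private final Set<String> invalidUrls = ConcurrentHashMap.newKeySet();
    private final Function<String, byte[]> loader;

    public InvalidImageListSketch(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    /** Returns the image data, or null if this URL failed now or on an earlier call. */
    public byte[] load(String url) {
        if (invalidUrls.contains(url)) {
            return null; // known to be bad: skip the costly attempt
        }
        try {
            byte[] data = loader.apply(url);
            if (data != null) {
                return data;
            }
        } catch (RuntimeException e) {
            // fall through and record the failure below
        }
        invalidUrls.add(url);
        return null;
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        InvalidImageListSketch images = new InvalidImageListSketch(url -> {
            attempts[0]++;
            throw new IllegalArgumentException("cannot load " + url);
        });
        images.load("missing.gif");
        images.load("missing.gif");
        System.out.println("attempts: " + attempts[0]);
    }
}
```
]]></source>
<p>Callers only ever have to check for null, whether the failure happened on this call or on an earlier one.</p>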
</section>
<section>
<title>Reading</title>
<p>Once a stream is opened for the image URL, a set of image readers is
used to determine what type of image it is. A reader can peek at the
image header or, if necessary, load the image. The reader can also get the
image size at this stage.
The reader can then provide the mime type, which is used to create the image object that
loads the rest of the information.</p>
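<p>Peeking at the header, as described above, can be done with mark/reset on the stream. The sketch below checks a few well-known file signatures; the class <code>ImageTypeSniffer</code> is a hypothetical name and the set of formats is only a sample, not the list of readers FOP ships.</p>
<source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: identify an image type by peeking at its header bytes. */
public class ImageTypeSniffer {
    /** Returns a mime type for a known signature, or null. The stream is reset. */
    public static String detectMimeType(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] header = new byte[8];
        int length = 0;
        try {
            in.mark(header.length);
            int n;
            while (length < header.length
                    && (n = in.read(header, length, header.length - length)) > 0) {
                length += n;
            }
            in.reset(); // the image object can now read from the beginning
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(header, length, 0x47, 0x49, 0x46, 0x38)) {
            return "image/gif";  // "GIF8"
        }
        if (startsWith(header, length, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "image/png";
        }
        if (startsWith(header, length, 0xFF, 0xD8, 0xFF)) {
            return "image/jpeg";
        }
        if (startsWith(header, length, 0x49, 0x49, 0x2A, 0x00)
                || startsWith(header, length, 0x4D, 0x4D, 0x00, 0x2A)) {
            return "image/tiff"; // little-endian or big-endian byte order
        }
        if (startsWith(header, length, 0x42, 0x4D)) {
            return "image/bmp";  // "BM"
        }
        return null;
    }

    /** Convenience overload for data that is already in memory. */
    public static String detectMimeType(byte[] data) {
        return detectMimeType(new ByteArrayInputStream(data));
    }

    private static boolean startsWith(byte[] buffer, int length, int... signature) {
        if (length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((buffer[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] gif = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(detectMimeType(gif));
    }
}
```
]]></source>
<p>Resetting the stream after the peek means the same stream can be handed on to whatever loads the rest of the image, so the URL does not have to be opened twice.</p>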
</section>
<section>
<title>Data</title>
<p>The data usually needed for an image is the size and either a bitmap or the
original data. Images such as jpeg and eps can be embedded into the
document with the original data. SVG images are converted into a DOM which
needs to be rendered to the PDF. Other images such as gif, tiff etc. are
converted into a bitmap.
Data is loaded by the FopImage by calling <code>load(type)</code>, where type is the type of data to load.</p>
</section>
<section>
<title>Rendering</title>
<p>Different renderers need to have the information in different forms.</p>
<section>
<title>PDF</title>
<dl><dt>original data</dt> <dd>JPG, EPS</dd>
<dt>bitmap</dt> <dd>gif, tiff, bmp, png</dd>
<dt>other</dt> <dd>SVG</dd></dl>
</section>
<section>
<title>PS</title>
<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
<dt>other</dt> <dd>SVG</dd></dl>
</section>
<section>
<title>awt</title>
<dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
<dt>other</dt> <dd>SVG</dd></dl>
</section>
<p>The renderer uses the url to retrieve the image from the ImageFactory and
then load the required data depending on the image mime type. If the
renderer can insert the image into the document and use that data for all
future references of the same image then it can cache the reference in the
renderer and the image can be released from the image cache.</p>
</section>
</section>
</body>
</document>
1.1 xml-fop/src/documentation/content/xdocs/design/parsing.xml
Index: parsing.xml
===================================================================
<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
<header>
<title>XML Parsing</title>
</header>
<body>
<section>
<title>XML Input</title>
<p>The XML document is always handled internally as SAX. The SAX events
are used to read the elements, attributes and text data of the FO document.
After the manipulation of the data, the renderer writes out the pages in the
appropriate format. It may write as it goes, a page at a time, or the whole
document at once. Once finished, the output document should contain all the data in the
chosen format, ready for use.</p>
<p>FOP can take the input XML in a number of ways:</p>
<ul>
<li><strong>SAX Events through SAX Handler</strong>: <code>FOTreeBuilder</code> is the SAX Handler which is obtained through <code>getContentHandler</code> on <code>Driver</code>.</li>
<li><strong>DOM (which is converted into SAX Events)</strong>: The conversion of a DOM tree is done via the <code>render(Document)</code> method on <code>Driver</code>.</li>
<li><strong>Data Source (which is parsed and converted into SAX Events)</strong>: The <code>Driver</code> can take an <code>InputSource</code> as input.
This can use a <code>Stream</code>, <code>String</code> etc.</li>
<li><strong>XML+XSLT Transformation (which is transformed using an XSLT Processor and the result is fired as SAX Events)</strong>: <code>XSLTInputHandler</code> is used as an <code>InputSource</code> in the <code>render(XMLReader, InputSource)</code> method on <code>Driver</code>.</li>
</ul>
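<p>The DOM and data source routes in the list above both end up as SAX events on one handler. The following sketch illustrates that idea with the standard JAXP classes only; it does not use FOP's <code>Driver</code> or <code>FOTreeBuilder</code>. <code>SaxInputSketch</code> and <code>RecordingHandler</code> are hypothetical names, and the recording handler merely stands in for the tree builder.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: two input routes feeding the same SAX handler type. */
public class SaxInputSketch {
    public static final String FO =
            "<fo:root xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">"
            + "<fo:layout-master-set/></fo:root>";

    /** Stand-in for the tree builder: records the element events it receives. */
    static final class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            events.add("end:" + qName);
        }
    }

    /** Data source route: parse the text and fire SAX events directly. */
    public static List<String> eventsFromSource(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            RecordingHandler handler = new RecordingHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** DOM route: build a DOM first, then replay it as SAX events. */
    public static List<String> eventsFromDom(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            RecordingHandler handler = new RecordingHandler();
            // An identity transform walks the DOM and fires SAX events on the handler.
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new SAXResult(handler));
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(eventsFromSource(FO));
        System.out.println(eventsFromDom(FO));
    }
}
```
]]></source>
<p>Both calls should report the same sequence of element events for the same input, which is why everything behind the handler can ignore where the XML came from.</p>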
<p>The SAX Events which are fired on the SAX Handler, class <code>FOTreeBuilder</code>, must represent an XSL:FO document.
If they do not, there will be an error.
Any problems with the XML not being well-formed are also handled here.</p>
</section>
<section>
<title>Element Mappings</title>
<p>The element mapping is a hashmap of all the elements in a particular namespace.
This makes it easy to create a different object for each element.
Element mappings are static to save on memory.</p>
<p>To add an extension a developer can put in the classpath a jar that contains the file <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>.
This must contain a line with the fully qualified name of a class that implements the <code>org.apache.fop.fo.ElementMapping</code> interface.
This will then be loaded automatically at the start.
Internal mappings are: FO, SVG and Extension (pdf bookmarks).</p>
</section>
<section>
<title>Tree Building</title>
<p>The SAX Events will fire all the information for the document with start element, end element, text data etc.
This information is used to build up a representation of the FO document.
To do this for a namespace there is a set of element mappings.
When a mapping for the element and namespace is found, an object can be created for that element.
If the element is not found then it creates a dummy object or a generic DOM for unknown namespaces.</p>
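<p>The lookup just described can be pictured as a map of maps. The sketch below is illustrative only: <code>ElementMappingSketch</code>, its <code>Node</code> class and the handful of registered elements are hypothetical and far smaller than FOP's real mappings.</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch of a namespace and element name lookup with fallbacks. */
public class ElementMappingSketch {
    public static final String FO_NS = "http://www.w3.org/1999/XSL/Format";
    public static final String SVG_NS = "http://www.w3.org/2000/svg";

    /** Stand-in for the objects created during tree building. */
    public static final class Node {
        public final String kind;

        Node(String kind) {
            this.kind = kind;
        }
    }

    // namespace URI -> (element name -> maker); static, so shared by all documents
    private static final Map<String, Map<String, Supplier<Node>>> MAPPINGS = new HashMap<>();

    static {
        Map<String, Supplier<Node>> fo = new HashMap<>();
        fo.put("root", () -> new Node("fo:root"));
        fo.put("block", () -> new Node("fo:block"));
        MAPPINGS.put(FO_NS, fo);

        Map<String, Supplier<Node>> svg = new HashMap<>();
        svg.put("svg", () -> new Node("svg:svg"));
        MAPPINGS.put(SVG_NS, svg);
    }

    /** Creates the object for an element, falling back as the text describes. */
    public static Node create(String namespace, String name) {
        Map<String, Supplier<Node>> table = MAPPINGS.get(namespace);
        if (table == null) {
            return new Node("generic DOM"); // unknown namespace
        }
        Supplier<Node> maker = table.get(name);
        return maker != null ? maker.get() : new Node("dummy"); // unknown element
    }

    public static void main(String[] args) {
        System.out.println(create(FO_NS, "block").kind);
        System.out.println(create(FO_NS, "no-such-element").kind);
        System.out.println(create("urn:example:other", "thing").kind);
    }
}
```
]]></source>
<p>Storing makers rather than instances keeps the shared static table small while still giving every element in the document its own object.</p>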
<p>The object is then set up and given the attributes for the element.
For the FO Tree the attributes are converted into properties.
The FO objects use a property list mapping to convert the attributes into a list of properties for the element.
For other XML, for example SVG, a DOM of the XML is constructed.
This DOM can then be passed through to the renderer.
Other element mappings can be used in different ways, for example to create elements that create areas during the layout process or to set up information for the renderer etc.</p>
<p>While the tree building is mainly about creating the FO Tree there are some stages that can propagate to the renderer.
At the end of a page sequence we know that all pages in the page sequence can be laid out without being affected by any further XML.
The significance of this is that the FO Tree for the page sequence may be able to be disposed of.
The end of the XML document also tells us that we can finalise the output document.
(The layout of individual pages is accomplished by the layout managers a page at a time; i.e. they do not need to wait for the end of the page sequence.
The page may not yet be complete, however, containing forward page number references, for example.)</p>
</section>
</body>
</document>
1.1 xml-fop/src/documentation/content/xdocs/design/pdf-library.xml
Index: pdf-library.xml
===================================================================
<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
<header>
<title>PDF Library</title>
</header>
<body>
<section>
<title>PDF Library</title>
<p>The PDF Library is an independent package of classes in FOP. These classes
provide a simple way to construct documents and add the contents. The
classes are found in <code>org.apache.fop.pdf.*</code>.</p>
<section>
<title>PDF Document</title>
<p>This is where most of the document is created and put together.</p>
<p>It sets up the header, trailer and resources. Each page is made and added to the document.
There are a number of methods that can be used to create/add certain PDF objects to the document.</p>
</section>
<section>
<title>Building PDF</title>
<p>The PDF Document is built by creating a page for each page in the Area Tree.</p>
<p> This page then has all the contents added.
The page is then added to the document and available objects can be written to the output stream.</p>
<p>The contents of the page are things such as text, lines, images etc.
The PDFRenderer inserts the text directly into a pdf stream.
The text consists of markup to set fonts, set text position and add text.</p>
<p>Most of the simple pdf markup is inserted directly into a pdf stream.
Other more complex objects or commonly used objects are added through java classes.
Some pdf objects, such as an image, consist of two parts.</p>
<p>An image has a separate object for the image data and another bit of markup to display the image in a certain position on the page.
</p><p>The java objects that represent a pdf object implement a method that returns the markup for inserting into a stream.
The method is: byte[] toPDF().</p>
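<p>As an illustration of the <code>toPDF()</code> idea, the sketch below builds the markup for a short piece of text using the standard PDF text operators. The class <code>PdfTextSketch</code> and its fields are hypothetical and are not part of <code>org.apache.fop.pdf</code>; only the method name and return type are taken from the text above.</p>
<source><![CDATA[
```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of an object that renders itself as pdf markup. */
public class PdfTextSketch {
    private final String fontKey;
    private final int fontSize;
    private final int x;
    private final int y;
    private final String text;

    public PdfTextSketch(String fontKey, int fontSize, int x, int y, String text) {
        this.fontKey = fontKey;
        this.fontSize = fontSize;
        this.x = x;
        this.y = y;
        this.text = text;
    }

    /** Returns the markup for inserting this text into a content stream. */
    public byte[] toPDF() {
        String markup = "BT\n"                          // begin text object
                + "/" + fontKey + " " + fontSize + " Tf\n" // set font and size
                + x + " " + y + " Td\n"                 // set text position
                + "(" + escape(text) + ") Tj\n"         // show the text
                + "ET\n";                               // end text object
        return markup.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Escapes the characters that are special inside a pdf literal string. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
    }

    public static void main(String[] args) {
        PdfTextSketch hello = new PdfTextSketch("F1", 12, 72, 720, "Hello (FOP)");
        System.out.print(new String(hello.toPDF(), StandardCharsets.ISO_8859_1));
    }
}
```
]]></source>
<p>Returning bytes rather than a string lets the caller write the result straight to the output stream and count its length, which a PDF writer needs for stream lengths and the cross-reference table.</p>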
</section>
<section>
<title>Features</title>
<section>
<title>Fonts</title>
<p>Support for embedding fonts and using the default Acrobat fonts.
</p></section>
<section>
<title>Images</title>
<p>Images can be inserted into a page. An image can either be inserted as a pixel map or, in the case of a jpeg image, inserted directly.
</p></section>
<section>
<title>Stream Filters</title>
<p>A number of filters are available to encode the pdf streams. These filters can compress the data or change it such as converting to hex.
</p></section>
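<p>The two kinds of stream filter mentioned above, one that compresses and one that converts to hex, can be sketched with the JDK alone. <code>StreamFilterSketch</code> is a hypothetical class and does not mirror the filter classes in the library; it only shows the encodings themselves.</p>
<source><![CDATA[
```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch of a hex filter and a compressing filter for pdf streams. */
public class StreamFilterSketch {
    /** ASCII hex encoding: two hex digits per byte, terminated by '>'. */
    public static String asciiHexEncode(byte[] data) {
        StringBuilder out = new StringBuilder(data.length * 2 + 1);
        for (byte b : data) {
            out.append(String.format("%02X", b & 0xFF));
        }
        return out.append('>').toString();
    }

    /** Flate encoding: zlib-format compression from java.util.zip. */
    public static byte[] flateEncode(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reverses flateEncode; included so the round trip can be checked. */
    public static byte[] flateDecode(byte[] data) {
        try (InflaterInputStream in =
                new InflaterInputStream(new ByteArrayInputStream(data))) {
            return in.readAllBytes(); // requires Java 9 or later
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] content = "BT (Hi) Tj ET".getBytes(StandardCharsets.US_ASCII);
        System.out.println(asciiHexEncode(content));
        System.out.println(flateEncode(content).length + " compressed bytes");
    }
}
```
]]></source>
<p>Filters of this kind can be chained, for example compressing first and hex-encoding the result, as long as the stream's filter list names them in the order a reader must undo them.</p>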
<section>
<title>Links</title>
<p>A pdf link can be added for an area on the page. This link can then point to an external destination or a position on any page in the document.
</p></section>
<section>
<title>Patterns</title>
<p>The fill and stroke of graphical objects can be set with a colour, pattern or gradient.
</p></section>
<p>There are a number of other features for handling pdf markup relevant to creating PDF files for FOP.</p>
</section>
</section>
</body>
</document>
1.1 xml-fop/src/documentation/content/xdocs/design/svg.xml
Index: svg.xml
===================================================================
<?xml version="1.0" standalone="no"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
"http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
<document>
<header>
<title>SVG</title>
</header>
<body>
<section>
<title>SVG</title>
<p>SVG is rendered through Batik.</p><p>The XML from the XSL:FO document
is converted into an SVG DOM with batik. This DOM is then set as the Document
on the Foreign Object area in the Area Tree.</p><p>This DOM is then available to
be rendered by the renderer.</p><p>SVG is rendered in the renderers via an XMLHandler in the FOUserAgent. This XML handler renders the
SVG by using batik. Batik converts the SVG DOM into an internal
structure that can be drawn into a Graphics2D. So for PDF we use a
PDFGraphics2D to draw into.</p><p>This creates the necessary PDF information to
create the SVG image in the PDF document.</p><p>Most of the work is done in the
PDFGraphics2D class. There are also a few bridges that are plugged into batik
to provide different behaviour for some SVG elements.</p>
<section>
<title>Text Drawing</title>
<p>Normally batik converts text into a set of curved
shapes. </p><p>These are handled like any other shapes when rendering to the output. This
is not always desirable as the shapes have very fine curves. This can cause the
output to look a bit bad in PDF and PS (it can be drawn properly but is not by
default). These curves also require much more data than the original
text.</p><p>To handle this there is a PDFTextElementBridge that is set when
using the bridge in batik. If the text is simple enough for the text to be
drawn in the PDF as with all other text then this sets the TextPainter to use
the PDFTextPainter. This inserts the text directly into the PDF using the drawString method on the PDFGraphics2D.</p><p>Text is considered simple if the
font is available, the font size is usable and there are no tspans or other
complications. This can make the resulting PDF significantly
smaller.</p>
</section>
<section>
<title>PDF Links</title>
<p>To support links in PDF another batik
element bridge is used. The PDFAElementBridge creates a PDFANode which inserts
a link into the PDF document via the PDFGraphics2D.</p><p>Since links are positioned on the page without any transforms, we need to transform the
coordinates of the link area so that they match the current position of the <code>a</code>
element area. This transform may also need to account for the svg being
positioned on the page.</p>
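<p>The coordinate adjustment described above amounts to applying the accumulated transforms to the link rectangle. A minimal sketch using the standard <code>java.awt.geom</code> classes follows; the class and parameter names are hypothetical, and the PDFANode may compute this differently.</p>
<source><![CDATA[
```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

/** Hypothetical sketch: map a link area from element space to page space. */
public class LinkTransformSketch {
    /**
     * Applies the element's own transform first and then the placement of the
     * svg on the page, returning the bounding box in page coordinates.
     */
    public static Rectangle2D toPageCoordinates(Rectangle2D linkArea,
                                                AffineTransform elementTransform,
                                                AffineTransform svgPlacement) {
        AffineTransform total = new AffineTransform(svgPlacement);
        // concatenate() makes elementTransform apply before svgPlacement.
        total.concatenate(elementTransform);
        return total.createTransformedShape(linkArea).getBounds2D();
    }

    public static void main(String[] args) {
        Rectangle2D area = new Rectangle2D.Double(0, 0, 10, 5);
        Rectangle2D onPage = toPageCoordinates(
                area,
                AffineTransform.getScaleInstance(2, 2),
                AffineTransform.getTranslateInstance(100, 200));
        System.out.println(onPage);
    }
}
```
]]></source>
<p>Taking the bounding box matters when the element transform includes a rotation, since a link area on the page is an axis-aligned rectangle.</p>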
</section>
<section>
<title>Images</title>
<p>Images are normally drawn
into the PDFGraphics2D. This then creates a bitmap of the image data that can
be inserted into the PDF document. </p><p>As PDF can support jpeg images, another
element bridge is used so that the jpeg can be directly inserted into the PDF.</p>
</section>
<section>
<title>PDF Transcoder</title>
<p>Batik provides a mechanism to
convert SVG into various formats. Through FOP we can convert an SVG document
into a single-page PDF document. The page contains the SVG drawn as well as
possible on the page. There is a PDFDocumentGraphics2D that creates a
standalone PDF document with a single page. This is then drawn into by batik in
the same way as with the PDFGraphics2D.</p>
</section>
<section>
<title>Other Outputs</title>
<p>When rendering to AWT the SVG is simply drawn onto the
awt canvas using batik.</p><p>The PS Renderer uses a technique similar to that of the
PDF Renderer.</p><p>The SVG Renderer simply embeds the SVG inside an svg
element.</p>
</section>
</section>
</body>
</document>