Posted to fop-commits@xmlgraphics.apache.org by vm...@apache.org on 2003/04/24 00:33:04 UTC

cvs commit: xml-fop/src/documentation/content/xdocs/design/understanding book.xml images.xml pdf_library.xml svg.xml xml_parsing.xml

vmote       2003/04/23 15:33:02

  Modified:    src/documentation/content/xdocs/design book.xml
  Added:       src/documentation/content/xdocs/design images.xml
                        parsing.xml pdf-library.xml svg.xml
  Removed:     src/documentation/content/xdocs/design/understanding
                        book.xml images.xml pdf_library.xml svg.xml
                        xml_parsing.xml
  Log:
  Move remaining design/understanding content to design.
  
  Revision  Changes    Path
  1.18      +4 -9      xml-fop/src/documentation/content/xdocs/design/book.xml
  
  Index: book.xml
  ===================================================================
  RCS file: /home/cvs/xml-fop/src/documentation/content/xdocs/design/book.xml,v
  retrieving revision 1.17
  retrieving revision 1.18
  diff -u -r1.17 -r1.18
  --- book.xml	22 Apr 2003 06:12:51 -0000	1.17
  +++ book.xml	23 Apr 2003 22:33:02 -0000	1.18
  @@ -2,11 +2,6 @@
   <!DOCTYPE book PUBLIC "-//APACHE//DTD Cocoon Documentation Book V1.0//EN"
       "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-cocoon2/src/documentation/xdocs/dtd/book-cocoon-v10.dtd">
   
  -<!--
  -IF YOU MAKE CHANGES TO THIS FILE, PLEASE MAKE CORRESPONDING CHANGES TO
  -understanding/book.xml, WHICH SEE FOR AN EXPLANATION.
  --->
  -
   <book software="FOP"
       title="FOP Design"
       copyright="@year@ The Apache Software Foundation"
  @@ -16,7 +11,7 @@
         <menu-item label="Introduction" href="index.html"/>
       </menu>
       <menu label="Compliance">
  -      <menu-item label="XML Parsing" href="understanding/xml_parsing.html"/>
  +      <menu-item label="XML Parsing" href="parsing.html"/>
         <menu-item label="FO Tree" href="fotree.html"/>
         <menu-item label="Properties" href="properties.html"/>
         <menu-item label="Layout" href="layout.html"/>
  @@ -24,9 +19,9 @@
         <menu-item label="Renderers" href="renderers.html"/>
       </menu>
       <menu label="Extras">
  -      <menu-item label="Images" href="understanding/images.html"/>
  -      <menu-item label="PDF Library" href="understanding/pdf_library.html"/>
  -      <menu-item label="SVG" href="understanding/svg.html"/>
  +      <menu-item label="Images" href="images.html"/>
  +      <menu-item label="PDF Library" href="pdf-library.html"/>
  +      <menu-item label="SVG" href="svg.html"/>
       </menu>
       <menu label="Miscellaneous">
         <menu-item label="Embedding" href="embedding.html"/>
  
  
  
  1.1                  xml-fop/src/documentation/content/xdocs/design/images.xml
  
  Index: images.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
      "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
  
  <document>
      <header>
          <title>Images</title>
      </header>
      <body>
  
    <section>
      <title>Images in FOP</title>
  
      <p>An image only needs to be loaded when it is rendered to the output or
  when its dimensions are needed.<br/>
  An image url may be invalid, and this can be costly to find out, so we need to
  keep a list of invalid image urls.</p>
  <p>We have a number of different caching schemes that are possible.</p>
  <p>All images are referred to using the url given in the XSL:FO after
  removing "url('')" wrapping. This does
  not include any sort of resolving such as relative -> absolute. The
  external graphic in the FO Tree and the image area in the Area Tree only
  have the url as a reference.
  The images are handled through a static interface in ImageFactory.</p>
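  <p>As a minimal sketch of the unwrapping step described above, the following
  strips a "url(...)" wrapper and any surrounding quotes, and does no resolving.
  The class and method names are hypothetical and are not taken from the FOP
  sources.</p>
  <source><![CDATA[
```java
final class ImageUrls {
    private ImageUrls() { }

    /** Strips a url(...) wrapper and optional quotes; other input is only trimmed. */
    static String unwrap(String href) {
        String s = href.trim();
        if (s.startsWith("url(") && s.endsWith(")")) {
            s = s.substring(4, s.length() - 1).trim();
            if (s.length() >= 2) {
                char first = s.charAt(0);
                char last = s.charAt(s.length() - 1);
                // Remove one matching pair of single or double quotes.
                if ((first == '\'' || first == '"') && first == last) {
                    s = s.substring(1, s.length() - 1);
                }
            }
        }
        return s; // still relative if it was relative: no resolving here
    }
}
```
  ]]></source>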
  
  <section>
    <title>Threading</title>
  
  <p>In a single threaded case with one document the image should be released
  as soon as the renderer caches it. If there are multiple documents then
  the images could be held in a weak cache in case another document needs to
  load the same image.</p>
  
  <p>In a multi threaded case many threads could be attempting to get the same
  image. We need to make sure an image will only be loaded once at a
  particular time. Once a particular document is finished then we can move
  all the images to a common weak cache.</p>
  </section>
  
  <section>
    <title>Caches</title>
  <section>
    <title>LRU</title>
  <p>All images are in a common cache regardless of context. To keep memory use
  low, the least recently used (LRU) image is removed when the cache grows too
  large. Each image can supply the amount of data it holds in memory.</p>
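  <p>One possible shape for such a cache is sketched below. It assumes the
  image data is a plain byte array whose length is the memory figure, and that
  the limit is a byte budget; both are simplifications for illustration, and
  the class name is hypothetical.</p>
  <source><![CDATA[
```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class LruImageCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder=true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    LruImageCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized byte[] get(String url) {
        return entries.get(url); // a hit marks the entry as recently used
    }

    synchronized void put(String url, byte[] data) {
        byte[] old = entries.put(url, data);
        if (old != null) {
            usedBytes -= old.length;
        }
        usedBytes += data.length;
        // Evict from the least recently used end, but never the entry just added.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && entries.size() > 1 && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            usedBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    synchronized boolean contains(String url) {
        return entries.containsKey(url); // does not affect the access order
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```
  ]]></source>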
  </section>
  
  <section>
    <title>Context</title>
  <p>Images are cached according to the context, using the FOUserAgent as a key.
  Once the context is finished the images are added to a common weak hashmap
  so that other contexts can load these images or the data will be garbage
  collected if required.</p>
  <p>If images are to be used commonly then we cannot dispose of data in the
  FopImage when cached by the renderer. Also if different contexts have
  different base directories for resolving relative urls then the loading
  and caching must be separate. We can have a cache that shares images among
  all contexts or only loads an image for a context.</p>
  </section>
  
  <p>The cache uses an image loader so that it can synchronize the image
  loading on an image by image basis. Finding and adding an image loader to
  the cache is also synchronized to prevent thread problems.</p>
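  <p>The per-image synchronization could look roughly like the sketch below.
  It uses java.util.concurrent, which postdates this design, and a
  caller-supplied function in place of real image loading; the class names are
  hypothetical.</p>
  <source><![CDATA[
```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

final class LoadOnceCache {
    /** One loader per url; its monitor serializes loading of that image only. */
    private static final class Loader {
        private byte[] data;
        private boolean loaded;

        synchronized byte[] load(String url, Function<String, byte[]> source) {
            if (!loaded) {
                data = source.apply(url);
                loaded = true;
            }
            return data;
        }
    }

    private final ConcurrentMap<String, Loader> loaders = new ConcurrentHashMap<>();
    private final Function<String, byte[]> source;

    LoadOnceCache(Function<String, byte[]> source) {
        this.source = source;
    }

    byte[] get(String url) {
        // Finding or adding the loader is atomic; the load locks only that loader,
        // so threads asking for different images do not wait on each other.
        Loader loader = loaders.computeIfAbsent(url, key -> new Loader());
        return loader.load(url, source);
    }
}
```
  ]]></source>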
  </section>
  
  <section>
    <title>Invalid Images</title>
  
  <p>
  If an image cannot be loaded for some reason, for example because the url is
  invalid, the image data is corrupt or the type is unknown, then only one
  attempt should be made to load it. All other attempts to get the image
  should return null so that the failure can be easily handled.<br/>
  This will prevent any extra processing or waiting.</p>
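  <p>A rough sketch of this behaviour follows, assuming a loader function that
  returns null or throws when an image cannot be loaded. The names are
  hypothetical, and the sketch leaves out the per-image locking that a
  multi-threaded cache would also need.</p>
  <source><![CDATA[
```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class InvalidImageGuard {
    private final Set<String> invalidUrls = ConcurrentHashMap.newKeySet();
    private final Function<String, byte[]> loader;

    InvalidImageGuard(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    /** Returns the image data, or null if the url is known or found to be invalid. */
    byte[] get(String url) {
        if (invalidUrls.contains(url)) {
            return null; // already failed once: no second attempt
        }
        byte[] data;
        try {
            data = loader.apply(url);
        } catch (RuntimeException e) {
            data = null; // treat a failed load like a missing image
        }
        if (data == null) {
            invalidUrls.add(url);
        }
        return data;
    }
}
```
  ]]></source>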
  </section>
  
  <section>
    <title>Reading</title>
  <p>Once a stream is opened for the image url then a set of image readers is
  used to determine what type of image it is. The reader can peek at the
  image header or if necessary load the image. The reader can also get the
  image size at this stage.
  The reader then can provide the mime type to create the image object to
  load the rest of the information.</p>
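  <p>Peeking at the header can be sketched as below. The signatures checked
  are the well-known leading bytes of PNG, GIF, JPEG, BMP and TIFF files; the
  class is hypothetical, does not read the image size, and requires a stream
  that supports mark/reset so the stream is left untouched for the real
  loader.</p>
  <source><![CDATA[
```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ImageSniffer {
    private ImageSniffer() { }

    /** Peeks at the first bytes of a mark-capable stream and leaves it rewound. */
    static String sniffMimeType(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[8];
        int n = 0;
        try {
            in.mark(head.length);
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break; // stream shorter than the longest signature
                }
                n += r;
            }
            in.reset();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(head, n, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) return "image/png";
        if (startsWith(head, n, 'G', 'I', 'F', '8')) return "image/gif";
        if (startsWith(head, n, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (startsWith(head, n, 'B', 'M')) return "image/bmp";
        if (startsWith(head, n, 'I', 'I', 42, 0)
                || startsWith(head, n, 'M', 'M', 0, 42)) return "image/tiff";
        return null; // unknown type
    }

    private static boolean startsWith(byte[] head, int available, int... signature) {
        if (available < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((head[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```
  ]]></source>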
  </section>
  
  <section>
    <title>Data</title>
  
  <p>The data usually needed for an image is the size and either a bitmap or the
  original data. Images such as jpeg and eps can be embedded into the
  document with the original data. SVG images are converted into a DOM which
  needs to be rendered to the PDF. Other images such as gif, tiff etc. are
  converted into a bitmap.
  Data is loaded by the FopImage by calling load(type) where type is the type of data to load.</p>
  </section>
  
  
  <section>
    <title>Rendering</title>
  
  <p>Different renderers need to have the information in different forms.</p>
  
  
  <section>
    <title>PDF</title>
  <dl><dt>original data</dt>  <dd>JPG, EPS</dd>
  <dt>bitmap</dt>  <dd>gif, tiff, bmp, png</dd>
  <dt>other</dt>  <dd>SVG</dd></dl>
  </section>
  
  <section>
    <title>PS</title>
  <dl><dt>bitmap</dt>  <dd>JPG, gif, tiff, bmp, png</dd>
  <dt>other</dt> <dd>SVG</dd></dl>
  </section>
  
  <section>
    <title>awt</title>
  <dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
  <dt>other</dt>  <dd>SVG</dd></dl>
  </section>
  
  <p>The renderer uses the url to retrieve the image from the ImageFactory and
  then load the required data depending on the image mime type. If the
  renderer can insert the image into the document and use that data for all
  future references of the same image then it can cache the reference in the
  renderer and the image can be released from the image cache.</p>
  </section>
  </section>
  
      </body>
  </document>
  
  
  
  
  1.1                  xml-fop/src/documentation/content/xdocs/design/parsing.xml
  
  Index: parsing.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
      "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
  <document>
    <header>
      <title>XML Parsing</title>
    </header>
    <body>
      <section>
        <title>XML Input</title>
        <p>The xml document is always handled internally as SAX. The SAX events
         are used to read the elements, attributes and text data of the FO document.
         After the manipulation of the data the renderer writes out the pages in the
         appropriate format. It may write as it goes, a page at a time or the whole
         document at once. Once finished the document should contain all the data in the
         chosen format ready for whatever use.</p>
        <p>FOP can take the input XML in a number of ways:</p>
        <ul>
          <li><strong>SAX Events through SAX Handler</strong>: <code>FOTreeBuilder</code> is the SAX Handler which is obtained through <code>getContentHandler</code> on <code>Driver</code>.</li>
          <li><strong>DOM (which is converted into SAX Events)</strong>: The conversion of a DOM tree is done via the <code>render(Document)</code> method on <code>Driver</code>.</li>
          <li><strong>Data Source (which is parsed and converted into SAX Events)</strong>: The <code>Driver</code> can take an <code>InputSource</code> as input.
  This can use a <code>Stream</code>, <code>String</code> etc.</li>
          <li><strong>XML+XSLT Transformation (which is transformed using an XSLT Processor and the result is fired as SAX Events)</strong>: <code>XSLTInputHandler</code> is used as an <code>InputSource</code> in the <code>render(XMLReader, InputSource)</code> method on <code>Driver</code>.</li>
        </ul>
        <p>The SAX Events which are fired on the SAX Handler, class <code>FOTreeBuilder</code>, must represent an XSL:FO document.
  If not there will be an error.
  Any problems with the XML being well-formed are also handled here.</p>
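  <p>The idea that every kind of input ends up as SAX events on one handler
  can be illustrated with the standard JAXP classes alone. The sketch below
  does not use the FOP <code>Driver</code> API; the counting handler is a
  hypothetical stand-in for the tree builder, and the DOM case relies on an
  identity transform to replay the tree as SAX events.</p>
  <source><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SaxInputSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    /** Stand-in for the tree builder: counts start-element events in the FO namespace. */
    static final class CountingHandler extends DefaultHandler {
        int foElements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (FO_NS.equals(uri)) {
                foElements++;
            }
        }
    }

    /** Data source case: the text is parsed and the parser fires the events. */
    static int countFromSource(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            CountingHandler handler = new CountingHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.foElements;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** DOM case: an identity transform replays the tree as SAX events. */
    static int countFromDom(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            CountingHandler handler = new CountingHandler();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new SAXResult(handler));
            return handler.foElements;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```
  ]]></source>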
      </section>
      <section>
        <title>Element Mappings</title>
        <p>The element mapping is a hashmap of all the elements in a particular namespace.
  This makes it easy to create a different object for each element.
  Element mappings are static to save on memory.</p>
        <p>To add an extension a developer can put in the classpath a jar that contains the file <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>.
  This must contain a line with the fully qualified name of a class that implements the <em>org.apache.fop.fo.ElementMapping</em> interface.
  This will then be loaded automatically at the start.
  Internal mappings are: FO, SVG and Extension (pdf bookmarks).</p>
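  <p>Discovery of such service files might be sketched as follows. The
  handling of blank lines and of comments starting with "#" follows the usual
  JAR service provider convention and is an assumption here, as are the class
  and method names. Note that <code>ClassLoader.getResources</code> takes the
  resource name without a leading slash.</p>
  <source><![CDATA[
```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

final class ServiceFiles {
    static final String MAPPING_SERVICE =
            "META-INF/services/org.apache.fop.fo.ElementMapping";

    private ServiceFiles() { }

    /** Reads fully qualified class names, one per line (assumed: '#' starts a comment). */
    static List<String> readClassNames(Reader in) {
        List<String> names = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int hash = line.indexOf('#');
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Collects names from every matching service file visible to the class loader. */
    static List<String> discover(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<URL> files = loader.getResources(MAPPING_SERVICE);
            while (files.hasMoreElements()) {
                Reader in = new InputStreamReader(
                        files.nextElement().openStream(), StandardCharsets.UTF_8);
                names.addAll(readClassNames(in));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```
  ]]></source>
  <p>Each name returned could then be loaded with <code>Class.forName</code>
  and checked against the mapping interface before use.</p>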
      </section>
      <section>
        <title>Tree Building</title>
        <p>The SAX Events will fire all the information for the document with start element, end element, text data etc.
  This information is used to build up a representation of the FO document.
  To do this for a namespace there is a set of element mappings.
  When an element + namespace mapping is found then it can create an object for that element.
  If the element is not found then it creates a dummy object or a generic DOM for unknown namespaces.</p>
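  <p>The lookup described above might be sketched as a two-level hashmap from
  namespace and element name to a maker. The <code>Node</code> type and all
  names below are hypothetical simplifications; the real FO objects carry far
  more state.</p>
  <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class ElementTable {
    /** Minimal node type for the sketch. */
    static final class Node {
        final String kind;

        Node(String kind) {
            this.kind = kind;
        }
    }

    private final Map<String, Map<String, Supplier<Node>>> byNamespace = new HashMap<>();

    void register(String namespace, String localName, Supplier<Node> maker) {
        byNamespace.computeIfAbsent(namespace, ns -> new HashMap<>()).put(localName, maker);
    }

    Node create(String namespace, String localName) {
        Map<String, Supplier<Node>> table = byNamespace.get(namespace);
        if (table == null) {
            return new Node("unknown-namespace"); // stands in for the generic DOM
        }
        Supplier<Node> maker = table.get(localName);
        return maker != null ? maker.get() : new Node("dummy");
    }
}
```
  ]]></source>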
        <p>The object is then set up and given the attributes of the element.
  For the FO Tree the attributes are converted into properties.
  The FO objects use a property list mapping to convert the attributes into a list of properties for the element.
  For other XML, for example SVG, a DOM of the XML is constructed.
  This DOM can then be passed through to the renderer.
  Other element mappings can be used in different ways, for example to create elements that create areas during the layout process or set up information for the renderer etc.</p>
        <p>While the tree building is mainly about creating the FO Tree there are some stages that can propagate to the renderer.
  At the end of a page sequence we know that all pages in the page sequence can be laid out without being affected by any further XML.
  The significance of this is that the FO Tree for the page sequence may be able to be disposed of.
  The end of the XML document also tells us that we can finalise the output document.
  (The layout of individual pages is accomplished by the layout managers a page at a time; i.e. they do not need to wait for the end of the page sequence.
  The page may not yet be complete, however, containing forward page number references, for example.)</p>
      </section>
    </body>
  </document>
  
  
  
  1.1                  xml-fop/src/documentation/content/xdocs/design/pdf-library.xml
  
  Index: pdf-library.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
      "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
  
  <document>
      <header>
          <title>PDF Library</title>
      </header>
  
      <body>
  <section>
    <title>PDF Library</title>
  
  <p>The PDF Library is an independent package of classes in FOP. These classes
  provide a simple way to construct documents and add the contents. The
  classes are found in <code>org.apache.fop.pdf.*</code>.</p>
  
  <section>
    <title>PDF Document</title>
  <p>This is where most of the document is created and put together.</p>
  <p>It sets up the header, trailer and resources. Each page is made and added to the document.
  There are a number of methods that can be used to create/add certain PDF objects to the document.</p>
  </section>
  
  <section>
    <title>Building PDF</title>
  <p>The PDF Document is built by creating a page for each page in the Area Tree.</p>
  <p> This page then has all the contents added.
   The page is then added to the document and available objects can be written to the output stream.</p>
  <p>The contents of the page are things such as text, lines, images etc.
  The PDFRenderer inserts the text directly into a pdf stream.
  The text consists of markup to set fonts, set text position and add text.</p>
  <p>Most of the simple pdf markup is inserted directly into a pdf stream.
  Other more complex objects or commonly used objects are added through java classes.
  Some pdf objects, such as an image, consist of two parts.</p>
  <p>It has a separate object for the image data and another bit of markup to display the image in a certain position on the page.
  </p><p>The java objects that represent a pdf object implement a method that returns the markup for inserting into a stream.
  The method is: byte[] toPDF().</p>
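  <p>To illustrate the idea of an object that returns its own markup, here is
  a minimal sketch of a dictionary object written as an indirect PDF object.
  The class is hypothetical and much simpler than the classes in
  <code>org.apache.fop.pdf</code>; only the <code>toPDF()</code> method name
  is taken from the text above.</p>
  <source><![CDATA[
```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class SimplePdfObject {
    private final int number;
    // Insertion order is kept so the output is predictable.
    private final Map<String, String> entries = new LinkedHashMap<>();

    SimplePdfObject(int number) {
        this.number = number;
    }

    SimplePdfObject put(String key, String value) {
        entries.put(key, value);
        return this;
    }

    /** Returns the markup for this object, with generation number 0. */
    byte[] toPDF() {
        StringBuilder sb = new StringBuilder();
        sb.append(number).append(" 0 obj\n<<");
        for (Map.Entry<String, String> e : entries.entrySet()) {
            sb.append(" /").append(e.getKey()).append(' ').append(e.getValue());
        }
        sb.append(" >>\nendobj\n");
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```
  ]]></source>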
  
  </section>
  <section>
    <title>Features</title>
  
  <section>
    <title>Fonts</title>
  <p>Support for embedding fonts and using the default Acrobat fonts.
  </p></section>
  
  <section>
    <title>Images</title>
  <p>Images can be inserted into a page. An image can be inserted either as a pixel map or, for a jpeg image, directly in its original form.
  </p></section>
  
  <section>
    <title>Stream Filters</title>
  <p>A number of filters are available to encode the pdf streams. These filters can compress the data or change it such as converting to hex.
  </p></section>
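  <p>As a sketch of what two such filters do, the following encodes data as
  hex text ending in the ">" end-of-data marker used by the PDF ASCIIHexDecode
  filter, and compresses data in the zlib format used by FlateDecode, using
  <code>java.util.zip</code>. The class is hypothetical and is not the FOP
  filter API.</p>
  <source><![CDATA[
```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class StreamFilters {
    private StreamFilters() { }

    /** Hex-encodes the data and appends the '>' end-of-data marker. */
    static String asciiHexEncode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2 + 1);
        for (byte b : data) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.append('>').toString();
    }

    /** Compresses the data in zlib format. */
    static byte[] flateEncode(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses flateEncode; only needed here to check the round trip. */
    static byte[] flateDecode(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```
  ]]></source>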
  
  <section>
    <title>Links</title>
  <p>A pdf link can be added for an area on the page. This link can then point to an external destination or a position on any page in the document.
  </p></section>
  
  <section>
    <title>Patterns</title>
  <p>The fill and stroke of graphical objects can be set with a colour, pattern or gradient.
  </p></section>
  
  <p>There are a number of other features for handling pdf markup relevant to creating PDF files for FOP.</p>
  </section>
  
    </section>
  
      </body>
  </document>
  
  
  
  
  1.1                  xml-fop/src/documentation/content/xdocs/design/svg.xml
  
  Index: svg.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.1//EN"
      "http://cvs.apache.org/viewcvs.cgi/*checkout*/xml-forrest/src/resources/schema/dtd/document-v11.dtd">
  
  <document>
      <header>
          <title>SVG</title>
      </header>
      <body>
  <section>
    <title>SVG</title>
      <p>SVG is rendered through Batik.</p><p>The XML from the XSL:FO document
        is converted into an SVG DOM with batik. This DOM is then set as the Document
        on the Foreign Object area in the Area Tree.</p><p>This DOM is then available to
        be rendered by the renderer.</p><p>SVG is rendered in the renderers via an XMLHandler in the FOUserAgent. This XML handler is used to render the SVG. The
        SVG is rendered by using batik. Batik converts the SVG DOM into an internal
        structure that can be drawn into a Graphics2D. So for PDF we use a
        PDFGraphics2D to draw into.</p><p>This creates the necessary PDF information to
        create the SVG image in the PDF document.</p><p>Most of the work is done in the
        PDFGraphics2D class. There are also a few bridges that are plugged into batik
        to provide different behaviour for some SVG elements.</p>
  <section>
    <title>Text Drawing</title>
  <p>Normally batik converts text into a set of curved
         shapes.</p><p>These are handled like any other shapes when rendering to the output. This
         is not always desirable as the shapes have very fine curves. This can cause the
         output to look a bit bad in PDF and PS (it can be drawn properly but is not by
         default). These curves also require much more data than the original
         text.</p><p>To handle this there is a PDFTextElementBridge that is set when
         using the bridge in batik. If the text is simple enough for the text to be
         drawn in the PDF as with all other text then this sets the TextPainter to use
         the PDFTextPainter. This inserts the text directly into the PDF using the drawString method on the PDFGraphics2D.</p><p>Text is considered simple if the
         font is available, the font size is useable and there are no tspans or other
         complications. This can make the resulting PDF significantly
         smaller.</p>
  </section>
  <section>
    <title>PDF Links</title>
  <p>To support links in PDF another batik
         element bridge is used. The PDFAElementBridge creates a PDFANode which inserts
         a link into the PDF document via the PDFGraphics2D.</p><p>Since links are positioned on the page without any transforms, we need to transform the
         coordinates of the link area so that they match the current position of the <code>a</code>
         element area. This transform may also need to account for the svg being
         positioned on the page.</p>
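  <p>The coordinate mapping can be sketched with
  <code>java.awt.geom.AffineTransform</code>. The sketch assumes the current
  SVG transform is known and that the position of the svg on the page is a
  simple translation; it ignores any difference in axis direction between SVG
  and PDF. The class and parameter names are hypothetical.</p>
  <source><![CDATA[
```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

final class LinkAreas {
    private LinkAreas() { }

    /**
     * Maps a link rectangle from the coordinate space of the a element to the
     * page: the current transform is applied first, then the svg offset.
     */
    static Rectangle2D toPageSpace(Rectangle2D linkArea, AffineTransform current,
                                   double svgX, double svgY) {
        AffineTransform toPage = AffineTransform.getTranslateInstance(svgX, svgY);
        toPage.concatenate(current);
        // The bounds of the transformed shape give an axis-aligned link area.
        return toPage.createTransformedShape(linkArea).getBounds2D();
    }
}
```
  ]]></source>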
  </section>
  <section>
    <title>Images</title>
  <p>Images are normally drawn
         into the PDFGraphics2D. This then creates a bitmap of the image data that can
         be inserted into the PDF document.</p><p>As PDF can support jpeg images, another
         element bridge is used so that the jpeg can be directly inserted into the PDF.</p>
  </section>
  <section>
    <title>PDF Transcoder</title>
  <p>Batik provides a mechanism to
         convert SVG into various formats. Through FOP we can convert an SVG document
         into a single paged PDF document. The page contains the SVG drawn as best as
         possible on the page. There is a PDFDocumentGraphics2D that creates a
         standalone PDF document with a single page. This is then drawn into by batik in
         the same way as with the PDFGraphics2D.</p>
  </section>
  <section>
     <title>Other Outputs</title>
  <p>When rendering to AWT the SVG is simply drawn onto the
         awt canvas using batik.</p><p>The PS Renderer uses a technique similar to that of the
         PDF Renderer.</p><p>The SVG Renderer simply embeds the SVG inside an svg
         element.</p>
  </section>
  
  </section>
  
      </body>
  </document>
  
  
  
  

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-cvs-unsubscribe@xml.apache.org
For additional commands, e-mail: fop-cvs-help@xml.apache.org