Posted to fop-dev@xmlgraphics.apache.org by Finn Bock <bc...@worldonline.dk> on 2003/12/25 16:12:44 UTC

Output from NIST test suite

Hi,

After 'fixing' the master-reference issue in my copy of the NIST test 
suite, I ran the tests against 0.20.5 and 1.0dev and merged the result 
side by side into a single .pdf file.

You can download the result (1 MB) here:

    http://bckfnn-modules.sf.net/out-0.20.5-1.0.pdf

For some reason the pdf does not display correctly in my browsers, so it 
is better to download it. The merged pdf file is created using iText.

The square to the left contains the output from 0.20.5 and the square on 
the right the output from HEAD.

Here is also a merge between the PDF files that come with the NIST 
suite and HEAD:

    http://bckfnn-modules.sf.net/out-nist-1.0.pdf

There are still a few issues left to fix <wink>.


Another way of using the test suite could be to compare a binary image 
of the pages against some kind of reference. Has such an approach been 
tried? Does anyone know of available software that can render a PDF as 
an image file?

regards,
finn


Re: Output from NIST test suite

Posted by Finn Bock <bc...@worldonline.dk>.
>>After 'fixing' the master-reference issue in my copy of the NIST test 
>>suite, I ran the tests against 0.20.5 and 1.0dev and merged the result 
>>side by side into a single .pdf file.

[John Austin]

> Interesting technique.
> 
> What tool do you use to make the side-by-side comparison ?

This java program:

    http://bckfnn-modules.sf.net/PairPdf.java

to drive iText:

    http://www.lowagie.com/iText/

which does all the heavy lifting.

regards,
finn



Re: Output from NIST test suite

Posted by John Austin <jw...@sympatico.ca>.
On Thu, 2003-12-25 at 11:42, Finn Bock wrote:
> Hi,
> 
> After 'fixing' the master-reference issue in my copy of the NIST test 
> suite, I ran the tests against 0.20.5 and 1.0dev and merged the result 
> side by side into a single .pdf file.

Interesting technique.

What tool do you use to make the side-by-side comparison ?
-- 
John Austin <jw...@sympatico.ca>

Re: Output from NIST test suite

Posted by Finn Bock <bc...@worldonline.dk>.
[Bernd Brandstetter]

> in your GhostScript installation there should also exist a gswin32c.exe 
> which runs in console mode and therefore doesn't open a GUI window every 
> time.

And indeed there is. Boy, I feel stupid now. Thank you for the pointer.

regards,
finn


Re: Output from NIST test suite

Posted by Bernd Brandstetter <bb...@freenet.de>.
Hi,

in your GhostScript installation there should also exist a gswin32c.exe 
which runs in console mode and therefore doesn't open a GUI window every 
time.
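
A minimal sketch of the same batch conversion driven from Python rather 
than bash, assuming the console executable gswin32c is on the PATH. The 
Ghostscript switches are the ones from the script quoted below; the helper 
names gs_command and convert_all are made up for illustration.

```python
import subprocess
from pathlib import Path

def gs_command(pdf, exe="gswin32c"):
    """Build the Ghostscript command line that renders one PDF to a PNG."""
    pdf = Path(pdf)
    png = pdf.with_suffix(".png")  # same name, .png extension
    return [exe, "-q", "-dSAFER", "-dBATCH", "-dNOPAUSE",
            "-sDEVICE=png16m", f"-sOutputFile={png}", str(pdf)]

def convert_all(pdfs, exe="gswin32c"):
    """Run Ghostscript once per file; raises if a conversion fails."""
    for pdf in pdfs:
        subprocess.run(gs_command(pdf, exe), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file 
names that contain spaces.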

Bye,
Bernd


On Saturday 03 January 2004 00:35, Finn Bock wrote:
> [Jeremias Maerki]
>
> I drive ghostscript with a bash script like this:
>
> for f in "$@"; do
>      r="${f%.pdf}.png"
>      echo "$f" "$r"
>      gswin32 -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m \
>               -sOutputFile="$r" "$f"
> done
> done
>
> and it is quite slow and causes the Ghostscript console to flicker and
> grab the focus all the time. Annoying.
>
> Does anyone here know of a better (and maybe faster) way of using
> ghostscript to convert 615 pdf files to images?


Re: Output from NIST test suite

Posted by Finn Bock <bc...@worldonline.dk>.
[Jeremias Maerki]

> You may also want to check the mailing list archives. Especially Joerg
> and I have discussed this a number of times. GhostScript paired with an
> image differ would be my favourite approach.

Thanks for the tip about Ghostscript. I got it working, and together with 
the NIST suite it has been a great help in detecting changes.

I drive ghostscript with a bash script like this:

# Render every PDF named on the command line to a PNG with the same base name.
for f in "$@"; do
     r="${f%.pdf}.png"
     echo "$f" "$r"
     gswin32 -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m \
              -sOutputFile="$r" "$f"
done

and it is quite slow and causes the Ghostscript console to flicker and 
grab the focus all the time. Annoying.

Does anyone here know of a better (and maybe faster) way of using 
ghostscript to convert 615 pdf files to images?

regards,
finn


Re: Output from NIST test suite

Posted by Jeremias Maerki <de...@greenmail.ch>.
You may also want to check the mailing list archives. Especially Joerg
and I have discussed this a number of times. GhostScript paired with an
image differ would be my favourite approach.

On 25.12.2003 16:12:44 Finn Bock wrote:
> Another way of using the test suite could be to compare a binary image 
> of the pages against some kind of reference. Has such an approach been 
> tried? Does anyone know of available software that can render a PDF as 
> an image file?


Jeremias Maerki


Re: AW: Regression tests was: Re: Output from NIST test suite

Posted by John Austin <jw...@sympatico.ca>.
On Fri, 2003-12-26 at 05:29, Peter Kullmann wrote:
> J. Pietschmann wrote:
> > 
> > John Austin wrote:
> > > RedHat 9.0 (my system anyhow) includes a command 'pdftopbm' 
> > that will
> > > convert a PDF to multiple PBM (portable Bit Map) files that might be
> > > comparable.
> > ...
> >  >It would certainly help detect pixel-sized changes.
> > > That might help regression testing.

I wasn't thinking of using graphics as the primary means of 
comparing output. It was just a thought that one could use
visualization in some circumstances: 

	+ pixels that were white in both images would be
	  rendered as white
	+ pixels that were black in both images would be
	  rendered as black
	+ black pixels in the first image that were white
	  in the second could be rendered as red
	+ white pixels in the first image that were black
	  in the second could be rendered as blue

I thought of the idea of overlaying images for comparison when
I was scrolling through the side-by-side renderings of PDFs
that Finn posted yesterday (what does 'yesterday' mean in a
discussion that crosses the International Date Line ?)

Of course, this color-based scheme breaks down for test cases
that use color.
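
A minimal sketch of the four-colour overlay described above, assuming both 
pages have already been decoded into same-sized grids of black/white 
pixels. The function name overlay and the list-of-rows representation are 
illustrative choices, not part of any existing tool.

```python
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
RED, BLUE = (255, 0, 0), (0, 0, 255)

def overlay(first, second):
    """Combine two black/white images into one colour-coded image.

    Each image is a list of rows; a pixel is True for black, False for white.
    """
    if len(first) != len(second) or any(
            len(a) != len(b) for a, b in zip(first, second)):
        raise ValueError("images must have the same dimensions")
    colours = {
        (False, False): WHITE,  # white in both images
        (True, True): BLACK,    # black in both images
        (True, False): RED,     # black in the first, white in the second
        (False, True): BLUE,    # white in the first, black in the second
    }
    return [[colours[(a, b)] for a, b in zip(row1, row2)]
            for row1, row2 in zip(first, second)]
```

Any red or blue pixel in the result marks a position where the two 
renderings disagree.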

> > 
> > We need regression tests badly. Some problems to ponder:
> > a) Tests need to be automated to actually be useful.
> >   JUnit seems the way to go. Unfortunately, it's still
> >   underutilized in FOP.
> > b) We don't have many *unit* tests. There's only the
> >   UtilityCodeTestSuite.java. We need many more tests for
> >   basic functionality. The problem seems to be, however,
> >   that an elaborate test harness needs to be written in
> >   order to do unit tests for, e.g., layout managers.
> > c) In order to test the whole engine at once, from FO input
> >   to generating PDF/whatever, well, a binary compare with
> >   a pregenerated PDF would be as sufficient as comparing
> >   bitmap images. Problems here:
> >   + The files to compare against are binary, and consume
> >    a lot of space. Well, take a look at GenericFOPTestCase.java
> >    which uses MD5 sums, one for the FO in order to detect
> >    accidental changes to the source, and one for the result.
> >   + Even small changes have potential to break the whole test
> >    suite, even if nothing important changed, let's say the
> >    order of entries in a PDF dictionary. Rendering bitmaps
> >    from PDF eliminates this, but then you won't find regressions
> >    in non-visible stuff.
> > All in all, if there are 143 template PDFs and a change causes
> > mismatches for all, what will you do? Examine everything,
> > comparing pixels, check whether there are visible differences
> > at all, and then judge whether the original or the newly
> > generated PDF is at fault? I don't think this will be done
> > often.

Use tests for binary equality to detect differences. Visualization
might be one tool, useful in following up on detected differences.

I might want to use the technique to compare the effects of changes 
to a document. For example:

What happens on page 7 when I change space-before="10pt" to
space-before="15pt" ?

A colorized visualization would give me a better idea than separate
files. Remember that our brains are all quite different. Your
rote visual memory ability is probably much better than mine. You
might learn more from a side-by-side comparison than I would.

Crap. Now I have to give an example. Perhaps it won't take that 
long.

> > 
> > Ideas welcome!
> > 
> > J.Pietschmann
> > 
> 
> As an alternative approach for c) one could create tests along 
> the following lines: Suppose you want to test left margin 
> properties of a block. For this a simple fo file is rendered as 
> a bitmap. The bitmap will not be compared to a reference bitmap
> but some elementary assertions are calculated. For instance one
> such assertion could be: "The rectangle of width 1 inch of the
> left edge is blank." I don't know of a tool that can do this
> but it should be pretty straightforward to implement. 

Probably not that hard to do once you get inside an image file
in a program. Especially if you know the colors will be black
(0,0,0) and white (255,255,255) or a small number of selected 
colors.

> So, in the test suite one has a piece of FO containing a test 
> document and some assertions in java or coded in xml that should
> be fulfilled by the rendered image of the fo. 
> 
> Assertions could contain some of the following pieces:
> - a specified rectangle is blank (or of some specific color)
> - a specified rectangle has only colors x and y (to test background
> and foreground colors of a block).
> - a given line contains the color pattern white, yellow, 
> (white, black)*, green, white, i.e. a regular expression on the colors
> along a line. This could be used to test border colors along 
> a horizontal or vertical line through a bordered block.
> - along a given line the size of the first and last black region
> is at least xxx inches (to test border thickness)
> 
> The advantage of this approach seems to me that it is relatively
> easy to maintain. The test suite is small (no binaries). It can
> easily be automated in junit. 
> 
> On the other hand, the approach is limited to relatively simple
> test cases. For instance it will not be possible to test font 
> height, font style and text adjustments easily.
> 
> Peter
-- 
John Austin <jw...@sympatico.ca>

AW: Regression tests was: Re: Output from NIST test suite

Posted by "J.U. Anderegg" <ha...@bluewin.ch>.
> Peter Kullmann wrote:
>
> As an alternative approach for c) one could create tests along
> the following lines: Suppose you want to test left margin
> properties of a block. For this a simple fo file is rendered as
> a bitmap. The bitmap will not be compared to a reference bitmap
> but some elementary assertions are calculated. For instance one
> such assertion could be: "The rectangle of width 1 inch of the
> left edge is blank." I don't know of a tool that can do this
> but it should be pretty straightforward to implement.
>

There are 2 test points: renderer input and renderer output:

- Renderer input: establish the SVG renderer as reference and use an 'XML
file compare' program. The XSL Committee will hopefully prepare samples to
demonstrate and validate concepts and rules.

- Renderer output: Acrobat has a 'PDF document compare' function. This type
of the tool validates the PDF renderer. The AWT renderer can be validated
too by having the Java Printing System generate a PDF document with a PDF
printer driver. The Java Printing System can generate PCL files as well, if
better suited.

There will always be problems, because equivalent, valid results may be
achieved by different graphic objects or different sequences of graphic
calls. Therefore there will always be automated and manual/visual methods.
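
One possible shape for the 'XML file compare' step, sketched under the 
assumption that the renderer's XML output fits in memory and that Python 
3.8 or later is available (for xml.etree.ElementTree.canonicalize). The 
function name same_xml is hypothetical.

```python
import xml.etree.ElementTree as ET

def same_xml(a, b):
    """True if two XML strings are equal after C14N 2.0 canonicalisation.

    Canonicalisation sorts attributes and normalises empty-element syntax;
    strip_text=True additionally ignores indentation between elements.
    """
    def canon(text):
        return ET.canonicalize(xml_data=text, strip_text=True)
    return canon(a) == canon(b)
```

This only removes purely syntactic differences. Equivalent results that 
are produced by different elements or a different element order still 
compare as unequal, which is the limitation noted above.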

Hansuli Anderegg



AW: Regression tests was: Re: Output from NIST test suite

Posted by Peter Kullmann <p....@arenae.ch>.
J. Pietschmann wrote:
> 
> John Austin wrote:
> > RedHat 9.0 (my system anyhow) includes a command 'pdftopbm' 
> that will
> > convert a PDF to multiple PBM (portable Bit Map) files that might be
> > comparable.
> ...
>  >It would certainly help detect pixel-sized changes.
> > That might help regression testing.
> 
> We need regression tests badly. Some problems to ponder:
> a) Tests need to be automated to actually be useful.
>   JUnit seems the way to go. Unfortunately, it's still
>   underutilized in FOP.
> b) We don't have many *unit* tests. There's only the
>   UtilityCodeTestSuite.java. We need many more tests for
>   basic functionality. The problem seems to be, however,
>   that an elaborate test harness needs to be written in
>   order to do unit tests for, e.g., layout managers.
> c) In order to test the whole engine at once, from FO input
>   to generating PDF/whatever, well, a binary compare with
>   a pregenerated PDF would be as sufficient as comparing
>   bitmap images. Problems here:
>   + The files to compare against are binary, and consume
>    a lot of space. Well, take a look at GenericFOPTestCase.java
>    which uses MD5 sums, one for the FO in order to detect
>    accidental changes to the source, and one for the result.
>   + Even small changes have potential to break the whole test
>    suite, even if nothing important changed, let's say the
>    order of entries in a PDF dictionary. Rendering bitmaps
>    from PDF eliminates this, but then you won't find regressions
>    in non-visible stuff.
> All in all, if there are 143 template PDFs and a change causes
> mismatches for all, what will you do? Examine everything,
> comparing pixels, check whether there are visible differences
> at all, and then judge whether the original or the newly
> generated PDF is at fault? I don't think this will be done
> often.
> 
> Ideas welcome!
> 
> J.Pietschmann
> 

As an alternative approach for c) one could create tests along 
the following lines: Suppose you want to test left margin 
properties of a block. For this a simple fo file is rendered as 
a bitmap. The bitmap will not be compared to a reference bitmap
but some elementary assertions are calculated. For instance one
such assertion could be: "The rectangle of width 1 inch of the
left edge is blank." I don't know of a tool that can do this
but it should be pretty straightforward to implement. 
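
A minimal sketch of such a "rectangle is blank" assertion, assuming the 
rendered page has been loaded as rows of (r, g, b) tuples and that the 
rendering resolution is known. Both function names are hypothetical.

```python
def inches_to_pixels(inches, dpi):
    """Convert a length in inches to whole pixels at the given resolution."""
    return round(inches * dpi)

def rect_is_blank(image, left, top, width, height, blank=(255, 255, 255)):
    """True if every pixel in the rectangle equals the blank colour.

    image is a list of rows of (r, g, b) tuples; coordinates are in pixels.
    Parts of the rectangle that fall outside the image are ignored.
    """
    rows = image[top:top + height]
    return all(pixel == blank
               for row in rows
               for pixel in row[left:left + width])
```

For the left-margin example, a test rendered at 72 dpi would call 
rect_is_blank(image, 0, 0, inches_to_pixels(1, 72), len(image)).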

So, in the test suite one has a piece of FO containing a test 
document and some assertions in java or coded in xml that should
be fulfilled by the rendered image of the fo. 

Assertions could contain some of the following pieces:
- a specified rectangle is blank (or of some specific color)
- a specified rectangle has only colors x and y (to test background
and foreground colors of a block).
- a given line contains the color pattern white, yellow, 
(white, black)*, green, white, i.e. a regular expression on the colors
along a line. This could be used to test border colors along 
a horizontal or vertical line through a bordered block.
- along a given line the size of the first and last black region
is at least xxx inches (to test border thickness)
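
The line-based assertions in this list could be sketched as follows, 
assuming one scan line is available as a list of (r, g, b) tuples. The 
one-letter colour codes and all function names are made up for 
illustration.

```python
import re
from itertools import groupby

# Hypothetical one-letter codes for the colours a test expects to see.
CODES = {(255, 255, 255): "w", (0, 0, 0): "k",
         (255, 255, 0): "y", (0, 128, 0): "g"}

def colour_runs(line):
    """Collapse a line of pixels into (code, run_length) pairs."""
    return [(CODES.get(pixel, "?"), len(list(group)))
            for pixel, group in groupby(line)]

def matches_pattern(line, pattern):
    """Match a regular expression against the sequence of colour codes."""
    codes = "".join(code for code, _ in colour_runs(line))
    return re.fullmatch(pattern, codes) is not None

def outer_black_runs(line):
    """Lengths in pixels of the first and last black runs on the line."""
    black = [length for code, length in colour_runs(line) if code == "k"]
    return (black[0], black[-1]) if black else (0, 0)
```

With these codes the pattern from the list becomes "wy(wk)*gw", and 
dividing the values from outer_black_runs by the rendering resolution 
gives the border thickness in inches.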

The advantage of this approach seems to me that it is relatively
easy to maintain. The test suite is small (no binaries). It can
easily be automated in junit. 

On the other hand, the approach is limited to relatively simple
test cases. For instance it will not be possible to test font 
height, font style and text adjustments easily.

Peter



Regression tests was: Re: Output from NIST test suite

Posted by "J.Pietschmann" <j3...@yahoo.de>.
John Austin wrote:
> RedHat 9.0 (my system anyhow) includes a command 'pdftopbm' that will
> convert a PDF to multiple PBM (portable Bit Map) files that might be
> comparable.
...
 >It would certainly help detect pixel-sized changes.
> That might help regression testing.

We need regression tests badly. Some problems to ponder:
a) Tests need to be automated to actually be useful.
  JUnit seems the way to go. Unfortunately, it's still
  underutilized in FOP.
b) We don't have many *unit* tests. There's only the
  UtilityCodeTestSuite.java. We need many more tests for
  basic functionality. The problem seems to be, however,
  that an elaborate test harness needs to be written in
  order to do unit tests for, e.g., layout managers.
c) In order to test the whole engine at once, from FO input
  to generating PDF/whatever, well, a binary compare with
  a pregenerated PDF would be as sufficient as comparing
  bitmap images. Problems here:
  + The files to compare against are binary, and consume
   a lot of space. Well, take a look at GenericFOPTestCase.java
   which uses MD5 sums, one for the FO in order to detect
   accidental changes to the source, and one for the result.
  + Even small changes have potential to break the whole test
   suite, even if nothing important changed, let's say the
   order of entries in a PDF dictionary. Rendering bitmaps
   from PDF eliminates this, but then you won't find regressions
   in non-visible stuff.
All in all, if there are 143 template PDFs and a change causes
mismatches for all, what will you do? Examine everything,
comparing pixels, check whether there are visible differences
at all, and then judge whether the original or the newly
generated PDF is at fault? I don't think this will be done
often.
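
The two-checksum idea mentioned under c) can be sketched like this. It 
illustrates the principle only and is not taken from 
GenericFOPTestCase.java; the function names and return values are 
hypothetical.

```python
import hashlib

def md5_hex(data):
    """Hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()

def check_case(fo_bytes, result_bytes, expected_fo_md5, expected_result_md5):
    """Classify one test case by comparing both checksums."""
    if md5_hex(fo_bytes) != expected_fo_md5:
        return "source changed"   # the FO input was edited by accident
    if md5_hex(result_bytes) != expected_result_md5:
        return "output changed"   # a regression, or an intended change
    return "ok"
```

Storing two short digests per test keeps the suite free of binary 
reference files, but a digest mismatch says nothing about what changed, 
so a harmless reordering inside the PDF fails the test just like a real 
regression.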

Ideas welcome!

J.Pietschmann



Re: Output from NIST test suite

Posted by John Austin <jw...@sympatico.ca>.
On Thu, 2003-12-25 at 11:42, Finn Bock wrote:
> Hi,
> 
> After 'fixing' the master-reference issue in my copy of the NIST test 
> suite, I ran the tests against 0.20.5 and 1.0dev and merged the result 
> side by side into a single .pdf file.
> 
> You can download the result (1Mb) here:
> 
>     http://bckfnn-modules.sf.net/out-0.20.5-1.0.pdf
> 
> For some reason the pdf does not display correctly in my browsers, so it 
> is better to download it. The merged pdf file is created using iText.
> 
> The square to the left contains the output from 0.20.5 and the square on 
> the right the output from HEAD.
> 
> Here is also a merge between the PDF files that come with the NIST 
> suite and HEAD:
> 
>     http://bckfnn-modules.sf.net/out-nist-1.0.pdf
> 
> There are still a few issues left to fix <wink>.
> 
> 
> Another way of using the test suite could be to compare a binary image 
> of the pages against some kind of reference. Has such an approach been 
> tried? Does anyone know of available software that can render a PDF as 
> an image file?

RedHat 9.0 (my system anyhow) includes a command 'pdftopbm' that will
convert a PDF to multiple PBM (portable Bit Map) files that might be
comparable. They would be convertible into other formats such as PNG
(or GIF for the patent-minded). 

I found the result pretty poor (ugly text badly in need of
anti-aliasing). That might help contribute to keeping images 
similar. It would certainly help detect pixel-sized changes.
That might help regression testing.
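
A minimal sketch of comparing two such bitmaps pixel by pixel, assuming 
the files are raw PBM (magic number P4) with no '#' comment lines in the 
header. The names read_pbm and count_differences are illustrative.

```python
import re

def read_pbm(data):
    """Parse raw (P4) PBM bytes into rows of booleans, True meaning black."""
    header = re.match(rb"P4\s+(\d+)\s+(\d+)\s", data)
    if header is None:
        raise ValueError("not a raw PBM file")
    width, height = int(header.group(1)), int(header.group(2))
    stride = (width + 7) // 8          # each row is padded to whole bytes
    raster = data[header.end():]
    rows = []
    for y in range(height):
        row = raster[y * stride:(y + 1) * stride]
        # Bits are packed most significant bit first; a set bit is black.
        rows.append([bool(row[x // 8] & (0x80 >> (x % 8)))
                     for x in range(width)])
    return rows

def count_differences(a, b):
    """Number of pixel positions where two same-sized bitmaps differ."""
    return sum(p != q
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))
```

A count of zero means the two pages rendered identically at that 
resolution.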

There are suggestions on the Net that Ghostscript can do this sort of 
conversion as well.

GIMP can read a PDF as well. When I tried it, I got a graphic for every
pair of pages (my doc was over 133 pages). Perhaps some script-fu ... ?
 

> regards,
> finn
-- 
John Austin <jw...@sympatico.ca>

Re: Output from NIST test suite

Posted by Finn Bock <bc...@worldonline.dk>.
[Carmelo Montanez]

> Hi Folks:
> 
> Somehow the final copy of the test suite was not uploaded to either our
> server at NIST or W3C.  I uploaded a copy of the latest suite version
> to the following link:
> 
> http://xw2k.sdct.itl.nist.gov/carmelo/formattingObjectsSuite123103.zip
> 
> I will make sure it is on the W3C and NIST servers soon.

Wow, that is quite a development between the two versions.

I've tested FOP with the new suite, and I think there are a few issues 
with the test suite itself.

- NIST/rendering-model/renderingmodel2 expects nist.gif to be available.
- NIST/table-header/thfoSingleCell1.xml contains a <fo:block/> element
   but the fo namespace is undeclared, so the test can't be transformed.
- NIST/table-header/thfoDoubleCell.xml has the same problem as thfoSingleCell1.xml.
- NIST/table-header/thfoBorderStyle9.xsl, a closing quote is missing on
   line 12:
     <xsl:attribute name = "border-collapse>collapse</xsl:attribute>
- NIST/wrapper/wrfoBlockColor1.xsl, an extra '<' char at line 135.
- NIST/page-sequence/psfopagemaster1 isn't using master-reference:
      <fo:page-sequence master-name="test-page-master1">
- NIST/area-dimension/adp-height1, the oransq.jpg is not included.
- NIST/miscellaneous/misc-list1, the file crosshair1.jpg isn't included.
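
Problems like the undeclared namespace and the missing quote can be caught 
before running the suite with a plain well-formedness check. This is a 
minimal sketch using Python's standard XML parser; the function name 
well_formedness_error is hypothetical.

```python
import xml.etree.ElementTree as ET

def well_formedness_error(text):
    """Return the parser's error message, or None if the XML parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

Applied to each .xml and .xsl file in the suite, it reports syntax errors 
only. Missing image files and the master-name attribute need separate 
checks.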

regards,
finn


Re: Output from NIST test suite

Posted by Carmelo Montanez <ca...@nist.gov>.
Hi Folks:

Somehow the final copy of the test suite was not uploaded to either our
server at NIST or W3C.  I uploaded a copy of the latest suite version
to the following link:

http://xw2k.sdct.itl.nist.gov/carmelo/formattingObjectsSuite123103.zip

I will make sure it is on the W3C and NIST servers soon.


Thanks,
Carmelo


At 04:12 PM 12/25/2003 +0100, you wrote:
>Hi,
>
>After 'fixing' the master-reference issue in my copy of the NIST test 
>suite, I ran the tests against 0.20.5 and 1.0dev and merged the result 
>side by side into a single .pdf file.
>
>You can download the result (1Mb) here:
>
>    http://bckfnn-modules.sf.net/out-0.20.5-1.0.pdf
>
>For some reason the pdf does not display correctly in my browsers, so it 
>is better to download it. The merged pdf file is created using iText.
>
>The square to the left contains the output from 0.20.5 and the square on 
>the right the output from HEAD.
>
>Here is also a merge between the PDF files that come with the NIST suite 
>and HEAD:
>
>    http://bckfnn-modules.sf.net/out-nist-1.0.pdf
>
>There are still a few issues left to fix <wink>.
>
>
>Another way of using the test suite could be to compare a binary image of 
>the pages against some kind of reference. Has such an approach been tried? 
>Does anyone know of available software that can render a PDF as an image file?
>
>regards,
>finn
>
>