
Posted to fop-dev@xmlgraphics.apache.org by Glen Mazza <gm...@apache.org> on 2005/04/16 15:23:17 UTC

validating fo:table-body: more issues

Team,

I have been thinking more about our decision not to
validate the fo:table-body within the fo:table by
default (i.e., our FOUserAgent.setStrictValidation()
is initialized to "false").  Since then I have found
newer concerns that I want to make sure the team is at
least aware of.

As a review:

The XSL content model for fo:table-body is this:
(table-row+|table-cell+)   (+ = 1 or more)

FOP (both versions, as well as every commercial
implementation Jeremias could find) is using this instead:
(table-row*|table-cell*)   (* = 0 or more)

(i.e., not halting, but giving an error message in the
logfile if no rows/cells found.)
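
To make the difference between the two content models concrete, here is a minimal sketch in Java. It is not FOP's actual validation code: the class TableBodyCheck, its check() method and the ValidationException type are hypothetical names invented for illustration, and the "strict" parameter only stands in for the setStrictValidation() setting. Rejecting mixed row/cell children in both modes is my reading of the (a|b) notation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: these names are hypothetical and this is not FOP's code.
final class TableBodyCheck {

    /** Unchecked exception used here to model "halt processing". */
    static final class ValidationException extends RuntimeException {
        ValidationException(String message) {
            super(message);
        }
    }

    /**
     * Checks the child element names of one fo:table-body.
     * Returns true if the body satisfies (table-row+|table-cell+).
     * An empty body throws when strict is true; otherwise it is
     * recorded in the log and false is returned.
     */
    static boolean check(List<String> children, boolean strict, List<String> log) {
        boolean onlyRows = children.stream().allMatch("fo:table-row"::equals);
        boolean onlyCells = children.stream().allMatch("fo:table-cell"::equals);
        if (!onlyRows && !onlyCells) {
            // Mixed or unknown children are invalid under both models.
            throw new ValidationException("fo:table-body: expected only rows or only cells");
        }
        if (children.isEmpty()) {
            String message = "fo:table-body has no table-row or table-cell children";
            if (strict) {
                throw new ValidationException(message); // strict: halt
            }
            log.add("ERROR: " + message); // relaxed: log and continue
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(check(List.of("fo:table-row"), true, log));
        System.out.println(check(List.of(), false, log));
        System.out.println(log);
        try {
            check(List.of(), true, log);
        } catch (ValidationException e) {
            System.out.println("halted: " + e.getMessage());
        }
    }
}
```

With strict set to false the empty body only leaves a line in the log, which is the behaviour described above; with strict set to true the same input stops the run.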

My newer concerns are:

fo:table-rows are commonly generated by an xsl:template
separate from the xsl:template that generates the
fo:table (because the rows have common formatting
characteristics, etc.)  If the user makes a mistake
with the xsl:template "match" attribute that activates
the fo:table-row template, or if there is an error in
(say) the SQL which generated the input XML, no rows
will be matched, and the user will end up with an
erroneously empty table.  These are pretty common
errors in XSLT that would cause the output document to
need to be regenerated anyway.

Take two scenarios here:

1.) 200-page report containing dozens of tables.  If
the user messes up the template match for any of those
tables, that 200-page report just printed will need to
be redone.  But if we validate (i.e. halt) by default
we may very well have saved someone 200 sheets of
paper.  

Relying on the user to scan FOP's output log after a
document run to determine whether or not their
document had every table generated properly would be
IMO too irritating for the majority of users.  They
are more likely to print a document without noticing
that the table on page 157 is erroneously empty.  For
these users, I think they would have been happier had
FOP halted and informed them of their bug instead.

2.) Invoice statements going out to large numbers of
customers.  It can happen for SQL errors to occur that
erroneously result in no XML rows being available for
certain of the invoices.  Under these conditions,
wouldn't most people in charge of the invoices be
happier if FOP halted with the error message, rather
than send erroneous invoices to those customers?  (It
wouldn't matter if an error message was written to the
logfile--that invoice would still be going out.)  Not
validating can introduce some nasty risks with
invoices--it is usually safer not to print it at all
than to send a bad invoice.

Not validating by default is certainly survivable for
us--indeed, none of the implementations in use
currently do, as Jeremias has noted.  But many/most
people are not going to be aware of the
setStrictValidation() method until *after* a bad
production error has occurred.  The team may wish to
consider if it would be better for everyone if we told
users who were annoyed about using xsl:if[1] to
suppress invalid empty tables to just set sSV(false),
rather than be in the unfortunate position of telling
others who have just sent out hundreds of bad
invoices/documents to set sSV(true) next time.  

Thanks,
Glen

[1]
http://marc.theaimsgroup.com/?l=fop-user&m=110969361317977&w=2

Re: validating fo:table-body: more issues

Posted by Jeremias Maerki <de...@greenmail.ch>.

Almost sounds like we end up with 3 operating modes:

1. production (strict validation, early halt)
2. production-relaxed (relaxed validation, no halt but with error
messages)
3. development (strict validation, exception/halt at the end after the
output is generated)

(following Glen, 1 would be the default)
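
As a rough sketch of how these three modes differ in when they halt, the following Java shows one possible shape. Everything in it (the ValidationModes class, the Mode enum, and the Run helper with its report() and finish() methods) is a hypothetical illustration and not an existing FOP API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not part of FOP.
final class ValidationModes {

    enum Mode { PRODUCTION, PRODUCTION_RELAXED, DEVELOPMENT }

    /** Unchecked exception used here to model "halt". */
    static final class ValidationException extends RuntimeException {
        ValidationException(String message) {
            super(message);
        }
    }

    /** Tracks the validation problems of one formatting run. */
    static final class Run {
        private final Mode mode;
        private final List<String> problems = new ArrayList<>();

        Run(Mode mode) {
            this.mode = mode;
        }

        /** Called whenever a validation problem is detected. */
        void report(String problem) {
            if (mode == Mode.PRODUCTION) {
                // Mode 1: strict validation, early halt.
                throw new ValidationException(problem);
            }
            // Modes 2 and 3: record the error message and keep going.
            problems.add(problem);
        }

        /** Called once the output has been generated. */
        void finish() {
            if (mode == Mode.DEVELOPMENT && !problems.isEmpty()) {
                // Mode 3: output exists, but the run still ends in an exception.
                throw new ValidationException(
                        problems.size() + " validation problem(s): " + problems);
            }
        }

        List<String> problems() {
            return problems;
        }
    }

    public static void main(String[] args) {
        Run run = new Run(Mode.PRODUCTION_RELAXED);
        run.report("fo:table-body has no children");
        run.finish();
        System.out.println("completed with messages: " + run.problems());
    }
}
```

In this sketch only the timing of the exception separates modes 1 and 3, while mode 2 never throws and leaves it to the caller to inspect the recorded messages.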

Just as an idea: this could also be set through a custom attribute in
the FO file instead of (or in addition to) setting a parameter from the
command-line or through the API.

Anyway, I think Glen and Chris both raise valid points. Now we need to
decide what we really want to do. Personally, I was always happy with
mode 2 above, so I don't have a strong opinion towards any solution,
as long as mode 2 is available. Doing it as described above is ok with
me.

On 18.04.2005 10:24:30 Chris Bowditch wrote:
> Glen Mazza wrote:
> > Team,
> 
> <snip/>
> 
> > 
> > Take two scenarios here:
> > 
> > 1.) 200-page report containing dozens of tables.  If
> > the user messes up the template match for any of those
> > tables, that 200-page report just printed will need to
> > be redone.  But if we validate (i.e. halt) by default
> > we may very well have saved someone 200 sheets of
> > paper.  
> > 
> > Relying on the user to scan FOP's output log after a
> > document run to determine whether or not their
> > document had every table generated properly would be
> > IMO too irritating for the majority of users.  They
> > are more likely to print a document without noticing
> > that the table on page 157 is erroneously empty.  For
> > these users, I think they would have been happier had
> > FOP halted and informed them of their bug instead.
> > 
> > 2.) Invoice statements going out to large numbers of
> > customers.  It can happen for SQL errors to occur that
> > erroneously result in no XML rows being available for
> > certain of the invoices.  Under these conditions,
> > wouldn't most people in charge of the invoices be
> > happier if FOP halted with the error message, rather
> > than send erroneous invoices to those customers?  (It
> > wouldn't matter if an error message was written to the
> > logfile--that invoice would still be going out.)  Not
> > validating can introduce some nasty risks with
> > invoices--it is usually safer not to print it at all
> > than to send a bad invoice.
> 
> Both of these points are valid concerns. What the software developed by my 
> company does, is it gives the user a flag: Error on Warning, which places the 
> choice firmly with the user. The software then scans the log and halts the job 
> before it is sent to the printer if this flag is set. However, there are many 
> scenarios where it is not desirable to just halt, hence why this decision 
> should be up to the user not the FOP development team. An example of a 
> situation where just always halting would be bad:
> 
> Our software allows users to preview what their document looks like before it 
> goes into production. I would imagine most stylesheet designers would preview 
> their work before releasing it to a production environment. If the user is 
> just presented with an error message, with no output then the user will have 
> to trawl through various places to work out what they've done wrong. If they 
> have the output and an error, it is easier for the user to locate the problem.
> 
> <snip/>
> 
> Chris



Jeremias Maerki

Re: validating fo:table-body: more issues

Posted by Chris Bowditch <bo...@hotmail.com>.

Glen Mazza wrote:
> Team,

<snip/>

> 
> Take two scenarios here:
> 
> 1.) 200-page report containing dozens of tables.  If
> the user messes up the template match for any of those
> tables, that 200-page report just printed will need to
> be redone.  But if we validate (i.e. halt) by default
> we may very well have saved someone 200 sheets of
> paper.  
> 
> Relying on the user to scan FOP's output log after a
> document run to determine whether or not their
> document had every table generated properly would be
> IMO too irritating for the majority of users.  They
> are more likely to print a document without noticing
> that the table on page 157 is erroneously empty.  For
> these users, I think they would have been happier had
> FOP halted and informed them of their bug instead.
> 
> 2.) Invoice statements going out to large numbers of
> customers.  It can happen for SQL errors to occur that
> erroneously result in no XML rows being available for
> certain of the invoices.  Under these conditions,
> wouldn't most people in charge of the invoices be
> happier if FOP halted with the error message, rather
> than send erroneous invoices to those customers?  (It
> wouldn't matter if an error message was written to the
> logfile--that invoice would still be going out.)  Not
> validating can introduce some nasty risks with
> invoices--it is usually safer not to print it at all
> than to send a bad invoice.

Both of these points are valid concerns. What the software developed by my 
company does, is it gives the user a flag: Error on Warning, which places the 
choice firmly with the user. The software then scans the log and halts the job 
before it is sent to the printer if this flag is set. However, there are many 
scenarios where it is not desirable to just halt, hence why this decision 
should be up to the user not the FOP development team. An example of a 
situation where just always halting would be bad:

Our software allows users to preview what their document looks like before it 
goes into production. I would imagine most stylesheet designers would preview 
their work before releasing it to a production environment. If the user is 
just presented with an error message, with no output then the user will have 
to trawl through various places to work out what they've done wrong. If they 
have the output and an error, it is easier for the user to locate the problem.

<snip/>

Chris