Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@jeremias-maerki.ch> on 2009/02/16 17:11:33 UTC

Issues for after the IF branch merge

Follow-up issues for after the merge where I'd be glad for feedback:

- The new implementations currently all use a MIME suffix ";mode=painter".
They are now sufficiently tested that I'm confident that we can switch
over to them by default. One idea would be to put a switch in
RendererFactory so we can say: prefer IFDocumentHandler instead of
Renderer implementation. Or rather the opposite: by default use the
IFDocumentHandler but allow switching back to the old mode should there
be an unexpected incompatibility. Good idea or bad?

- Given the performance figures I think it would be possible to
deprecate the following implementations: PDF, PS, PCL and AFP. As for
the Java2D implementations: the PNG, Print and AWT Preview parts are not
implemented yet, so a deprecation here is premature. TXT is probably
good enough as it is. No change necessary there. After all, we won't
deprecate the Renderer interface itself. Good idea or bad?

- A note primarily for Max: For JEuclid to work with the new
implementations, it will require an ImageConverter implementation (from
XML Graphics Commons' image loading framework). I haven't investigated
how hard it would be to provide a compatibility layer so the old
implementations could be used. The old stuff is in many parts tied into
the Renderer architecture. For Barcode4J, I've written such an
implementation already. I can also try to find time to do it for you, if
you like. The good thing is that the ImageConverter implementation is
not FOP-specific but can also be used by anyone using the image loading
framework. Actually, that part might influence the decision for the
first and second points above!
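
A minimal sketch of the converter idea, for readers who have not seen the image loading framework: a converter declares a source and a target representation, and any host application (not only FOP) can look it up by that pair. All names here (SimpleImageConverter, ConverterRegistry, MathToVectorConverter, the flavor strings) are hypothetical stand-ins; the real ImageConverter interface in XML Graphics Commons has a different and richer signature.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in, NOT the real XML Graphics Commons interface.
interface SimpleImageConverter {
    String sourceFlavor();
    String targetFlavor();
    String convert(String image);
}

// Any application can hold such a registry; nothing here depends on a renderer.
class ConverterRegistry {
    private final List<SimpleImageConverter> converters = new ArrayList<>();

    void register(SimpleImageConverter converter) {
        converters.add(converter);
    }

    // Picks the first converter whose source/target pair matches.
    String convert(String image, String from, String to) {
        for (SimpleImageConverter c : converters) {
            if (c.sourceFlavor().equals(from) && c.targetFlavor().equals(to)) {
                return c.convert(image);
            }
        }
        throw new IllegalArgumentException("No converter from " + from + " to " + to);
    }
}

// Hypothetical extension: turns a parsed formula into something paintable.
class MathToVectorConverter implements SimpleImageConverter {
    public String sourceFlavor() { return "mathml-dom"; }
    public String targetFlavor() { return "graphics2d"; }
    public String convert(String image) { return "painted(" + image + ")"; }
}

class ConverterDemo {
    public static void main(String[] args) {
        ConverterRegistry registry = new ConverterRegistry();
        registry.register(new MathToVectorConverter());
        System.out.println(registry.convert("x^2", "mathml-dom", "graphics2d"));
    }
}
```

The point of the design is that the extension only talks to the registry, so it is not tied to the Renderer architecture.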


Jeremias Maerki


Re: AW: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
I guess I can improve that explanation a bit:

Maybe "direct-via-if" is misleading. For a "direct" user, the IF is
never converted to the actual IF XML format. All calls from the
IFRenderer are made directly against the IFDocumentHandler and IFPainter
interfaces (for example PDFDocumentHandler and PDFPainter). You can think
of the IFDocumentHandler/IFPainter pair as you would of the SAX ContentHandler.
It's used to tie together an XML processing pipeline without serializing
the actual XML. Only the info set is transported but in an in-memory
form native to Java. The same is done with IFDocumentHandler/IFPainter.
The idea is not to cause a performance penalty for "direct" users but
still allow the advanced functionality for advanced use cases.
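
To illustrate the ContentHandler analogy, here is a minimal sketch with a hypothetical, heavily simplified PageEventHandler interface standing in for the IFDocumentHandler/IFPainter pair. RecordingTarget, XmlSerializingTarget and Producer are made-up names, and the XML produced is not the real IF format (it also does no escaping). The producer issues the same calls either way; XML only comes into existence if the chosen target decides to write it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the IFDocumentHandler/IFPainter pair.
interface PageEventHandler {
    void startPage(int number);
    void drawText(int x, int y, String text);
    void endPage();
}

// "Direct" target: consumes the calls immediately, no XML involved.
class RecordingTarget implements PageEventHandler {
    final List<String> ops = new ArrayList<>();
    public void startPage(int number) { ops.add("page " + number); }
    public void drawText(int x, int y, String text) {
        ops.add("text@" + x + "," + y + ":" + text);
    }
    public void endPage() { ops.add("end"); }
}

// Serializing target: turns the same calls into XML text (no escaping here).
class XmlSerializingTarget implements PageEventHandler {
    final StringBuilder xml = new StringBuilder();
    public void startPage(int number) {
        xml.append("<page n=\"").append(number).append("\">");
    }
    public void drawText(int x, int y, String text) {
        xml.append("<text x=\"").append(x).append("\" y=\"").append(y).append("\">")
           .append(text).append("</text>");
    }
    public void endPage() { xml.append("</page>"); }
}

// The producer only knows the interface, never the concrete target.
class Producer {
    static void render(PageEventHandler handler) {
        handler.startPage(1);
        handler.drawText(10, 20, "Hello");
        handler.endPage();
    }
}

class PipelineDemo {
    public static void main(String[] args) {
        RecordingTarget direct = new RecordingTarget();
        Producer.render(direct);
        System.out.println(direct.ops);

        XmlSerializingTarget toXml = new XmlSerializingTarget();
        Producer.render(toXml);
        System.out.println(toXml.xml);
    }
}
```

In this sketch the "direct" path never pays for building or parsing XML, which is the property described above.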

On 18.02.2009 10:29:54 Jeremias Maerki wrote:
> Hi Georg
> 
> On 18.02.2009 10:13:21 Georg Datterl wrote:
> > Hi Jeremias,
> > 
> > > I've also made performance measurements as part of this effort which highlights why it was done in the first place:
> > > http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/
> > 
> > I'm probably missing something important here, but regarding the first graph:
> > 
> > Isn't "direct" what ordinary users of fop do? Take a fo-file and then render it? 
> 
> Yes.
> 
> > Isn't "direct-via-if" what ordinary users of fop will do in the future, if IF is the default? 
> 
> Yes.
> 
> > Isn't the work done in "direct-via-if" the sum of the work done in "to-if" and "from-if"? 
> 
> No. "to-if" renders the FO and uses IFSerializer (called by IFRenderer) to
> write an intermediate file. "from-if" parses the intermediate file
> (using IFParser) and generates a series of calls against an
> IFDocumentHandler and IFPainter implementation. So:
> 
> "direct-via-if" = "to-if" + "from-if" - serializing IF - parsing IF
> or
> "to-if" + "from-if" = "direct-via-if" + serializing IF + parsing IF
> 
> The main motivation for the new IF is the fact that "serializing AT XML"
> and especially "parsing AT XML" is very costly.
> 
> See also http://people.apache.org/~jeremias/fop/renderer-design-new.png
> which shows the two different paths.
> 
> > And, if all the above answers are yes, why is the performance gain
> > noticed in "from-if" not more obviously related to the performance gain
> > in "direct-via-if"? 
> 
> There was a "no" above so this is not applicable anymore. I hope I could
> clear up the gap.
> 
> 
> 
> Jeremias Maerki
> 




Jeremias Maerki


Re: AW: AW: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Right. "ordinary" users should not notice any negative effects. Users of
the Area Tree XML format won't be bothered, either, since nothing
changes there.

The only issue I see is for users who use extensions like Barcode4J or
JEuclid. They will have to upgrade to a newer version of those
extensions. The old extensions were just too tied into the Renderer
interface. The new ones are even usable outside of FOP (i.e.
applications which use the image loading framework in Apache XML
Graphics Commons).

Note to fop-devs: But if we actually do the "priority switch" in
RendererFactory that we discussed in another place, even that could be
worked around for a transition period.

On 18.02.2009 10:44:37 Georg Datterl wrote:
> Hi Jeremias,
>  
> I'm getting better and better. I actually understood all parts of your
> mail. :-) So for ordinary users, the change is not very much in terms
> of performance gain, but better than nothing. For me, who will read
> information from the AT in my multi-pass solution to the "duplicate
> cell content problem" it doesn't make any difference either, since both IF
> and the AT-xml I see are generated from the same AreaTreeModel. Right?
> 
> Regards,
>  
> Georg Datterl
>  
> ------ Contact ------
>  
> Georg Datterl
>  
> Geneon media solutions gmbh
> Gutenstetter Straße 8a
> 90449 Nürnberg
>  
> HRB Nürnberg: 17193
> Managing Director: Yong-Harry Steiert 
> 
> Tel.: 0911/36 78 88 - 26
> Fax: 0911/36 78 88 - 20
>  
> www.geneon.de
>  
> Other members of the Willmy MediaGroup:
>  
> IRS Integrated Realization Services GmbH:    www.irs-nbg.de 
> Willmy PrintMedia GmbH:                            www.willmy.de
> Willmy Consult & Content GmbH:                 www.willmycc.de 
> -----Original Message-----
> From: Jeremias Maerki [mailto:dev@jeremias-maerki.ch] 
> Sent: Wednesday, 18 February 2009 10:30
> To: fop-dev@xmlgraphics.apache.org
> Subject: Re: AW: Issues for after the IF branch merge
> 
> Hi Georg
> 
> On 18.02.2009 10:13:21 Georg Datterl wrote:
> > Hi Jeremias,
> > 
> > > I've also made performance measurements as part of this effort which highlights why it was done in the first place:
> > > http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/
> > 
> > I'm probably missing something important here, but regarding the first graph:
> > 
> > Isn't "direct" what ordinary users of fop do? Take a fo-file and then render it? 
> 
> Yes.
> 
> > Isn't "direct-via-if" what ordinary users of fop will do in the future, if IF is the default? 
> 
> Yes.
> 
> > Isn't the work done in "direct-via-if" the sum of the work done in "to-if" and "from-if"? 
> 
> No. "to-if" renders the FO and uses IFSerializer (called by IFRenderer) to write an intermediate file. "from-if" parses the intermediate file (using IFParser) and generates a series of calls against an IFDocumentHandler and IFPainter implementation. So:
> 
> "direct-via-if" = "to-if" + "from-if" - serializing IF - parsing IF or "to-if" + "from-if" = "direct-via-if" + serializing IF + parsing IF
> 
> The main motivation for the new IF is the fact that "serializing AT XML"
> and especially "parsing AT XML" is very costly.
> 
> See also http://people.apache.org/~jeremias/fop/renderer-design-new.png
> which shows the two different paths.
> 
> > And, if all the above answers are yes, why is the performance gain 
> > noticed in "from-if" not more obviously related to the performance 
> > gain in "direct-via-if"?
> 
> There was a "no" above so this is not applicable anymore. I hope I could clear up the gap.
> 
> 
> 
> Jeremias Maerki
> 




Jeremias Maerki


AW: AW: Issues for after the IF branch merge

Posted by Georg Datterl <ge...@geneon.de>.
Hi Jeremias,
 
I'm getting better and better. I actually understood all parts of your mail. :-) So for ordinary users, the change is not very much in terms of performance gain, but better than nothing. For me, who will read information from the AT in my multi-pass solution to the "duplicate cell content problem" it doesn't make any difference either, since both IF and the AT-xml I see are generated from the same AreaTreeModel. Right?

Regards,
 
Georg Datterl
 
-----Original Message-----
From: Jeremias Maerki [mailto:dev@jeremias-maerki.ch] 
Sent: Wednesday, 18 February 2009 10:30
To: fop-dev@xmlgraphics.apache.org
Subject: Re: AW: Issues for after the IF branch merge

Hi Georg

On 18.02.2009 10:13:21 Georg Datterl wrote:
> Hi Jeremias,
> 
> > I've also made performance measurements as part of this effort which highlights why it was done in the first place:
> > http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/
> 
> I'm probably missing something important here, but regarding the first graph:
> 
> Isn't "direct" what ordinary users of fop do? Take a fo-file and then render it? 

Yes.

> Isn't "direct-via-if" what ordinary users of fop will do in the future, if IF is the default? 

Yes.

> Isn't the work done in "direct-via-if" the sum of the work done in "to-if" and "from-if"? 

No. "to-if" renders the FO and uses IFSerializer (called by IFRenderer) to write an intermediate file. "from-if" parses the intermediate file (using IFParser) and generates a series of calls against an IFDocumentHandler and IFPainter implementation. So:

"direct-via-if" = "to-if" + "from-if" - serializing IF - parsing IF or "to-if" + "from-if" = "direct-via-if" + serializing IF + parsing IF

The main motivation for the new IF is the fact that "serializing AT XML"
and especially "parsing AT XML" is very costly.

See also http://people.apache.org/~jeremias/fop/renderer-design-new.png
which shows the two different paths.

> And, if all the above answers are yes, why is the performance gain 
> noticed in "from-if" not more obviously related to the performance 
> gain in "direct-via-if"?

There was a "no" above so this is not applicable anymore. I hope I could clear up the gap.



Jeremias Maerki


Re: AW: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Hi Georg

On 18.02.2009 10:13:21 Georg Datterl wrote:
> Hi Jeremias,
> 
> > I've also made performance measurements as part of this effort which highlights why it was done in the first place:
> > http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/
> 
> I'm probably missing something important here, but regarding the first graph:
> 
> Isn't "direct" what ordinary users of fop do? Take a fo-file and then render it? 

Yes.

> Isn't "direct-via-if" what ordinary users of fop will do in the future, if IF is the default? 

Yes.

> Isn't the work done in "direct-via-if" the sum of the work done in "to-if" and "from-if"? 

No. "to-if" renders the FO and uses IFSerializer (called by IFRenderer) to
write an intermediate file. "from-if" parses the intermediate file
(using IFParser) and generates a series of calls against an
IFDocumentHandler and IFPainter implementation. So:

"direct-via-if" = "to-if" + "from-if" - serializing IF - parsing IF
or
"to-if" + "from-if" = "direct-via-if" + serializing IF + parsing IF

The main motivation for the new IF is the fact that "serializing AT XML"
and especially "parsing AT XML" is very costly.

See also http://people.apache.org/~jeremias/fop/renderer-design-new.png
which shows the two different paths.

> And, if all the above answers are yes, why is the performance gain
> noticed in "from-if" not more obviously related to the performance gain
> in "direct-via-if"? 

There was a "no" above so this is not applicable anymore. I hope I could
clear up the gap.



Jeremias Maerki


AW: Issues for after the IF branch merge

Posted by Georg Datterl <ge...@geneon.de>.
Hi Jeremias,

> I've also made performance measurements as part of this effort which highlights why it was done in the first place:
> http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/

I'm probably missing something important here, but regarding the first graph:

Isn't "direct" what ordinary users of fop do? Take a fo-file and then render it? 
Isn't "direct-via-if" what ordinary users of fop will do in the future, if IF is the default? 
Isn't the work done in "direct-via-if" the sum of the work done in "to-if" and "from-if"? 

And, if all the above answers are yes, why is the performance gain noticed in "from-if" not more obviously related to the performance gain in "direct-via-if"? 

Regards,
 
Georg Datterl
 

Re: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Hi Martin

On 18.02.2009 07:33:55 Martin Edge wrote:
> Hi Jeremias,
> 
> Is this a IF XML structure change? Or merely the way it's being processed?

I'm not sure how much you picked up from the discussions so I'll start
at the beginning (just to be sure):

The "IF branch" introduces a completely new XML-based intermediate
format for FOP. The old one (the Area Tree XML, the one you use) still
exists with no changes and will continue to be available. I know this
might create some confusion which is why I tried to make it especially
clear in the documentation how they relate to each other and which one
should be used in which case. Both have their use cases. I've updated
the documentation already but it's not live on the website as it's still
only found in the branch. However, you can read through the
documentation sources here:
http://svn.apache.org/viewvc/xmlgraphics/fop/branches/Temp_AreaTreeNewDesign/src/documentation/content/xdocs/trunk/intermediate.xml?view=markup
(please don't look too closely at the name of the branch. It's misleading.)

We continue to use the Area Tree XML format as our primary format for
layout engine tests as it contains more information than the new one.

So in short: the old IF is renamed to "Area Tree XML" (with no changes
to it) and the new IF becomes our main "IF" (a completely new format
optimized for speed).

The API for working with it is a different one although similar to the
Area Tree XML format.

> I make changes to the IF prior to rendering my final document as I need to
> put information on each page generated and be aware of the total amount of
> pages rendered (a barcode as it was). 

These IF-related changes are 100% backwards-compatible. Don't worry. The
only thing that might require someone's attention is if someone has a
subclass of one of the deprecated Renderers.

But depending on what you're doing you might want to investigate if the
newly won performance (with the new IF) is actually something you could
profit from. Of course, that will mean some work for you because it's a
different format. But given the performance increase it might be worth
it.

> When you say deprecate following implementations, do you mean the fact that
> the IF previously held some reference on what type of output the IF was
> created for?

What we are discussing deprecating are the Renderer implementations for PDF, PS,
AFP, PCL and TIFF, i.e. PDFRenderer, PSRenderer, AFPRenderer,
PCLRenderer and TIFFRenderer. These classes would be taken off the
service registration file or the RendererFactory class will simply let
the new IFDocumentHandler/IFPainter implementations take precedence if
they are available.

If you select the output format in code through its MIME type, you won't
notice the change. An example: if you select "application/postscript"
now, RendererFactory will instantiate an instance of PSRenderer. After
these changes, RendererFactory will create an IFRenderer instance which
will delegate to the new class PSDocumentHandler (and PSPainter).
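
The selection by MIME type could be modelled roughly as follows. This is only a sketch of the logic described above, not FOP's actual RendererFactory code: OutputFactory, its registration methods and the returned description strings are hypothetical, and only the names PSRenderer, PSDocumentHandler, TXTRenderer and IFRenderer come from this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the selection logic, not FOP's RendererFactory.
class OutputFactory {
    private final Map<String, String> renderers = new HashMap<>();
    private final Map<String, String> documentHandlers = new HashMap<>();
    private boolean rendererPreferred = false;

    void registerRenderer(String mime, String name) {
        renderers.put(mime, name);
    }

    void registerDocumentHandler(String mime, String name) {
        documentHandlers.put(mime, name);
    }

    // Lets callers fall back to the old Renderer path if they need it.
    void setRendererPreferred(boolean value) {
        rendererPreferred = value;
    }

    // Returns a description of what would be instantiated for a MIME type.
    String select(String mime) {
        String renderer = renderers.get(mime);
        String handler = documentHandlers.get(mime);
        if (rendererPreferred && renderer != null) {
            return renderer;
        }
        if (handler != null) {
            // New path: a generic IFRenderer delegating to the format handler.
            return "IFRenderer -> " + handler;
        }
        if (renderer != null) {
            // Formats without a new implementation keep using their Renderer.
            return renderer;
        }
        throw new IllegalArgumentException("No output implementation for " + mime);
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        OutputFactory factory = new OutputFactory();
        factory.registerRenderer("application/postscript", "PSRenderer");
        factory.registerDocumentHandler("application/postscript", "PSDocumentHandler");
        factory.registerRenderer("text/plain", "TXTRenderer");

        System.out.println(factory.select("application/postscript"));
        System.out.println(factory.select("text/plain"));
    }
}
```

Code that only passes a MIME type to the factory does not change in this model; only the object behind the returned implementation does.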

> Where can I get more information on this change?

The original idea was developed on this Wiki page:
http://wiki.apache.org/xmlgraphics-fop/AreaTreeIntermediateXml/NewDesign

The updated documentation (as already mentioned above) can currently be
read from here:
http://svn.apache.org/viewvc/xmlgraphics/fop/branches/Temp_AreaTreeNewDesign/src/documentation/content/xdocs/trunk/intermediate.xml?view=markup

I've also made performance measurements as part of this effort which
highlight why it was done in the first place:
http://people.apache.org/~jeremias/fop/benchmark-2009-02-13/


I hope that clears up everything. If not, please don't hesitate to say
so. I'm actually very happy to see stakeholders speak up. The more
feedback there is the easier it is to make everyone happy (or as happy
as is possible).

> Thanks :-)
> 
> Martin.
> 
>  
> 
> -----Original Message-----
> From: Jeremias Maerki [mailto:dev@jeremias-maerki.ch] 
> Sent: Tuesday, February 17, 2009 3:12 AM
> To: fop-dev@xmlgraphics.apache.org
> Subject: Issues for after the IF branch merge
> 
> Follow-up issues for after the merge where I'd be glad for feedback:
> 
> - The new implementations currently all use a MIME suffix ";mode=painter".
> They are now sufficiently tested that I'm confident that we can switch over
> to them by default. One idea would be to put a switch in RendererFactory so
> we can say: prefer IFDocumentHandler instead of Renderer implementation. Or
> rather the opposite: by default use the IFDocumentHandler but allow to
> switch to the old mode should there be an unexpected incompatibility. Good
> idea or bad?
> 
> - Given the performance figures I think it would be possible to deprecate
> the following implementations: PDF, PS, PCL and AFP. As for the Java2D
> implementations: the PNG, Print and AWT Preview parts are not implemented,
> yet, so a deprecation here is premature. TXT is probably good enough like it
> is. No change necessary there. After all, we won't deprecate the Renderer
> interface itself. Good idea or bad?
> 
> - A note primarily for Max: For JEuclid to work with the new
> implementations, it will require an ImageConverter implementation (from XML
> Graphics Commons' image loading framework). I haven't investigated how hard
> it would be to provide a compatibility layer so the old implementations
> could be used. The old stuff is in many parts tied into the Renderer
> architecture. For Barcode4J, I've written such an implementation already. I
> can also try to find time to do it for you, if you like. The good thing is
> that the ImageConverter implementation is not FOP-specific but can also be
> used by anyone using the image loading framework. Actually, that part might
> influence the decision for the first and second points above!
> 
> 
> Jeremias Maerki
> 




Jeremias Maerki


RE: Issues for after the IF branch merge

Posted by Martin Edge <Ma...@asmorphic.net.au>.
Hi Jeremias,

Is this a IF XML structure change? Or merely the way it's being processed?

I make changes to the IF prior to rendering my final document as I need to
put information on each page generated and be aware of the total amount of
pages rendered (a barcode as it was). 

When you say deprecate following implementations, do you mean the fact that
the IF previously held some reference on what type of output the IF was
created for?

Where can I get more information on this change?

Thanks :-)

Martin.

 

-----Original Message-----
From: Jeremias Maerki [mailto:dev@jeremias-maerki.ch] 
Sent: Tuesday, February 17, 2009 3:12 AM
To: fop-dev@xmlgraphics.apache.org
Subject: Issues for after the IF branch merge

Follow-up issues for after the merge where I'd be glad for feedback:

- The new implementations currently all use a MIME suffix ";mode=painter".
They are now sufficiently tested that I'm confident that we can switch over
to them by default. One idea would be to put a switch in RendererFactory so
we can say: prefer IFDocumentHandler instead of Renderer implementation. Or
rather the opposite: by default use the IFDocumentHandler but allow to
switch to the old mode should there be an unexpected incompatibility. Good
idea or bad?

- Given the performance figures I think it would be possible to deprecate
the following implementations: PDF, PS, PCL and AFP. As for the Java2D
implementations: the PNG, Print and AWT Preview parts are not implemented,
yet, so a deprecation here is premature. TXT is probably good enough like it
is. No change necessary there. After all, we won't deprecate the Renderer
interface itself. Good idea or bad?

- A note primarily for Max: For JEuclid to work with the new
implementations, it will require an ImageConverter implementation (from XML
Graphics Commons' image loading framework). I haven't investigated how hard
it would be to provide a compatibility layer so the old implementations
could be used. The old stuff is in many parts tied into the Renderer
architecture. For Barcode4J, I've written such an implementation already. I
can also try to find time to do it for you, if you like. The good thing is
that the ImageConverter implementation is not FOP-specific but can also be
used by anyone using the image loading framework. Actually, that part might
influence the decision for the first and second points above!


Jeremias Maerki



Re: Issues for after the IF branch merge

Posted by Chris Bowditch <bo...@hotmail.com>.
The Web Maestro wrote:

> On Mon, Feb 23, 2009 at 5:52 AM, Jeremias Maerki <de...@jeremias-maerki.ch> wrote:
> 
>>I'm a little hesitant to just go ahead here without more than one
>>opinion. I could interpret the silence (and the +1 votes on the merge) as
>>lazy assent but I'd be more comfortable with a bit more nodding. Thanks.
>>
>>Jeremias Maerki

I'm not keen to deprecate so soon after the introduction of the new IF 
Painters. We still have a large customer base working with the Renderers 
and I need to support them for the next 2-3 years. I realise that you're 
not removing them yet, so I won't veto the proposal. OTOH I realise that 
the project doesn't want to encourage new users of the old Renderers, so
my vote is +0.

Thanks,

Chris



Re: Issues for after the IF branch merge

Posted by The Web Maestro <th...@gmail.com>.
On Mon, Feb 23, 2009 at 5:52 AM, Jeremias Maerki <de...@jeremias-maerki.ch> wrote:
> I'm a little hesitant to just go ahead here without more than one
> opinion. I could interpret the silence (and the +1 votes on the merge) as
> lazy assent but I'd be more comfortable with a bit more nodding. Thanks.
>
> Jeremias Maerki

This sounds like a move in a positive direction. +1 from me.

Regards,

The Web Maestro
-- 
<th...@gmail.com> - <http://ourlil.com/>
My religion is simple. My religion is kindness.
- HH The 14th Dalai Lama of Tibet

Re: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 16.02.2009 17:11:33 Jeremias Maerki wrote:
> Follow-up issues for after the merge where I'd be glad for feedback:
> 
> - The new implementations currently all use a MIME suffix ";mode=painter".
> They are now sufficiently tested that I'm confident that we can switch
> over to them by default. One idea would be to put a switch in
> RendererFactory so we can say: prefer IFDocumentHandler instead of
> Renderer implementation. Or rather the opposite: by default use the
> IFDocumentHandler but allow to switch to the old mode should there be an
> unexpected incompatibility. Good idea or bad?

http://svn.apache.org/viewvc?rev=747010&view=rev
http://svn.apache.org/viewvc?rev=747015&view=rev

I've done this now, but in the more careful variant. People can switch
back to using Renderer if they need to do so.

fopFactory.getRendererFactory().setRendererPreferred(true);
or
<prefer-renderer>true</prefer-renderer>

However, if this is necessary for anyone I'd like to hear about it. The
only reason I can currently imagine is if someone uses a FOP extension
such as Barcode4J or JEuclid. The necessary new extension is already
available from Barcode4J's CVS HEAD. Max, if you want help or have
trouble updating JEuclid, please let me know.

> - Given the performance figures I think it would be possible to
> deprecate the following implementations: PDF, PS, PCL and AFP. As for
> the Java2D implementations: the PNG, Print and AWT Preview parts are not
> implemented, yet, so a deprecation here is premature. TXT is probably
> good enough like it is. No change necessary there. After all, we won't
> deprecate the Renderer interface itself. Good idea or bad?

I'm a little hesitant to just go ahead here without more than one
opinion. I could interpret the silence (and the +1 votes on the merge) as
lazy assent but I'd be more comfortable with a bit more nodding. Thanks.

<snip/>

Jeremias Maerki


Re: Issues for after the IF branch merge

Posted by Vincent Hennebert <vh...@gmail.com>.
Hi Jeremias,

Jeremias Maerki wrote:
> Hi Vincent
> 
> On 17.02.2009 11:22:05 Vincent Hennebert wrote:
>> Hi Jeremias,
>>
<snip/>
>>> - Given the performance figures I think it would be possible to
>>> deprecate the following implementations: PDF, PS, PCL and AFP. As for
>>> the Java2D implementations: the PNG, Print and AWT Preview parts are not
>>> implemented, yet, so a deprecation here is premature. TXT is probably
>>> good enough like it is. No change necessary there. After all, we won't
>>> deprecate the Renderer interface itself. Good idea or bad?
>> Well, I think it’s perfect time for reconsidering which outputs we
>> really want to support. For me, that should be outputs for which there
>> is commercial support,
> 
> I do commercial support for all of them if I get asked. ;-) Anyway, I
> don't think it's in the ASF spirit to make that dependent on the
> availability of commercial support.

Right. My wording was a bit unfortunate, but I basically agree with what
you say below. In short: as long as that works, fine. As soon as that
starts creating maintenance issues, let’s consider getting rid of it.


>> or for which there is a large user base, or an
> 
> Only measurable by user survey. And I'm not sure that would be
> representative. But then, who doesn't participate doesn't get to have a
> say.

Exactly. That’s more what I was meaning actually. It’s also a matter of
communication. We can clearly state on the website: “Due to our limited
resources we are sorry to inform you that we have to drop support for
this and this...” Open-source, everyone free to participate, etc.


>> active committer willing to maintain them. Which means: do we care about
>> PNG and TXT?
> 
> Having PNG is a no-brainer if TIFF is already around. Just a few lines
> of code. I'm certain there are people who can use that. It just wasn't
> high on my priority list which is why I put that off. Give me a rainy
> Sunday and it's back in.
> 
> As for TXT, I got the impression that there are still people who use
> that. Not many, granted. And also nobody complained when some special
> features were cut out a while back (did anyone other than me notice?).
> Basically, as long as no changes in the renderer layer are necessary I
> see no reason to just drop that support. The TXTRenderer did get some
> attention in the past year.
> 
>> Both can easily be obtained from PDF.
> 
> With software under a similar license as Apache FOP? I don't think so.

That’s where I can clarify a bit: I believe a distinction can be made here
between corporate users and others. Corporate users shipping FOP with
their commercial product can invest time/money in the maintenance of
those renderers that they need. I tend to think that licenses are not
really a problem for individual users (I may be wrong, though). The example
I have in mind: if for some reason I want to put a text output of some
sample document on the wiki, I’m happy to use pdftotext (Poppler, GPLv2)
instead of FOP.


> PDFBox is getting there but right now I wouldn't just position it as a
> replacement. Furthermore, a two-step process is always slower than a
> direct approach.
> 
>> Let’s make a poll on
>> fop-users, and abandon those that don’t attract enough interest. Those
>> users whose business critically depends on a particular output can
> 
> I don't think we should just focus on the business side. There are other
> users, too.
> 
>> always make sure that it gets moved into the ‘commercially supported’
>> category. (Same question about Print and AWT, although I believe they
>> fall into one of the first two categories.)
> 
> Print and AWT should be relatively easy to port. Maybe another rainy
> Sunday... Until then, the existing code works just fine.

I’m ok with that but the concern I have is: what do we do when it stops
working just fine? What if some changes are made to the area tree that
impact the renderers? I’m afraid of the maintenance burden in the
medium/long term. But that’s probably a question for later.


>> Why wouldn’t we deprecate the Renderer interface? Does it have any
>> feature that the new interface can’t provide? Let’s deprecate it, and
>> remove it after ‘enough time’ has passed.
> 
> That's not so easy. The Renderer layer contains functionality that the
> IF layer doesn't have: it converts the relative placement of areas to
> absolute marks. You could of course move that into the layout managers
> but that would change (or make obsolete) the area tree. Good on one side,
> but on the other side, that's a lot of work and our entire test suite
> depends on that. Imagine updating the checks for 500 tests.

Hmmm... no ;-)
Good point indeed. I forgot about the layoutengine tests.
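
For readers unfamiliar with what "converting the relative placement of areas to absolute marks" involves, here is a minimal sketch. The Area and Flattener classes are hypothetical and far simpler than FOP's area tree: each area stores an offset relative to its parent, and a depth-first walk accumulates the offsets into absolute positions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical area node: dx/dy are offsets relative to the parent area.
class Area {
    final String name;
    final int dx;
    final int dy;
    final List<Area> children = new ArrayList<>();

    Area(String name, int dx, int dy) {
        this.name = name;
        this.dx = dx;
        this.dy = dy;
    }

    Area add(Area child) {
        children.add(child);
        return this;
    }
}

class Flattener {
    // Walks the tree and emits one absolute mark per area.
    static List<String> toAbsolute(Area root) {
        List<String> marks = new ArrayList<>();
        walk(root, 0, 0, marks);
        return marks;
    }

    private static void walk(Area area, int originX, int originY, List<String> out) {
        int x = originX + area.dx;
        int y = originY + area.dy;
        out.add(area.name + "@" + x + "," + y);
        for (Area child : area.children) {
            // Children are positioned relative to this area's absolute origin.
            walk(child, x, y, out);
        }
    }
}

class FlattenDemo {
    public static void main(String[] args) {
        Area page = new Area("page", 0, 0)
            .add(new Area("block", 50, 100)
                .add(new Area("line", 0, 12)
                    .add(new Area("word", 30, 0))));
        System.out.println(Flattener.toAbsolute(page));
    }
}
```

A format that only carries absolute marks can be painted without this walk, but it no longer records the nesting, which is one way to see why the Area Tree XML holds more information.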


> It also needs to be noted, that the new IF contains less information 
> than the Area Tree XML. For layout engine tests, the Area Tree XML is 
> still more attractive (although you can now write tests for both). 
> People doing two-pass processing might still prefer the area tree XML 
> in certain cases (I don't know).

Let me refine my concerns: granted, Area Tree XML is still needed, if
only for layout tests. But do ‘external’ people still need to use it,
now that IF is available? I don’t know. Probably not.
In other words: do we want the Area Tree XML to be usable from outside
the project, and hence maintain features and backwards compatibility? Or can
we keep it internal, in which case we don’t have to worry about cutting
functionality or whatever. If that makes sense at all.


> Just an idea: What about moving out the various output formats
> (especially AFP, since it's so big and used only by a few) to separate
> source trees? A "FOP Core" plus n output plug-ins.

This is a step towards modularization. I’m all for it!


> TXTRenderer could be
> one of them. It would be moved out and if for some reason it lagged
> behind because of any changes in the core, someone really interested
> could bring it back more easily than when we just remove it.

Good, good, good.


> I don't
> really like removing functionality that some people have invested a lot
> of time in. If it gets in the way, something can be done about it but I
> don't think TXTRenderer is in our way. At least I don't have a problem
> with it sticking around. And I see no imminent reasons for changing the
> Renderer layer now that we have the new IF.
> 
> <snip what="JEuclid part"/>
> 
> 
> Jeremias Maerki

Thanks,
Vincent

Re: Issues for after the IF branch merge

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Hi Vincent

On 17.02.2009 11:22:05 Vincent Hennebert wrote:
> Hi Jeremias,
> 
> My general view: we want to get rid of Renderers as soon as possible. We
> don’t have enough resources to maintain two different rendering paths.
> As far as I’m concerned, I don’t have knowledge about the rendering
> stage. But if I ever had to work on that area, and now that
> IFDocumentHandler exists, I would be strongly reluctant to invest any
> time in the Renderer path.

I mostly agree. However, I'd say: I'd like to get rid of the Renderers
for which we have equivalent (if not better) IF implementations.

> As to your more specific questions:
> 
> Jeremias Maerki wrote:
> > Follow-up issues for after the merge where I'd be glad for feedback:
> > 
> > - The new implementations currently all use a MIME suffix ";mode=painter".
> > They are now sufficiently tested that I'm confident that we can switch
> > over to them by default. One idea would be to put a switch in
> > RendererFactory so we can say: prefer IFDocumentHandler instead of
> > Renderer implementation. Or rather the opposite: by default use the
> > IFDocumentHandler but allow switching to the old mode should there be an
> > unexpected incompatibility. Good idea or bad?
> 
> Is it realistic to do it this way: no switch, we directly use
> IFDocumentHandler, and any rendering difference is a bug that shall be
> corrected sooner rather than later? Anyway, we will first release a beta
> version for that very purpose, IIRC.

I believe that's a possible and realistic way. Not a bad one either.
Just a bit more daring than my proposal. Maybe I'm too cautious.
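
To make the second variant concrete (IFDocumentHandler by default, with
an explicit fallback to the old Renderer), here is a minimal sketch of
the selection logic. All names in it (OutputPathSelector, Path,
setPreferLegacyRenderer and so on) are hypothetical and are not the
actual RendererFactory API; it only illustrates the decision rule.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not FOP's RendererFactory: chooses between the new
// IF-based path and the old Renderer path for a given MIME type.
final class OutputPathSelector {

    enum Path { IF_DOCUMENT_HANDLER, LEGACY_RENDERER }

    private final Set<String> ifFormats = new HashSet<>();
    private final Set<String> rendererFormats = new HashSet<>();

    // The proposed switch: off by default, so the IF path wins when both exist.
    private boolean preferLegacyRenderer = false;

    void registerIFFormat(String mime) {
        ifFormats.add(mime);
    }

    void registerRendererFormat(String mime) {
        rendererFormats.add(mime);
    }

    void setPreferLegacyRenderer(boolean prefer) {
        this.preferLegacyRenderer = prefer;
    }

    Path select(String mime) {
        boolean hasIF = ifFormats.contains(mime);
        boolean hasRenderer = rendererFormats.contains(mime);
        if (!hasIF && !hasRenderer) {
            throw new IllegalArgumentException(
                    "No output implementation for " + mime);
        }
        if (hasIF && hasRenderer) {
            // Both paths exist: the switch decides.
            return preferLegacyRenderer
                    ? Path.LEGACY_RENDERER : Path.IF_DOCUMENT_HANDLER;
        }
        // Only one path exists (e.g. TXT has no IF implementation).
        return hasIF ? Path.IF_DOCUMENT_HANDLER : Path.LEGACY_RENDERER;
    }
}
```

With such a rule, formats that only have a Renderer (TXT, for example)
keep working unchanged, and the switch only matters for formats that
have both implementations.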

> If that really is unfeasible, then +1 to your latter proposal:
> IFDocumentHandler by default, with a possibility to fall back to the old
> Renderer if really necessary.
> 
> 
> > - Given the performance figures I think it would be possible to
> > deprecate the following implementations: PDF, PS, PCL and AFP. As for
> > the Java2D implementations: the PNG, Print and AWT Preview parts are not
> > implemented, yet, so a deprecation here is premature. TXT is probably
> > good enough like it is. No change necessary there. After all, we won't
> > deprecate the Renderer interface itself. Good idea or bad?
> 
> Well, I think it’s the perfect time to reconsider which outputs we
> really want to support. For me, that should be outputs for which there
> is commercial support,

I do commercial support for all of them if I get asked. ;-) Anyway, I
don't think it's in the ASF spirit to make that dependent on the
availability of commercial support.

> or for which there is a large user base, or an

Only measurable by user survey. And I'm not sure that would be
representative. But then, who doesn't participate doesn't get to have a
say.

> active committer willing to maintain them. Which means: do we care about
> PNG and TXT?

Having PNG is a no-brainer if TIFF is already around. Just a few lines
of code. I'm certain there are people who can use that. It just wasn't
high on my priority list which is why I put that off. Give me a rainy
Sunday and it's back in.

As for TXT, I got the impression that there are still people who use
that. Not many, granted. And also nobody complained when some special
features were cut out a while back (did anyone other than me notice?).
Basically, as long as no changes in the renderer layer are necessary I
see no reason to just drop that support. The TXTRenderer did get some
attention in the past year.

> Both can easily be obtained from PDF.

With software under a similar license as Apache FOP? I don't think so.
PDFBox is getting there but right now I wouldn't just position it as a
replacement. Furthermore, a two-step process is always slower than a
direct approach.

> Let’s make a poll on
> fop-users, and abandon those that don’t attract enough interest. Those
> users whose business critically depends on a particular output can

I don't think we should just focus on the business side. There are other
users, too.

> always make sure that it gets moved into the ‘commercially supported’
> category. (Same question about Print and AWT, although I believe they
> fall into one of the first two categories.)

Print and AWT should be relatively easy to port. Maybe another rainy
Sunday... Until then, the existing code works just fine.

> Why wouldn’t we deprecate the Renderer interface? Does it have any
> feature that the new interface can’t provide? Let’s deprecate it, and
> remove it after ‘enough time’ has passed.

That's not so easy. The Renderer layer contains functionality that the
IF layer doesn't have: it converts the relative placement of areas to
absolute marks. You could of course move that into the layout managers
but that would change (or make obsolete) the area tree. Good on one side,
but on the other side, that's a lot of work and our entire test suite
depends on that. Imagine updating the checks for 500 tests. It should
also be noted that the new IF contains less information than the
Area Tree XML. For layout engine tests, the Area Tree XML is still more
attractive (although you can now write tests for both). People doing
two-pass processing might still prefer the area tree XML in certain
cases (I don't know).
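
To illustrate what "converting the relative placement of areas to
absolute marks" means, here is a minimal sketch. The Area class and the
string-based marks are hypothetical simplifications and have nothing to
do with the real area tree classes; the point is only that each area
stores an offset relative to its parent and that a tree walk has to
accumulate those offsets before anything can be painted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified area: position is relative to the parent area.
final class Area {
    final String name;
    final int dx;
    final int dy;
    final List<Area> children = new ArrayList<>();

    Area(String name, int dx, int dy) {
        this.name = name;
        this.dx = dx;
        this.dy = dy;
    }

    Area add(Area child) {
        children.add(child);
        return this;
    }
}

// Walks the tree and produces one absolutely positioned mark per area.
final class AbsoluteMarkCollector {

    static List<String> collect(Area root) {
        List<String> marks = new ArrayList<>();
        walk(root, 0, 0, marks);
        return marks;
    }

    private static void walk(Area area, int originX, int originY,
                             List<String> out) {
        // The absolute position is the parent's origin plus the own offset.
        int x = originX + area.dx;
        int y = originY + area.dy;
        out.add(area.name + "@" + x + "," + y);
        for (Area child : area.children) {
            walk(child, x, y, out);
        }
    }
}
```

In this picture, the output side only ever sees the resulting list of
absolute marks, while the nested, relatively positioned tree is what
the layout engine tests check against.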

Just an idea: What about moving out the various output formats
(especially AFP, since it's so big and used only by a few) to separate
source trees? A "FOP Core" plus n output plug-ins. TXTRenderer could be
one of them. It would be moved out and if for some reason it lagged
behind because of any changes in the core, someone really interested
could bring it back more easily than when we just remove it. I don't
really like removing functionality that some people have invested a lot
of time in. If it gets in the way, something can be done about it but I
don't think TXTRenderer is in our way. At least I don't have a problem
with it sticking around. And I see no imminent reasons for changing the
Renderer layer now that we have the new IF.

<snip what="JEuclid part"/>


Jeremias Maerki


Re: Issues for after the IF branch merge

Posted by Vincent Hennebert <vh...@gmail.com>.
Hi Jeremias,

My general view: we want to get rid of Renderers as soon as possible. We
don’t have enough resources to maintain two different rendering paths.
As far as I’m concerned, I don’t have knowledge about the rendering
stage. But if I ever had to work on that area, and now that
IFDocumentHandler exists, I would be strongly reluctant to invest any
time in the Renderer path.

As to your more specific questions:

Jeremias Maerki wrote:
> Follow-up issues for after the merge where I'd be glad for feedback:
> 
> - The new implementations currently all use a MIME suffix ";mode=painter".
> They are now sufficiently tested that I'm confident that we can switch
> over to them by default. One idea would be to put a switch in
> RendererFactory so we can say: prefer IFDocumentHandler instead of
> Renderer implementation. Or rather the opposite: by default use the
> IFDocumentHandler but allow switching to the old mode should there be an
> unexpected incompatibility. Good idea or bad?

Is it realistic to do it this way: no switch, we directly use
IFDocumentHandler, and any rendering difference is a bug that shall be
corrected sooner rather than later? Anyway, we will first release a beta
version for that very purpose, IIRC.

If that really is unfeasible, then +1 to your latter proposal:
IFDocumentHandler by default, with a possibility to fall back to the old
Renderer if really necessary.


> - Given the performance figures I think it would be possible to
> deprecate the following implementations: PDF, PS, PCL and AFP. As for
> the Java2D implementations: the PNG, Print and AWT Preview parts are not
> implemented, yet, so a deprecation here is premature. TXT is probably
> good enough like it is. No change necessary there. After all, we won't
> deprecate the Renderer interface itself. Good idea or bad?

Well, I think it’s the perfect time to reconsider which outputs we
really want to support. For me, that should be outputs for which there
is commercial support, or for which there is a large user base, or an
active committer willing to maintain them. Which means: do we care about
PNG and TXT? Both can easily be obtained from PDF. Let’s make a poll on
fop-users, and abandon those that don’t attract enough interest. Those
users whose business critically depends on a particular output can
always make sure that it gets moved into the ‘commercially supported’
category. (Same question about Print and AWT, although I believe they
fall into one of the first two categories.)

Why wouldn’t we deprecate the Renderer interface? Does it have any
feature that the new interface can’t provide? Let’s deprecate it, and
remove it after ‘enough time’ has passed.


> - A note primarily for Max: For JEuclid to work with the new
> implementations, it will require an ImageConverter implementation (from
> XML Graphics Commons' image loading framework). I haven't investigated
> how hard it would be to provide a compatibility layer so the old
> implementations could be used. The old stuff is in many parts tied into
> the Renderer architecture. For Barcode4J, I've written such an
> implementation already. I can also try to find time to do it for you, if
> you like. The good thing is that the ImageConverter implementation is
> not FOP-specific but can also be used by anyone using the image loading
> framework. Actually, that part might influence the decision for the
> first and second points above!
> 
> 
> Jeremias Maerki

Vincent