Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@greenmail.ch> on 2005/05/23 09:35:48 UTC

Markers: Determining the last generated area for a LM

As you may have seen I've been working through the layoutengine
testcases to fix various failures/bugs last week. One of the last
problems that need to be fixed is markers. Markers already work fine
under the new page breaking mechanism when an FO is not broken over the
page/column boundaries.

The problem is getting the two last booleans on getCurrentPV().addMarkers()
right. Currently the calls are hardcoded to:
getCurrentPV().addMarkers(markers, true, true, false);
and
getCurrentPV().addMarkers(markers, false, false, true);

The isfirst and islast parameters must be set correctly. Currently, I
don't see a reliable way to determine these values. For example, there's
some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
the layout context but I found this doesn't work reliably. I've
experimented with two other approaches both of which were not good
enough. One (flags on Position instances) failed because the first n
elements at the beginning of the element list may be removed which also
removed the marker for the first element in the list. The other
(counting Position instances) failed because the element list may be
modified after the initial generation thus throwing off counters. I
discarded this mainly because I didn't want to make the code more
complicated just to get the indices right again.

The only thing that sounds worth pursuing right now is to do
look-behind and look-ahead in the Position iterator, which is in a way
extending the approach that is currently visible in AreaAdditionUtils.
This approach checks whether the current LM changes or not.
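
A minimal sketch of that look-behind/look-ahead check, assuming each
Position can report which LM produced it and that an LM's Positions sit
next to each other in the list. OwnedPosition, BoundaryLookup and the
String owner field are hypothetical stand-ins for illustration, not FOP
classes:

```java
import java.util.List;

/** Hypothetical stand-in for a Position: it only remembers which LM produced it. */
final class OwnedPosition {
    final String ownerLM;

    OwnedPosition(String ownerLM) {
        this.ownerLM = ownerLM;
    }
}

/** Hypothetical helper: compares a Position's owner with its direct neighbours. */
final class BoundaryLookup {
    private BoundaryLookup() { }

    /** Look-behind: true if the Position just before 'start' has a different owner. */
    static boolean isFirst(List<OwnedPosition> positions, int start) {
        return start == 0
                || !positions.get(start - 1).ownerLM.equals(positions.get(start).ownerLM);
    }

    /** Look-ahead: true if the Position just after 'end' has a different owner. */
    static boolean isLast(List<OwnedPosition> positions, int end) {
        return end == positions.size() - 1
                || !positions.get(end + 1).ownerLM.equals(positions.get(end).ownerLM);
    }
}
```

With owners [A, B, B, B, C] and a page break after index 2, the first
part of B (indices 1..2) would get isFirst=true and isLast=false, and
the remainder (index 3) would get isFirst=false and isLast=true.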

Maybe someone has another idea on how to approach this problem. I'll let
it rest for a moment until I've made keeps and breaks work on tables.

Jeremias Maerki


Re: Markers: Determining the last generated area for a LM

Posted by Glen Mazza <gm...@apache.org>.
Also, one more point--I think it may be a good idea for us to abstract 
out AreaTreeModel from PSLM and encapsulate it back into AreaTreeHandler 
(i.e. RootLayoutManager), including moving resolveRetrieveMarker() 
there.  IIRC I was the guilty party who moved ATM into PSLM to begin 
with, quite erroneously thinking that ATH might be proven superfluous 
over time, and so trying to make direct ATM<-->PSLM linkages.  ATH is 
here to stay, though, and resolveRetrieveMarker() is something that 
cycles through the results of several PSLM instances so it seems more 
natural/intuitive to have it in the higher, root-level processing class 
here.  Thoughts?

Thanks,
Glen


Glen Mazza wrote:

> Jeremias, I think we do something like this for ID's already -- I 
> wonder if we can use a similar approach here.
>
> We already have a PSLM.getFirstPVWithID() method, which due to the 
> (Map/List) data structure that contains this information in 
> AreaTreeHandler, can probably be easily converted to a 
> PSLM.getLastPVWithID().  Note that with this method, when we add PV's 
> having a given ID, we don't bother needing to send "is first" or "is 
> last" indications, that is easily determinable by the List when it is 
> complete for that property ID.
>
> Can we do a similar thing for markers?  I.e., feed a data structure 
> without needing to give first/last indications, and rely on the state 
> of that structure to subsequently find out what is first/last?
> Thanks,
> Glen
>
>
> Jeremias Maerki wrote:
>
>> As you may have seen I've been working through the layoutengine
>> testcases to fix various failures/bugs last week. One of the last
>> problems that need to be fixed is markers. Markers already work fine
>> under the new page breaking mechanism when an FO is not broken over the
>> page/column boundaries.
>>
>> The problem is getting the two last booleans on 
>> getCurrentPV().addMarkers()
>> right. Currently the calls are hardcoded to:
>> getCurrentPV().addMarkers(markers, true, true, false);
>> and
>> getCurrentPV().addMarkers(markers, false, false, true);
>>
>> The isfirst and islast parameters must be set correctly. Currently, I
>> don't see a reliable way to determine these values. For example, there's
>> some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
>> the layout context but I found this doesn't work reliably. I've
>> experimented with two other approaches both of which were not good
>> enough. One (flags on Position instances) failed because the first n
>> elements at the beginning of the element list may be removed which also
>> removed the marker for the first element in the list. The other
>> (counting Position instances) failed because the element list may be
>> modified after the initial generation thus throwing off counters. I
>> discarded this mainly because I didn't want to make the code more
>> complicated just to get the indices right again.
>>
>> The only thing that sounds like worth pursuing right now is to do
>> look-behind and look-ahead in the Position iterator, which is in a way
>> extending the approach that is currently visible in AreaAdditionUtils.
>> This approach checks whether the current LM changes or not.
>>
>> Maybe someone has another idea on how to approach this problem. I'll let
>> it rest for a moment until I've made keeps and breaks work on tables.
>>
>> Jeremias Maerki
>>
>>
>>  
>>
>
>


Re: Markers: Determining the last generated area for a LM

Posted by Jeremias Maerki <de...@greenmail.ch>.
Sadly, that won't work. You'd have to make FOP a two-pass system to use
that approach, where side regions are laid out in the second pass. With
your idea getLastPVWithID() will only result in a correct value after an
FO is fully distributed to PageViewports. That would, for example, kill
the ability to do out-of-line rendering of pages that can immediately be
fully resolved. The approach I'm currently working on takes very little
additional processing power and just a little bit more memory per
Position instance. Only for special cases additional processing is
needed. I'm currently trying to get markers on table-body for which
there is no separate LM anymore. Cases like this make the whole thing a
little more complicated but there's room for optimization, i.e. the
additional processing can be skipped if a table-body has no markers
(which is probably a common case anyway).

On 28.05.2005 07:13:20 Glen Mazza wrote:
> Jeremias, I think we do something like this for ID's already -- I wonder 
> if we can use a similar approach here.
> 
> We already have a PSLM.getFirstPVWithID() method, which due to the 
> (Map/List) data structure that contains this information in 
> AreaTreeHandler, can probably be easily converted to a 
> PSLM.getLastPVWithID().  Note that with this method, when we add PV's 
> having a given ID, we don't bother needing to send "is first" or "is 
> last" indications, that is easily determinable by the List when it is 
> complete for that property ID.
> 
> Can we do a similar thing for markers?  I.e., feed a data structure 
> without needing to give first/last indications, and rely on the state of 
> that structure to subsequently find out what is first/last? 
> 
> Thanks,
> Glen
> 
> 
> Jeremias Maerki wrote:
> 
> >As you may have seen I've been working through the layoutengine
> >testcases to fix various failures/bugs last week. One of the last
> >problems that need to be fixed is markers. Markers already work fine
> >under the new page breaking mechanism when an FO is not broken over the
> >page/column boundaries.
> >
> >The problem is getting the two last booleans on getCurrentPV().addMarkers()
> >right. Currently the calls are hardcoded to:
> >getCurrentPV().addMarkers(markers, true, true, false);
> >and
> >getCurrentPV().addMarkers(markers, false, false, true);
> >
> >The isfirst and islast parameters must be set correctly. Currently, I
> >don't see a reliable way to determine these values. For example, there's
> >some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
> >the layout context but I found this doesn't work reliably. I've
> >experimented with two other approaches both of which were not good
> >enough. One (flags on Position instances) failed because the first n
> >elements at the beginning of the element list may be removed which also
> >removed the marker for the first element in the list. The other
> >(counting Position instances) failed because the element list may be
> >modified after the initial generation thus throwing off counters. I
> >discarded this mainly because I didn't want to make the code more
> >complicated just to get the indices right again.
> >
> >The only thing that sounds like worth pursuing right now is to do
> >look-behind and look-ahead in the Position iterator, which is in a way
> >extending the approach that is currently visible in AreaAdditionUtils.
> >This approach checks whether the current LM changes or not.
> >
> >Maybe someone has another idea on how to approach this problem. I'll let
> >it rest for a moment until I've made keeps and breaks work on tables.
> >
> >Jeremias Maerki
> >
> >
> >  
> >



Jeremias Maerki


Re: Markers: Determining the last generated area for a LM

Posted by Glen Mazza <gm...@apache.org>.
Jeremias, I think we do something like this for ID's already -- I wonder 
if we can use a similar approach here.

We already have a PSLM.getFirstPVWithID() method, which due to the 
(Map/List) data structure that contains this information in 
AreaTreeHandler, can probably be easily converted to a 
PSLM.getLastPVWithID().  Note that with this method, when we add PV's 
having a given ID, we don't need to send "is first" or "is last" 
indications; that is easily determined from the List once it is 
complete for that property ID.

Can we do a similar thing for markers?  I.e., feed a data structure 
without needing to give first/last indications, and rely on the state of 
that structure to subsequently find out what is first/last? 
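
A rough sketch of such a structure, under the assumption that pages can
be identified by a plain page number. PageRegistry and its methods are
hypothetical names for illustration; they are not the existing
PSLM/AreaTreeHandler code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry: records, in layout order, the pages on which a key appeared. */
final class PageRegistry {
    private final Map<String, List<Integer>> pagesByKey = new HashMap<>();

    /** Called once per page that carries the key; no first/last flags are passed. */
    void add(String key, int pageNumber) {
        pagesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(pageNumber);
    }

    /** First page registered for the key, or -1 if the key is unknown. */
    int firstPage(String key) {
        List<Integer> pages = pagesByKey.get(key);
        return pages == null ? -1 : pages.get(0);
    }

    /** Last page registered so far for the key, or -1 if the key is unknown. */
    int lastPage(String key) {
        List<Integer> pages = pagesByKey.get(key);
        return pages == null ? -1 : pages.get(pages.size() - 1);
    }
}
```

One property of this design: lastPage() reports the last page seen so
far, so its answer is only settled once every page for the key has been
added.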

Thanks,
Glen


Jeremias Maerki wrote:

>As you may have seen I've been working through the layoutengine
>testcases to fix various failures/bugs last week. One of the last
>problems that need to be fixed is markers. Markers already work fine
>under the new page breaking mechanism when an FO is not broken over the
>page/column boundaries.
>
>The problem is getting the two last booleans on getCurrentPV().addMarkers()
>right. Currently the calls are hardcoded to:
>getCurrentPV().addMarkers(markers, true, true, false);
>and
>getCurrentPV().addMarkers(markers, false, false, true);
>
>The isfirst and islast parameters must be set correctly. Currently, I
>don't see a reliable way to determine these values. For example, there's
>some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
>the layout context but I found this doesn't work reliably. I've
>experimented with two other approaches both of which were not good
>enough. One (flags on Position instances) failed because the first n
>elements at the beginning of the element list may be removed which also
>removed the marker for the first element in the list. The other
>(counting Position instances) failed because the element list may be
>modified after the initial generation thus throwing off counters. I
>discarded this mainly because I didn't want to make the code more
>complicated just to get the indices right again.
>
>The only thing that sounds like worth pursuing right now is to do
>look-behind and look-ahead in the Position iterator, which is in a way
>extending the approach that is currently visible in AreaAdditionUtils.
>This approach checks whether the current LM changes or not.
>
>Maybe someone has another idea on how to approach this problem. I'll let
>it rest for a moment until I've made keeps and breaks work on tables.
>
>Jeremias Maerki
>
>
>  
>


Re: Markers: Determining the last generated area for a LM

Posted by Jeremias Maerki <de...@greenmail.ch>.
After a lot of thinking and experimenting I finally resolved to take up
the idea below again. When I started distinguishing between Positions
that indirectly generate area and those that do not, I was suddenly able
to create a relatively easy and (hopefully) stable mechanism to
determine the first and last areas of a LayoutManager. It already works
on my machine for flow, block and block-container (markers6b passes).
Now I'm trying to add marker support for tables which is a bit special
since we don't have the rigid hierarchy of LMs like before. But I'm
pretty sure this is also doable without too much effort.

There's a downside with all this. There was the idea earlier of not
nesting Positions anymore, but with the above approach I need an integer
member variable on Position. That means we'll have to stick with the
nesting if no one comes up with a better idea. The LM only needs two
integers, one determining the first index ever passed through to an
addArea() method and an integer that has the double function of serving
as a running counter of Positions (seeds the Position.setIndex(int)) and
of helping determine if a Position is the last. At least, this way the
nested Positions have more of a reason to exist and take up memory.
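
The two-integer bookkeeping could look roughly like the sketch below. It
assumes that only area-generating Positions are numbered and that the
element list is final by the time areas are added. IndexedPosition,
AreaBoundaryTracker and their method names are hypothetical and only
illustrate the idea; this is not the code I intend to commit:

```java
/** Hypothetical Position stand-in carrying only the integer index. */
final class IndexedPosition {
    private int index = -1;

    void setIndex(int index) {
        this.index = index;
    }

    int getIndex() {
        return index;
    }
}

/** Hypothetical per-LM bookkeeping that uses just two integers. */
final class AreaBoundaryTracker {
    /** Running counter; also the index of the last Position generated. */
    private int lastGeneratedIndex = -1;
    /** First index ever passed to isFirst() during area addition; -1 = none yet. */
    private int firstIndexPassed = -1;

    /** Creates a Position and seeds its index from the running counter. */
    IndexedPosition newPosition() {
        IndexedPosition pos = new IndexedPosition();
        pos.setIndex(++lastGeneratedIndex);
        return pos;
    }

    /** True only for the Position whose index was the first one ever checked. */
    boolean isFirst(IndexedPosition pos) {
        if (firstIndexPassed < 0) {
            firstIndexPassed = pos.getIndex();
        }
        return pos.getIndex() == firstIndexPassed;
    }

    /** True only for the Position that was generated last. */
    boolean isLast(IndexedPosition pos) {
        return pos.getIndex() == lastGeneratedIndex;
    }
}
```

If six Positions are generated and the first two are removed from the
element list, the Position with index 2 is still reported as first,
because it is the first one that actually reaches area addition.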

I think this approach should be pretty stable against the
getChangedKnuthElements() stage, though I could be wrong. This stage is
a topic I'm still not 100% familiar with, yet.

I'll wait a bit before I commit, so you'll have a chance to veto if
anyone sees a serious problem with this. After all, I still have to deal
with markers on tables first.

On 23.05.2005 09:35:48 Jeremias Maerki wrote:
> The other
> (counting Position instances) failed because the element list may be
> modified after the initial generation thus throwing off counters. I
> discarded this mainly because I didn't want to make the code more
> complicated just to get the indices right again.


Jeremias Maerki


Re: FAQ'ish questions (was: Re: XHTML 2 PDF)

Posted by Jeremias Maerki <de...@greenmail.ch>.
Actually, it's still accurate, although that will change as soon as the
migration to SVN is done. ...which reminds me.... :-)

On 24.05.2005 01:51:21 Victor Mote wrote:
> J.Pietschmann wrote:
> 
> > - What's the current publishing process? Wasn't there a Wiki
> >    page about this?
> 
> It is on the web site, under the "Development" tab, Deploy/Doc Mgmt menu:
> http://xml.apache.org/fop/dev/doc.html
> It may have started as a Wiki -- I don't remember. I don't know whether it
> is up-to-date or not.
> 
> Victor Mote



Jeremias Maerki


Re: FAQ'ish questions (was: Re: XHTML 2 PDF)

Posted by The Web Maestro <th...@gmail.com>.
On May 23, 2005, at 3:20 PM, J.Pietschmann wrote:
> Jeremias Maerki wrote:
>> Look in the archives:
>> http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b
>
> This is becoming FAQ material.
> To my great surprise, the various (x)html2fo tools are neither in
> the FAQ nor in the additional resources list.

I'll look into adding something.

> Related questions:
> - Where do I edit the FAQ's xdoc source: HEAD or maintenance branch?

xml-fop/src/documentation/content/xdocs/faq.xml

the HEAD branch is what is used to build the site.

> - Why is the FAQ TOC gone? This makes it difficult to use direct links
>   to individual FAQ entries in mails. Should I open a forrest
>   requirement for a TOC per section, preferably in a customizable way?

Forrest has the ability to generate one... but there's some problem on 
the FOP site. All other pages generate one except the FAQ page. Our 
page has exactly the same structure as the Forrest FAQ, as does our 
skinconf.xml. Unfortunately, it works for them, but not us.

I asked on the forrest user list, and received an answer which 
unfortunately didn't help much. I'd since forgotten about it. I'll see 
if I can come up with a solution.

> - What's the current publishing process? Wasn't there a Wiki
>   page about this?
> - What about moving the FAQ to the Wiki, or establishing a supplement
>   FAQ in the Wiki? (same for "additional resources")

That'd be a good idea. That would certainly make it easier for 
fop-committers to edit those pages. It could still live in our sidebar 
as well.

> Bonus points: Is there anybody out there willing to work on canonical
> non-rude FAQ answers? (See
>  http://www.joelonsoftware.com/articles/FogBugzII.html
> section "snippets")
> I've already installed Thunderbird QuickText (great stuff...).
>
> J.Pietschmann

That sounds like a great tool and suggestion...

Regards,

Web Maestro Clay
-- 
<th...@gmail.com> - <http://homepage.mac.com/webmaestro/>
My religion is simple. My religion is kindness.
- HH The 14th Dalai Lama of Tibet


RE: FAQ'ish questions (was: Re: XHTML 2 PDF)

Posted by Victor Mote <vi...@outfitr.com>.
J.Pietschmann wrote:

> - What's the current publishing process? Wasn't there a Wiki
>    page about this?

It is on the web site, under the "Development" tab, Deploy/Doc Mgmt menu:
http://xml.apache.org/fop/dev/doc.html
It may have started as a Wiki -- I don't remember. I don't know whether it
is up-to-date or not.

Victor Mote


FAQ'ish questions (was: Re: XHTML 2 PDF)

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Jeremias Maerki wrote:
> Look in the archives:
> http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b

This is becoming FAQ material.
To my great surprise, the various (x)html2fo tools are neither in
the FAQ nor in the additional resources list.

Related questions:
- Where do I edit the FAQ's xdoc source: HEAD or maintenance branch?
- Why is the FAQ TOC gone? This makes it difficult to use direct links
   to individual FAQ entries in mails. Should I open a forrest
   requirement for a TOC per section, preferably in a customizable way?
- What's the current publishing process? Wasn't there a Wiki
   page about this?
- What about moving the FAQ to the Wiki, or establishing a supplement
   FAQ in the Wiki? (same for "additional resources")

Bonus points: Is there anybody out there willing to work on canonical
non-rude FAQ answers? (See
  http://www.joelonsoftware.com/articles/FogBugzII.html
section "snippets")
I've already installed Thunderbird QuickText (great stuff...).

J.Pietschmann

Re: XHTML 2 PDF

Posted by Jeremias Maerki <de...@greenmail.ch>.
Look in the archives:
http://marc.theaimsgroup.com/?l=fop-user&w=2&r=1&s=xhtml+pdf&q=b

And please send questions to fop-users in the future, not fop-dev. Thank
you.

On 23.05.2005 10:00:13 Dirk Bromberg wrote:
> Is there a good / easy way to get a pdf document from an xhtml website?
> 
> xhtml -> xsl -> fo -> fop -> pdf ?


Jeremias Maerki


Re: XHTML 2 PDF

Posted by Dirk Bromberg <br...@tzi.de>.
Ok,

Thanks for quick answers!

Dirk



Manuel Mall wrote:

>Seems to me that this is more of a fop-user question.
>
>Anyway, there are a few stylesheets on the web which do xhtml to fo 
>transformations. Just "Google" for 'xhtml fo stylesheet' and you get a few 
>sensible hits.
>
>For example there is one published by James Tauber 
>(http://blogs.pingpoet.com/overflow/archive/2004/09/03/768.aspx).
>
>Manuel
>
>On Mon, 23 May 2005 04:00 pm, Dirk Bromberg wrote:
>  
>
>>Hi,
>>
>>Is there a good / easy way to get a pdf document from an xhtml website?
>>
>>xhtml -> xsl -> fo -> fop -> pdf ?
>>
>>
>>Thanks
>>
>>Dirk
>>    
>>

Re: XHTML 2 PDF

Posted by Manuel Mall <mm...@arcus.com.au>.
Seems to me that this is more of a fop-user question.

Anyway, there are a few stylesheets on the web which do xhtml to fo 
transformations. Just "Google" for 'xhtml fo stylesheet' and you get a few 
sensible hits.

For example there is one published by James Tauber 
(http://blogs.pingpoet.com/overflow/archive/2004/09/03/768.aspx).

Manuel

On Mon, 23 May 2005 04:00 pm, Dirk Bromberg wrote:
> Hi,
>
> Is there a good / easy way to get a pdf document from an xhtml website?
>
> xhtml -> xsl -> fo -> fop -> pdf ?
>
>
> Thanks
>
> Dirk

Re: XHTML 2 PDF

Posted by Pasi Nummisalo <pa...@davisor.com>.
Hi

If you'd like to use HTML+CSS and tables without fixed widths,
Webisor (www.davisor.com/webisor) might be the tool for you.

Regards,
Pasi

On Mon, 23 May 2005, Dirk Bromberg wrote:

> Hi,
>
> Is there a good / easy way to get a pdf document from an xhtml website?
>
> xhtml -> xsl -> fo -> fop -> pdf ?
>
>
> Thanks
>
> Dirk
>


XHTML 2 PDF

Posted by Dirk Bromberg <br...@tzi.de>.
Hi,

Is there a good / easy way to get a pdf document from an xhtml website?

xhtml -> xsl -> fo -> fop -> pdf ?


Thanks

Dirk






Re: Markers: Determining the last generated area for a LM

Posted by Jeremias Maerki <de...@greenmail.ch>.
On 24.05.2005 18:41:39 Luca Furini wrote:
> Jeremias Maerki wrote:
> 
> > The isfirst and islast parameters must be set correctly. Currently, I
> > don't see a reliable way to determine these values. For example, there's
> > some code in AreaAdditionUtils that sets IS_FIRST and IS_LAST flags on
> > the layout context but I found this doesn't work reliably.
> 
> Did you find out why this does not work? I mean, do you think it is an
> incorrect approach, or there's something wrong somewhere in the code?

The problem is that in a break situation, AreaAdditionUtils is called
more than once, each time signalling a first and last area instead of
signalling a first area once and a last area once over multiple calls.

Hmm, I think I need to check if the LayoutContext instance remains the
same over multiple addArea calls on the same LM. If that's the case, the
problem should be solvable.

> I remember that some time ago I had problems with these flags, and it was
> because of some LM that did not set / propagate the correct values when
> creating LayoutContexts for children LMs.


Jeremias Maerki