Posted to fop-users@xmlgraphics.apache.org by Sebastien <ka...@gmail.com> on 2008/12/01 17:11:49 UTC

Unresolved ID and page-number-citation allocated width

Hi!
I took a look at a few discussions of this issue on the mailing list and
at the current implementation in FOP trunk.
I'm currently generating a PDF with several chapters whose content may contain
links pointing to content in another chapter. These chapters can be
generated separately, so some links remain unresolved, and that's OK.
My problem is that I attach a page-number-citation to each link, like
this:
<fo:basic-link color="blue" internal-destination="myId">
  A link <fo:inline baseline-shift="super" font-size="8pt">
[<fo:page-number-citation ref-id="myId"/>]</fo:inline>
</fo:basic-link>

The behaviour I was expecting was empty brackets [] when the ID
couldn't be resolved. Instead, I got brackets with a large space
between them [        ].
After a quick look at the code, I saw that page-number-citation allocates
space corresponding to the string "MMM" if the ID can't be resolved
immediately.
My understanding of FOP is quite limited, so I can't see how FOP manages
to shrink the previously allocated space when it succeeds in resolving the
ID, and why it leaves this MMM-space when it fails to resolve the ID.

So I have two questions:
 1) Is there a way to squeeze that space in the case of an unresolved ID? I
don't want to hack the code and replace the "MMM" string with a "|" string,
for instance; it's ugly and it will surely lead to severe mistakes in the
layout. Is it difficult to implement such a behaviour? I'm guessing that if
this hasn't been implemented yet, it must be because of some tricky
implications and/or issues?
 2) I can live without any page-number-citation at all in the case of an
unresolved ID, but then I need to know when my ID can be resolved and when
it can't. That leads to some testing in XSLT (I guess), and I don't think an
XSLT processor can be aware of such things... But I may be wrong and perhaps
this cool feature exists?

Thanks again for your help.

  Seb

Re: Unresolved ID and page-number-citation allocated width

Posted by Andreas Delmelle <an...@telenet.be>.

On 01 Dec 2008, at 21:39, Andreas Delmelle wrote:

> <snip /> there's no way to guarantee that it will not turn up  
> further in the stream, I'm afraid... unless by having looked at the  
> entire source-document (which, strictly speaking, the XSLT processor  
> has,

I realize this should have been 'possibly has'. In full streaming  
mode, it is possible for the XSLT processor to send FOP a  
startElement("root") long before it has processed the full XML source.

The point is that you could build an xsl:key map to search for the  
values of elements/attributes in the entire document. In doing so, you  
will force the processor to look at the whole document before FOP  
receives the first startElement() event, which could be put to work to  
your advantage, I think...

HTH!

Andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org

Re: Unresolved ID and page-number-citation allocated width

Posted by Sebastien <ka...@gmail.com>.

Ok I'll take a look at this solution.
Thank you very much for your hints ;)

    Seb

On Tue, Dec 2, 2008 at 7:38 PM, Andreas Delmelle <
andreas.delmelle@telenet.be> wrote:

> On 02 Dec 2008, at 18:46, Andreas Delmelle wrote:
>
>> <xsl:if test="chapter-exists($someRefId)">
>>  <fo:page-number-citation ref-id="{$someRefId}" />
>> </xsl:if>
>
> Correction: this should obviously be
>
> <xsl:if test="key('chapter-exists',$someRefId)">
>
> etc.

Re: Unresolved ID and page-number-citation allocated width

Posted by Andreas Delmelle <an...@telenet.be>.

On 02 Dec 2008, at 18:46, Andreas Delmelle wrote:

> <xsl:if test="chapter-exists($someRefId)">
>  <fo:page-number-citation ref-id="{$someRefId}" />
> </xsl:if>

Correction: this should obviously be

<xsl:if test="key('chapter-exists',$someRefId)">

etc.

Re: Unresolved ID and page-number-citation allocated width

Posted by Andreas Delmelle <an...@telenet.be>.

On 01 Dec 2008, at 22:21, Sebastien wrote:

> What about a mechanism which squeeze all the unresolved page-number- 
> citation dummy space at the end of the processing ? I didn't really  
> look into FOP's code so it might very well be a stupid idea and/or  
> an impossible thing to do...
> I just think that since FOP is obviously able to replace a forward  
> reference when it encounters it, it might be able to replace/squeeze  
> an unresolved reference as well. But again, i'm surely missing some  
> internal details...

Not stupid, and perhaps not even impossible... Only far from trivial,
I'm afraid. The problem from FOP's point of view, as I indicated, is
that it can never guarantee in advance whether a link will be resolved
or not. FOP can only judge that accurately once the document has been
completely processed. At that point, we could still decide to skip
/rendering/ for the corresponding dummy areas, but in the case of
justified alignment, for example, you could then end up with some lines
that are not entirely filled. The same issue already arises in some
cases today, where the actual page number is narrower than the reserved
space and the breaks were computed based on the space for "MMM".

> <snip />
> Actually i named my FO ids with a convenient name which is  
> "[chapter]_[number]" (eg: actions_52) so i might be able to get a  
> list of chapters which are generated from my application. Then i  
> would split the ref-id with the '_' delimiter and compare the first  
> part with the list of generated chapters and decide whether to add  
> the page-number-citation or not... But considering all the links in  
> the document, doing a string split + an array search + a test for  
> each and every one of them seems like a real pain and i would hate  
> to do that.

It depends. If the chapter name and number exist somewhere as an  
attribute value in the source file, you could build a key map like:

<xsl:key name="chapter-exists" match="someElement"  
use="concat(@name,'_',@number)" />

and correspondingly:

<xsl:if test="chapter-exists($someRefId)">
   <fo:page-number-citation ref-id="{$someRefId}" />
</xsl:if>

No need for all the complicated stuff you describe above, but that's  
obviously only in case the chapter-reference can be easily extracted  
from the source...

BUT: if the document is sufficiently large, then this could have a  
significant impact on the duration of the XSL transform phase.
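
A fuller sketch of the key-based check, for illustration only. It uses the
key() form given in the correction elsewhere in this thread. The source
vocabulary is an assumption: an <item name="..." number="..."> element stands
in for "someElement", and <link target="..."> is a hypothetical element that
refers to one of those ids.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical source: <item name="actions" number="52"> is a link
       target and produces the id "actions_52". -->
  <xsl:key name="chapter-exists" match="item"
           use="concat(@name, '_', @number)" />

  <!-- Hypothetical source: <link target="actions_52">A link</link> -->
  <xsl:template match="link">
    <xsl:variable name="someRefId" select="string(@target)" />
    <fo:basic-link color="blue" internal-destination="{$someRefId}">
      <xsl:apply-templates />
      <!-- key() returns an empty node-set (false) when no item in the
           current source document produces this id. -->
      <xsl:if test="key('chapter-exists', $someRefId)">
        <fo:inline baseline-shift="super" font-size="8pt">
          <xsl:text>[</xsl:text>
          <fo:page-number-citation ref-id="{$someRefId}" />
          <xsl:text>]</xsl:text>
        </fo:inline>
      </xsl:if>
    </fo:basic-link>
  </xsl:template>

</xsl:stylesheet>
```

Note the curly braces in ref-id="{$someRefId}": without them the attribute
would contain the literal text "$someRefId" rather than the variable's value.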

If all else fails, you could still pre-process the FO document, by  
applying a second transformation before sending it to FOP... This  
would basically be an identity transform, with only a matching  
template for fo:page-number-citation, which checks in a similar manner  
as described above, whether the value for the ref-id attribute is used  
as an id by any other element node.
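
A minimal sketch of such a pre-processing stylesheet, assuming XSLT 1.0 and
that dropping the citation element is all that is wanted. The key name
"fo-ids" is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Index every element of the FO document that carries an id. -->
  <xsl:key name="fo-ids" match="*[@id]" use="@id" />

  <!-- Identity template: copy everything unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
    </xsl:copy>
  </xsl:template>

  <!-- Drop any citation whose ref-id matches no id in this document.
       The empty template overrides the identity template for these nodes. -->
  <xsl:template
      match="fo:page-number-citation[not(key('fo-ids', @ref-id))]" />

</xsl:stylesheet>
```

Applied to the FO from the original question, this would leave the literal
brackets [] in place and remove only the unresolved citation between them.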

HTH!

Cheers

Andreas

Re: Unresolved ID and page-number-citation allocated width

Posted by Sebastien <ka...@gmail.com>.

On Mon, Dec 1, 2008 at 9:39 PM, Andreas Delmelle <
andreas.delmelle@telenet.be> wrote:

> On 01 Dec 2008, at 17:11, Sebastien wrote:
>
>> My FOP understanding is quite limited so i can't understand how FOP
>> manages to shrink the space previously allocated when it succeeds in
>> resolving the ID and why it leaves this MMM-space when it fails to resolve
>> the ID.
>>
>> So I have two questions:
>>  1) Is there a way to squeeze that space in case of unresolved ID ? I
>> don't want to hack the code and replace the "MMM" string by a "|" string for
>> instance, it's ugly and it will surely lead to severe mistakes in the
>> layout. Is it difficult to implement such a behaviour ? I'm guessing that if
>> this hasn't been implemented yet it must be because of some tricky
>> implications and/or issues ?
>>
>
> One could try go into the direction of avoiding the addition of the area if
> the link is unresolved. The tricky part is that it's possible the dummy area
> is added, and replaced later, so you have no guarantee, unless you know in
> advance whether the ID will occur on a later page in the document...
>

What about a mechanism which squeezes all the unresolved page-number-citation
dummy space at the end of processing? I haven't really looked into FOP's
code, so it might very well be a stupid idea and/or an impossible thing to
do...
I just think that since FOP is obviously able to replace a forward reference
when it encounters it, it might be able to replace/squeeze an unresolved
reference as well. But again, I'm surely missing some internal details...

>
>
>>  2) I can bear not to have any page-number-citation at all in case of
>> unresolved ID but in this case, i will need to know when my ID can be
>> resolved and when it can't. It leads to some xsl-testing (i guess) and i
>> don't think an XSLT processor can be aware of such things... But i may be
>> wrong and perhaps this cool feature exists ?
>>
>
> Well, there is a possibility... Assuming that there is some element- or
> attribute-value in the XML source that determines the value of the ref-id/id
> pair, then theoretically, it should be possible to check, at the time the
> page-number-citation node is generated, whether 'myId' in your example, will
> appear anywhere in the result document. If not, then you can simply skip
> generation of the element.

I'm not sure I understood your solution. How would you check that? (OK, I
saw your second mail, I understand now ;) )

Actually, I gave my FO ids a convenient name of the form
"[chapter]_[number]" (e.g. actions_52), so I might be able to get a list of
the chapters generated by my application. Then I would split the
ref-id at the '_' delimiter, compare the first part with the list of
generated chapters and decide whether to add the page-number-citation or
not... But considering all the links in the document, doing a string split +
an array search + a test for each and every one of them seems like a real
pain and I would hate to do that.
Anyway, if I'm stuck with this problem, this will be the way to go, I
guess...
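
For what it's worth, the split-and-compare idea can be sketched in XSLT 1.0
without a real array search. Everything named here is an assumption made for
illustration: the stylesheet parameter "generated-chapters" (a '|'-delimited
string supplied by the calling application) and the source element
<link target="...">.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical parameter: the chapters generated in this run, each
       wrapped in '|' so that partial names cannot match by accident. -->
  <xsl:param name="generated-chapters" select="'|actions|events|'" />

  <!-- Hypothetical source: <link target="actions_52">A link</link> -->
  <xsl:template match="link">
    <!-- Chapter part of the id: everything before the first '_'. -->
    <xsl:variable name="chapter" select="substring-before(@target, '_')" />
    <fo:basic-link color="blue" internal-destination="{@target}">
      <xsl:apply-templates />
      <!-- Emit the citation only if the chapter is in the list. -->
      <xsl:if test="$chapter != '' and
                    contains($generated-chapters,
                             concat('|', $chapter, '|'))">
        <fo:inline baseline-shift="super" font-size="8pt">
          <xsl:text>[</xsl:text>
          <fo:page-number-citation ref-id="{@target}" />
          <xsl:text>]</xsl:text>
        </fo:inline>
      </xsl:if>
    </fo:basic-link>
  </xsl:template>

</xsl:stylesheet>
```

This only checks that the chapter was generated, not that the specific id
exists in it, and it assumes chapter names contain neither '_' nor '|'.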

Thanks for your answers anyways ;)

  Seb

Re: Unresolved ID and page-number-citation allocated width

Posted by Andreas Delmelle <an...@telenet.be>.

On 01 Dec 2008, at 17:11, Sebastien wrote:

Hi

> <snip />
> The behaviour that i was expecting was to have empty brackets []  
> when the ID couldn't be resolved. Instead, i got empty brackets but  
> with a large space between them [        ].
>
> After a quick look at the code, i saw that page-number-citation  
> allocates a space corresponding to the string "MMM" if the ID can't  
> be immediately resolved.

Right. A sort of pessimistic guess. If you're in luck, and the page- 
number is resolved before the actual breaks for the part in question  
are determined, then this will lead to nice results (= most cases,  
since either the link points forwards and the reserved space is more  
than enough to fit the actual text, or the link points backwards,  
which avoids the generation of the dummy element altogether; once the  
page-count grows too large, there might also be undesired side- 
effects, IIC).

> My FOP understanding is quite limited so i can't understand how FOP  
> manages to shrink the space previously allocated when it succeeds in  
> resolving the ID and why it leaves this MMM-space when it fails to  
> resolve the ID.
>
> So I have two questions:
>  1) Is there a way to squeeze that space in case of unresolved ID ?  
> I don't want to hack the code and replace the "MMM" string by a "|"  
> string for instance, it's ugly and it will surely lead to severe  
> mistakes in the layout. Is it difficult to implement such a  
> behaviour ? I'm guessing that if this hasn't been implemented yet it  
> must be because of some tricky implications and/or issues ?

One could try to go in the direction of not adding the area at all
if the link is unresolved. The tricky part is that the dummy area may
be added and then replaced later, so you have no guarantee,
unless you know in advance whether the ID will occur on a later page
in the document...

>
>  2) I can bear not to have any page-number-citation at all in case  
> of unresolved ID but in this case, i will need to know when my ID  
> can be resolved and when it can't. It leads to some xsl-testing (i  
> guess) and i don't think an XSLT processor can be aware of such  
> things... But i may be wrong and perhaps this cool feature exists ?

Well, there is a possibility... Assuming that there is some element-  
or attribute-value in the XML source that determines the value of the  
ref-id/id pair, then theoretically, it should be possible to check, at  
the time the page-number-citation node is generated, whether 'myId' in  
your example, will appear anywhere in the result document. If not,  
then you can simply skip generation of the element.

As I see it, FOP knows no more about that than you can find out with  
the help of your XSLT processor. In some of the most common cases, FOP  
never sees the document in its entirety, so it is not able to  
determine whether the id will or will not be resolved, eventually. It  
can tell you, at a given point, whether a FO with the given id already  
exists. If it does not, then there's no way to guarantee that it will  
not turn up further in the stream, I'm afraid... unless by having  
looked at the entire source-document (which, strictly speaking, the  
XSLT processor has, albeit before it was transformed into FO)

Cheers

Andreas
