Posted to fop-dev@xmlgraphics.apache.org by Holger Bast <Ho...@gmx.de> on 2017/04/19 12:48:42 UTC

Proposal for changing the tagging behavior for fo:blocks

Hi there,
we're using DocBook 5 to write our technical documents, which need to be published as accessible PDF files, and we would like to use FOP as the FO processor.
FOP already supports generating tagged PDF content, but I'm not happy with the results and would like to discuss this topic further.

DocBook 5 provides XSL stylesheets that convert DocBook to FO, which can then be processed by an FO processor. The stylesheets often generate deeply nested fo:block structures like the following example:

<fo:block>
  <fo:block>
    <fo:block ...>
      <fo:block keep-with-next.within-column="always">
        <fo:block ...>
          <fo:marker marker-class-name="section.head.marker">Level 1</fo:marker>
          <fo:block font-size="20.735999999999997pt">1.1. Level 1</fo:block>
        </fo:block>
      </fo:block>
    </fo:block>
  </fo:block>
</fo:block>

This markup also produces a deeply nested P (paragraph) structure in the PDF file, because every fo:block is automatically
tagged as a paragraph. I would like to get rid of this and end up with a flat document structure.

I propose that fo:blocks are not automatically treated as paragraphs, because they can hold many kinds of content, not only paragraph-like content. In my opinion the tagging mechanism should leave them alone by default, so that they do not appear in the structural information at all. The user should decide (opt-in) how each fo:block is tagged (as P, H1 or something else).
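
To illustrate the idea, here is a rough sketch of what opt-in tagging could look like. It assumes the XSL-FO role property is the way to name the structure type, and that blocks without a role would simply be skipped by the tagger. That skipping is the proposed behaviour, not what FOP does today, and the second block with its text is made up for the example.

```xml
<!-- Wrapper blocks carry no role, so under the proposal they would
     contribute nothing to the structure tree. -->
<fo:block>
  <fo:block keep-with-next.within-column="always">
    <!-- Only this block opts in, as a level 1 heading. -->
    <fo:block role="H1" font-size="20.735999999999997pt">1.1. Level 1</fo:block>
  </fo:block>
</fo:block>
<!-- Hypothetical body text, opted in as a paragraph. -->
<fo:block role="P">Some body text.</fo:block>
```

With markup like this the tagged PDF would contain just an H1 followed by a P, no matter how deeply the wrapper blocks are nested.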

What do you think about this approach?
Is this something that can be (easily) achieved in FOP?

If you need further information, I can provide sample documents and files.

Any information relating to this topic is appreciated.
thx & bye, Holger

Re: Proposal for changing the tagging behavior for fo:blocks

Posted by Holger Bast <Ho...@gmx.de>.
Hey Chris,
thanks for your reply.
Implementing this kind of behavior still means opt-out. fo:block is not defined to always represent a paragraph, and FOP shouldn't assume that it does.
With fo:block you structure your document, and only in some cases should it be interpreted as a paragraph. I still think the best way would be to leave the decision up to the user.
And I don't think that this problem is specific to the DocBook XSL. It will always appear when FO code is auto-generated.

Anyway, if I understand the current implementation correctly, everything nested inside an "artifact" structure is ignored.
It must be possible to tag inner structures as meaningful structure again, like:

<fo:block role="artifact">
  <fo:block role="artifact">
    <fo:block role="p">my text...</fo:block>
  </fo:block>
</fo:block>

Bye, Holger

> Sent: Monday, 24 April 2017 at 10:26
> From: Chris <bo...@hotmail.com>
> To: "fop-dev@xmlgraphics.apache.org" <fo...@xmlgraphics.apache.org>
> Subject: Re: Proposal for changing the tagging behavior for fo:blocks
>
> Hi Holger,
> 
> I don't agree that we should change FOP's default behaviour to suit
> DocBook users. Instead, I propose that we implement support for the role
> attribute on fo:block, so that role="artifact" can be specified in the
> DocBook XSL and the nested structure is not represented as nested P tags
> in the accessibility structure.
> 
> Thanks,
> 
> Chris

Re: Proposal for changing the tagging behavior for fo:blocks

Posted by Chris <bo...@hotmail.com>.
Hi Holger,

I don't agree that we should change FOP's default behaviour to suit
DocBook users. Instead, I propose that we implement support for the role
attribute on fo:block, so that role="artifact" can be specified in the
DocBook XSL and the nested structure is not represented as nested P tags
in the accessibility structure.

Thanks,

Chris
