Posted to fop-dev@xmlgraphics.apache.org by Andreas Delmelle <an...@telenet.be> on 2009/02/06 23:29:13 UTC

Re: [Xmlgraphics-fop Wiki] Update of "PDF Accessibility" by JeremiasMaerki

On 06 Feb 2009, at 08:19, Apache Wiki wrote:

Hi Jost, Jeremias

>  '''(R3)''' Decide how the language should be defined. Other  
> implementations specify the @xml:lang on fo:root level. The same  
> attribute is set for descendants to override the default language.  
> There is also the common FO property {{{country}}} and  
> {{{language}}} to consider.
>
> +    * [JM] XSL defines {{{xml:lang}}} as a shorthand for  
> {{{country}}}/{{{language}}}/{{{script}}}. So both are equivalent  
> from a user's perspective. It should be verified that {{{xml:lang}}}  
> is properly mapped to the other three properties. Internally, the  
> code should work off the basic XSL properties, not the shorthand.

FWIW: I think I have made sure at one point that the mapping of  
xml:lang to country/language should work (see a.o.  
org.apache.fop.fo.properties.XMLLangShorthandParser). The mapping to  
'script' is not yet implemented.
The mapping is very basic, however, since there is no validation  
whatsoever at parse-time of whether the specified value conforms to  
the ISO specification.
I see the possibility of adding this, and it doesn't even seem too  
complicated.
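
A rough sketch of what such a check could look like, for illustration only. This is not the code in XMLLangShorthandParser; the class XmlLangSketch and its parse() method are hypothetical. It assumes that only two-letter ISO 639 / ISO 3166 codes need to be accepted (those are what java.util.Locale lists), and that a missing country maps to "none", the initial value of the property.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of FOP: splits an xml:lang value and validates the parts. */
final class XmlLangSketch {

    // Two-letter codes known to the JDK; three-letter codes are not covered here.
    private static final Set<String> ISO_LANGUAGES =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));
    private static final Set<String> ISO_COUNTRIES =
            new HashSet<>(Arrays.asList(Locale.getISOCountries()));

    private XmlLangSketch() { }

    /**
     * Maps a value such as "en-US" to { language, country }.
     * The country is "none" when the value has no country part.
     */
    static String[] parse(String xmlLang) {
        String value = xmlLang.trim();
        int sep = value.indexOf('-');
        String language = (sep < 0 ? value : value.substring(0, sep))
                .toLowerCase(Locale.ROOT);
        String country = sep < 0
                ? "none"
                : value.substring(sep + 1).toUpperCase(Locale.ROOT);

        if (!ISO_LANGUAGES.contains(language)) {
            throw new IllegalArgumentException("Unknown language code: " + language);
        }
        if (sep >= 0 && !ISO_COUNTRIES.contains(country)) {
            throw new IllegalArgumentException("Unknown country code: " + country);
        }
        return new String[] {language, country};
    }
}
```

With this, parse("en-US") yields "en" and "US", parse("fr") yields "fr" and "none", and an unknown code is rejected with an IllegalArgumentException. Whether FOP should throw, warn, or silently fall back in that case is a separate decision.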

Note that none of those properties apply to fo:root, so we'll probably  
have to assume that the natural language of the document is the one of  
the first fo:page-sequence. In theory, we always have access to the  
specified value on fo:root, but accessing that from the renderer  
context may get a bit messy (it exists only in the  
'currentPropertyList' that is available in FOTreeBuilder)


Regards

Andreas

Re: [Xmlgraphics-fop Wiki] Update of "PDF Accessibility" by JeremiasMaerki

Posted by Andreas Delmelle <an...@telenet.be>.
On 06 Feb 2009, at 23:29, Andreas Delmelle wrote:

>> +    * [JM] XSL defines {{{xml:lang}}} as a shorthand for  
>> {{{country}}}/{{{language}}}/{{{script}}}. So both are equivalent  
>> from a user's perspective. It should be verified that  
>> {{{xml:lang}}} is properly mapped to the other three properties.  
>> Internally, the code should work off the basic XSL properties, not  
>> the shorthand.
>
> FWIW: I think I have made sure at one point that the mapping of  
> xml:lang to country/language should work (see a.o.  
> org.apache.fop.fo.properties.XMLLangShorthandParser). The mapping to  
> 'script' is not yet implemented.

Just did a quick re-read of the definition, and it seems that 'script'  
is at most indirectly set by xml:lang (implied by country/language).
The xml:lang property itself only consists of <language-country>.  
Literally, the XSL-FO Rec mentions (7.31.24):
"XSL treats xml:lang as a shorthand and uses it to set the country and  
language properties.

Note:

In general, linguistic services (line-justification strategy,  
line-breaking and hyphenation) may depend on a combination of the  
"language", "script", and "country" properties."



Regards



Andreas

Re: [Xmlgraphics-fop Wiki] Update of "PDF Accessibility" by JeremiasMaerki

Posted by Andreas Delmelle <an...@telenet.be>.
On 07 Feb 2009, at 08:33, Jeremias Maerki wrote:

> <snip />
>> In theory, we always have access to the
>> specified value on fo:root,
>
> ...if the user bothered to specify it there...

Yep, that can be convenient, since inheritance makes sure that you  
don't need to specify it on every page-sequence separately.

>> but accessing that from the renderer
>> context may get a bit messy (it exists only in the
>> 'currentPropertyList' that is available in FOTreeBuilder)
>
> We can always expose the values from org.apache.fop.fo.pagination.Root
> (by adding a property variable and adjusting bind()). Not messy at all I
> would think. But that would be a proprietary extension.

Indeed. Never even occurred to me to do it that way. Very simple  
indeed. :-)

> If possible I think we should stick to the above rule: Use the first
> page-sequence for the document language. That should cover 99% of the
> cases and follows the spec as I understand it. And if anyone needs a
> different document-level language, we can simply expose
> language/country on fo:root, right?

Right. The only small hole in the rule is when the user specifies a  
different language on the root and the first page-sequence. Unless the  
values on the root are exposed, we would not easily be able to  
distinguish between them.
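
To make the rule and that hole concrete, here is a small sketch. Everything in it is hypothetical: DocumentLanguageSketch is not a FOP class, and the inputs are plain strings standing in for the language values resolved on fo:root and on each fo:page-sequence. It also assumes one possible precedence, which was not settled in this thread: an explicitly exposed root value wins, otherwise the first page-sequence decides.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch, not FOP code: picks a document-level language. */
final class DocumentLanguageSketch {

    private DocumentLanguageSketch() { }

    /**
     * @param exposedRootLanguage   language specified on fo:root, or null when
     *                              the root values are not exposed or not set
     * @param pageSequenceLanguages resolved language of each page-sequence,
     *                              in document order
     */
    static Optional<String> resolve(String exposedRootLanguage,
                                    List<String> pageSequenceLanguages) {
        // Assumption: a value exposed on the root overrides the default rule.
        if (isSpecified(exposedRootLanguage)) {
            return Optional.of(exposedRootLanguage);
        }
        // Default rule: the language of the first page-sequence.
        if (!pageSequenceLanguages.isEmpty()
                && isSpecified(pageSequenceLanguages.get(0))) {
            return Optional.of(pageSequenceLanguages.get(0));
        }
        return Optional.empty();
    }

    // "none" is treated the same as an unspecified value.
    private static boolean isSpecified(String value) {
        return value != null && !"none".equals(value);
    }
}
```

Without the first parameter, a document with "de" on the root and "en" on the first page-sequence always resolves to "en", which is exactly the case we cannot distinguish today.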


Regards

Andreas

Re: [Xmlgraphics-fop Wiki] Update of "PDF Accessibility" by JeremiasMaerki

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 06.02.2009 23:29:13 Andreas Delmelle wrote:
> On 06 Feb 2009, at 08:19, Apache Wiki wrote:
> 
> Hi Jost, Jeremias
> 
> >  '''(R3)''' Decide how the language should be defined. Other  
> > implementations specify the @xml:lang on fo:root level. The same  
> > attribute is set for descendants to override the default language.  
> > There is also the common FO property {{{country}}} and  
> > {{{language}}} to consider.
> >
> > +    * [JM] XSL defines {{{xml:lang}}} as a shorthand for  
> > {{{country}}}/{{{language}}}/{{{script}}}. So both are equivalent  
> > from a user's perspective. It should be verified that {{{xml:lang}}}  
> > is properly mapped to the other three properties. Internally, the  
> > code should work off the basic XSL properties, not the shorthand.
> 
> FWIW: I think I have made sure at one point that the mapping of  
> xml:lang to country/language should work (see a.o.  
> org.apache.fop.fo.properties.XMLLangShorthandParser). The mapping to  
> 'script' is not yet implemented.
> The mapping is very basic, however, since there is no validation  
> whatsoever at parse-time of whether the specified value conforms to  
> the ISO specification.
> I see the possibility of adding this, and it doesn't even seem too  
> complicated.

I remember you working on this some time ago. I just made a note to
recheck because accessibility relies on it. However, I don't think
supporting "script" is high on the priority list given the restricted
capabilities of FOP concerning eastern scripts.

> Note that none of those properties apply to fo:root, so we'll probably  
> have to assume that the natural language of the document is the one of  
> the first fo:page-sequence.

Agreed.

> In theory, we always have access to the  
> specified value on fo:root,

...if the user bothered to specify it there...

> but accessing that from the renderer  
> context may get a bit messy (it exists only in the  
> 'currentPropertyList' that is available in FOTreeBuilder)

We can always expose the values from org.apache.fop.fo.pagination.Root 
(by adding a property variable and adjusting bind()). Not messy at all I
would think. But that would be a proprietary extension.

If possible I think we should stick to the above rule: Use the first
page-sequence for the document language. That should cover 99% of the
cases and follows the spec as I understand it. And if anyone needs a
different document-level language, we can simply expose language/country
on fo:root, right?

Jeremias Maerki