Posted to fop-users@xmlgraphics.apache.org by Kamal Bhatt <kb...@tt.com.au> on 2008/05/28 03:05:50 UTC

Getting hyphenation to work

Hi,
I have downloaded the hyphenation files from:
http://offo.sourceforge.net/hyphenation/index.html

I have copied these files into c:/fop-0.93/hyph

I have updated my config file to point to this directory for hyphenation 
base: <hyphenation-base>C:/fop-0.93/hyph/</hyphenation-base>

I have updated my FO to include language="en". There is a
C:/fop-0.93/hyph/en.xml file (both attached)

However, my line does not hyphenate. Instead, it remains a mess (attached)

What am I doing wrong?

Thanks.

-- 
Kamal Bhatt


Re: Getting hyphenation to work

Posted by Kamal Bhatt <kb...@tt.com.au>.
Andreas Delmelle wrote:
> On May 29, 2008, at 04:28, Kamal Bhatt wrote:
>
>> J.Pietschmann wrote:
>>> paul womack wrote:
>>>> Kamal Bhatt wrote:
>>>>> Thanks. Works a treat. One more question, where can I get these
>>>>> hyphenation files for asian languages (such as japanese).
>>>>
>>>> I'm far from sure hyphenation is even a valid concept
>>>> in Japanese.
>>>
>>> Well, digging around:
>>>  http://marc.info/?l=fop-dev&m=102992807207069&w=2
>>> While this is mostly about line breaking, hyphenation may happen under
>>> certain, apparently very rare circumstances. Some more info is in the
>>> wikipedia:  http://en.wikipedia.org/wiki/Kinsoku_shori
>>> I wonder whether the rules there are reflected in the Unicode line
>>> breaking properties, which would mean that FOP 0.95ff would handle
>>> this properly.
> <snip />
>
> Only partially, if I'm correct. As a fall-back, the UAX#14 
> implementation in FOP allows breaks between /any/ pair of CJK ideographs.
> (see: 
> http://markmail.org/search/?q=CJK+linebreaking#query:CJK%20linebreaking+page:1+mid:uykln4qu3gz6r45d+state:results) 
>
OK, I didn't understand that post, but by the looks of it FOP does not 
support Asian languages. Seeing as most Asian languages (I would think) 
are monospaced, am I correct in saying that this would be best done at 
the XSLT level (i.e., build your own hyphenation rules into an XSLT and 
put them in blocks)?
>
> As mentioned in the post, this is simplistic, and may very well lead 
> to a layout that is undesirable/suboptimal from the POV of someone who 
> is versed well enough in either one of the related languages. (see the 
> Korean example attached to the OP in that thread)
>
> Someone once started more detailed work on this, and an initial patch 
> is available in Bugzilla #36977, but this hasn't been incorporated in 
> the codebase yet, since there were concerns about the approach 
> operating 'on top of' UAX#14 (unnecessarily increasing iterations over 
> the same character sequences).
>
>> Also, how does one convert a Tex file into a valid hyphenation XML file?
>
> I don't think there's a tool available to do so directly, although, 
> from the last time I researched this topic, I do seem to remember that 
> there are tools that convert TeX hyphenation files into a generic XML 
> format. Subsequently, with a minimum of effort, such an XML file could 
> then be transformed via XSLT to match the format used by FOP.
>
>
>
> HTH!
>

-- 
Kamal Bhatt


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: Getting hyphenation to work

Posted by Andreas Delmelle <an...@telenet.be>.
On May 29, 2008, at 04:28, Kamal Bhatt wrote:

> J.Pietschmann wrote:
>> paul womack wrote:
>>> Kamal Bhatt wrote:
>>>> Thanks. Works a treat. One more question, where can I get these
>>>> hyphenation files for asian languages (such as japanese).
>>>
>>> I'm far from sure hyphenation is even a valid concept
>>> in Japanese.
>>
>> Well, digging around:
>>  http://marc.info/?l=fop-dev&m=102992807207069&w=2
>> While this is mostly about line breaking, hyphenation may happen  
>> under
>> certain, apparently very rare circumstances. Some more info is in the
>> wikipedia:  http://en.wikipedia.org/wiki/Kinsoku_shori
>> I wonder whether the rules there are reflected in the Unicode line
>> breaking properties, which would mean that FOP 0.95ff would handle
>> this properly.
<snip />

Only partially, if I'm correct. As a fall-back, the UAX#14  
implementation in FOP allows breaks between /any/ pair of CJK  
ideographs.
(see: http://markmail.org/search/?q=CJK+linebreaking#query:CJK%20linebreaking+page:1+mid:uykln4qu3gz6r45d+state:results)

As mentioned in the post, this is simplistic, and may very well lead  
to a layout that is undesirable/suboptimal from the POV of someone  
who is versed well enough in either one of the related languages.  
(see the Korean example attached to the OP in that thread)

Someone once started more detailed work on this, and an initial patch  
is available in Bugzilla #36977, but this hasn't been incorporated in  
the codebase yet, since there were concerns about the approach  
operating 'on top of' UAX#14 (unnecessarily increasing iterations  
over the same character sequences).

> Also, how does one convert a Tex file into a valid hyphenation XML  
> file?

I don't think there's a tool available to do so directly, although,  
from the last time I researched this topic, I do seem to remember  
that there are tools that convert TeX hyphenation files into a  
generic XML format. Subsequently, with a minimum of effort, such an  
XML file could then be transformed via XSLT to match the format used  
by FOP.
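
For orientation, the target of such a transformation would look roughly
like the sketch below. The element names follow the hyphenation file
layout used by FOP and the OFFO files as best recalled here, so compare
with an existing file such as en.xml before relying on it. The character
classes, the exception and the patterns are a tiny illustrative excerpt,
not a usable pattern set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch of a FOP-style hyphenation file; verify against en.xml. -->
<hyphenation-info>
  <!-- Character inserted at a hyphenation point. -->
  <hyphen-char value="-"/>
  <!-- Minimum number of characters before and after a break. -->
  <hyphen-min before="2" after="3"/>
  <!-- Character classes: each group lists equivalent characters. -->
  <classes>
    aA
    bB
    cC
  </classes>
  <!-- Words whose break points are given explicitly. -->
  <exceptions>
    as-so-ciate
  </exceptions>
  <!-- TeX-style patterns, one per whitespace-separated token. -->
  <patterns>
    .ach4
    .ad4der
    .af1t
  </patterns>
</hyphenation-info>
```

The patterns and exceptions keep their TeX notation unchanged, so a
conversion mostly has to wrap the existing lists in the right elements.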



HTH!

Cheers

Andreas



Re: Getting hyphenation to work

Posted by Kamal Bhatt <kb...@tt.com.au>.
J.Pietschmann wrote:
> paul womack wrote:
>> Kamal Bhatt wrote:
>>> Thanks. Works a treat. One more question, where can I get these
>>> hyphenation files for asian languages (such as japanese).
>>
>> I'm far from sure hyphenation is even a valid concept
>> in Japanese.
>
> Well, digging around:
>  http://marc.info/?l=fop-dev&m=102992807207069&w=2
> While this is mostly about line breaking, hyphenation may happen under
> certain, apparently very rare circumstances. Some more info is in the
> wikipedia:  http://en.wikipedia.org/wiki/Kinsoku_shori
> I wonder whether the rules there are reflected in the Unicode line
> breaking properties, which would mean that FOP 0.95ff would handle
> this properly.
>
>
> A quick Google search also turns to a book "Understanding Japanese
> Information Processing" which mentions hyphenation in Japanese.
>
> J.Pietschmann
>
Thanks for the reply. Yes, by "hyphenation" I mean breaking the text on 
separate lines. Am I right in understanding that 0.95 has better support 
for Asian fonts, or will we have to chop the text ourselves?

Also, how does one convert a TeX file into a valid hyphenation XML file?

Cheers.



Re: Getting hyphenation to work

Posted by "J.Pietschmann" <pi...@databaar.ch>.
paul womack wrote:
> Kamal Bhatt wrote:
>> Thanks. Works a treat. One more question, where can I get these
>> hyphenation files for asian languages (such as japanese).
>
> I'm far from sure hyphenation is even a valid concept
> in Japanese.

Well, digging around:
  http://marc.info/?l=fop-dev&m=102992807207069&w=2
While this is mostly about line breaking, hyphenation may happen under
certain, apparently very rare circumstances. Some more info is in the
wikipedia:  http://en.wikipedia.org/wiki/Kinsoku_shori
I wonder whether the rules there are reflected in the Unicode line
breaking properties, which would mean that FOP 0.95ff would handle
this properly.


A quick Google search also turns up a book, "Understanding Japanese
Information Processing", which mentions hyphenation in Japanese.

J.Pietschmann



Re: Getting hyphenation to work

Posted by paul womack <pw...@papermule.co.uk>.
Kamal Bhatt wrote:
> Thanks. Works a treat. One more question, where can I get these 
> hyphenation files for asian languages (such as japanese).

I'm far from sure hyphenation is even a valid concept
in Japanese.

   BugBear



Re: Getting hyphenation to work

Posted by Kamal Bhatt <kb...@tt.com.au>.
Thanks. Works a treat. One more question: where can I get these 
hyphenation files for Asian languages (such as Japanese)?
>
> On May 28, 2008, at 03:05, Kamal Bhatt wrote:
>
> Hi
>
>> I have downloaded they hyphenation files from: 
>> http://offo.sourceforge.net/hyphenation/index.html
>>
>> I have copied these files into c:/fop-0.93/hyph
>>
>> I have updated my config file to point to this directory for 
>> hyphenation base: <hyphenation-base>C:/fop-0.93/hyph/</hyphenation-base>
>>
>> I have updated my FO to include language="en". There is an 
>> C:/fop-0.93/hyph/en.xml file (both attached)
>>
>> However, my line does not hyphenate. Instead, it remains a mess 
>> (attached)
>>
>> What am I doing wrong?
>
> The only thing you seem to be forgetting is:
>
>> <fo:block>ALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOY 
>>
>>           <fo:block>
>
> You need:
>
> <fo:block hyphenate="true">...
>
> Or alternatively, specify the property on the fo:root, if you want it 
> to apply to all fo:blocks in the document.
>
>
> HTH!
>
> Andreas
>


-- 
Kamal Bhatt




Re: Getting hyphenation to work

Posted by Andreas Delmelle <an...@telenet.be>.
On May 28, 2008, at 03:05, Kamal Bhatt wrote:

Hi

> I have downloaded they hyphenation files from:
> http://offo.sourceforge.net/hyphenation/index.html
>
> I have copied these files into c:/fop-0.93/hyph
>
> I have updated my config file to point to this directory for
> hyphenation base: <hyphenation-base>C:/fop-0.93/hyph/</hyphenation-base>
>
> I have updated my FO to include language="en". There is an
> C:/fop-0.93/hyph/en.xml file (both attached)
>
> However, my line does not hyphenate. Instead, it remains a mess  
> (attached)
>
> What am I doing wrong?

The only thing you seem to be forgetting is:

> <fo:block>ALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMAL 
> ADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALAD 
> ULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADUL 
> LBOYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLB 
> OYALLWORKANDNOPLAYMAKESKAMALADULLBOYALLWORKANDNOPLAYMAKESKAMALADULLBOY 
> ALLWORKANDNOPLAYMAKESKAMALADULLBOY
>           <fo:block>

You need:

<fo:block hyphenate="true">...

Or alternatively, specify the property on the fo:root, if you want it  
to apply to all fo:blocks in the document.
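
As a minimal sketch of the second option, the FO file below sets both
properties on fo:root so that every block inherits them, and switches
hyphenation off again for one block. The page master name, the page
dimensions and the sample text are placeholders and are not taken from
the attached file. It also assumes that the en.xml patterns are found
through the configured hyphenation-base.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- language and hyphenate are inherited, so setting them on fo:root
     applies them to all blocks unless a block overrides them. -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
         language="en" hyphenate="true">
  <fo:layout-master-set>
    <!-- Placeholder page geometry (A4 with 20mm margins). -->
    <fo:simple-page-master master-name="page"
                           page-width="210mm" page-height="297mm"
                           margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <!-- Inherits hyphenate="true" from fo:root. -->
      <fo:block>Text containing extraordinarily long words that may
        need hyphenation.</fo:block>
      <!-- A local override turns hyphenation off for this block only. -->
      <fo:block hyphenate="false">Text that should never be
        hyphenated.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Setting the properties once at the top keeps the individual blocks
clean. The per-block form shown above is only needed for exceptions.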


HTH!

Andreas

