Posted to fop-dev@xmlgraphics.apache.org by 전성기 <ha...@hanmail.net> on 2000/12/18 19:15:48 UTC

cjk font support.

Hi.
I have some questions about FOP and CJK fonts.
How can I make FOP work with CJK fonts?




==================================================
Our Internet, Daum
Free lifetime E-mail address, Hanmail.net
Global Korean-language search service, Daum FIREBALL
NO SPAM campaign! : http://www.daum.net/event/nospam
http://www.daum.net

Re: cjk font support.

Posted by Ross <ro...@apexinternetsoftware.com>.

Carlos Villegas wrote:

> Ross wrote:
> >
> > 전성기 wrote:
> >
> > > Hi.
> > > I have some questions about FOP and CJK fonts.
> > > How can I make FOP work with CJK fonts?
> >
> > Can you provide a list of the character encodings and fonts that you
> > use?
> >
> > PDF supports Unicode, which can serve as the character set for all of
> > the glyphs used.  So you can convert text from its native character
> > encoding to Unicode, then write it to the PDF file with an embedded
> > font, an embedded subset font, or a reference to an installed font.
> >
> > For languages with single-byte character encodings, like most Western
> > languages, you can convert the native characters to Unicode or use the
> > regular font with a font mapping.
> >
> > The Adobe PDF Specification has some lists of character encodings.
> > Other code pages are available from the ISO, Microsoft, or IBM web
> > sites, or, for the most part, from the Unicode web site.
>
> I'm afraid it's not as simple as that. Even if you get CJK fonts set up
> and are able to display some CJK characters, there is no support
> in FOP for CJK typesetting. For example, a Japanese or Chinese paragraph
> does not contain any spaces at all (not even wide spaces). There is no
> concept of words as in Western languages. So the current line-breaking
> procedure in FOP will put the whole paragraph on one line, since it
> will be taken as a single word that cannot be broken. I have plans
> for proper CJK typesetting, but that will take a while, to say the
> least! However, if all you need is to display some characters inline,
> then setting up the fonts is probably most of what is needed.
>
> Carlos Villegas

If only anything was simple.

I looked at the PDF specification. For Type 0 fonts, it says that various
predefined encodings (CMaps) can be used for CJK fonts. Here is the table copied
from the specification, section 7.10, page 215:

Table 7.20 Predefined CJK CMap names

Name              Description

Chinese (Simplified)
GB-EUC-H          Microsoft Code Page 936 (lfCharSet 0x86), GB 2312-80
                  character set, EUC-CN encoding
GB-EUC-V          Vertical version of GB-EUC-H
GBpc-EUC-H        Macintosh, GB 2312-80 character set, EUC-CN encoding,
                  Script Manager code 2
GBpc-EUC-V        Vertical version of GBpc-EUC-H
GBK-EUC-H         Microsoft Code Page 936 (lfCharSet 0x86), GBK character
                  set, GBK encoding
GBK-EUC-V         Vertical version of GBK-EUC-H
UniGB-UCS2-H      Unicode (UCS-2) encoding for the Adobe-GB1 character
                  collection
UniGB-UCS2-V      Vertical version of UniGB-UCS2-H

Chinese (Traditional)
B5pc-H            Macintosh, Big Five character set, Big Five encoding,
                  Script Manager code 2
B5pc-V            Vertical version of B5pc-H
ETen-B5-H         Microsoft Code Page 950 (lfCharSet 0x88), Big Five
                  character set with ETen extensions
ETen-B5-V         Vertical version of ETen-B5-H
ETenms-B5-H       Microsoft Code Page 950 (lfCharSet 0x88), Big Five
                  character set with ETen extensions; this uses
                  proportional forms for half-width Latin characters
ETenms-B5-V       Vertical version of ETenms-B5-H
CNS-EUC-H         CNS 11643-1992 character set, EUC-TW encoding
CNS-EUC-V         Vertical version of CNS-EUC-H
UniCNS-UCS2-H     Unicode (UCS-2) encoding for the Adobe-CNS1 character
                  collection
UniCNS-UCS2-V     Vertical version of UniCNS-UCS2-H

Japanese
83pv-RKSJ-H       Macintosh, JIS X 0208 character set with KanjiTalk6
                  extensions, Shift-JIS encoding, Script Manager code 1
90ms-RKSJ-H       Microsoft Code Page 932 (lfCharSet 0x80), JIS X 0208
                  character set with NEC and IBM extensions
90ms-RKSJ-V       Vertical version of 90ms-RKSJ-H
90msp-RKSJ-H      Same as 90ms-RKSJ-H, but replaces half-width Latin
                  characters with proportional forms
90msp-RKSJ-V      Vertical version of 90msp-RKSJ-H
90pv-RKSJ-H       Macintosh, JIS X 0208 character set with KanjiTalk7
                  extensions, Shift-JIS encoding, Script Manager code 1
Add-RKSJ-H        JIS X 0208 character set with Fujitsu FMR extensions,
                  Shift-JIS encoding
Add-RKSJ-V        Vertical version of Add-RKSJ-H
EUC-H             JIS X 0208 character set, EUC-JP encoding
EUC-V             Vertical version of EUC-H
Ext-RKSJ-H        JIS C 6226 (JIS78) character set with NEC extensions,
                  Shift-JIS encoding
Ext-RKSJ-V        Vertical version of Ext-RKSJ-H
H                 JIS X 0208 character set, ISO-2022-JP encoding
V                 Vertical version of H
UniJIS-UCS2-H     Unicode (UCS-2) encoding for the Adobe-Japan1 character
                  collection
UniJIS-UCS2-V     Vertical version of UniJIS-UCS2-H
UniJIS-UCS2-HW-H  Same as UniJIS-UCS2-H, but replaces proportional Latin
                  characters with half-width forms
UniJIS-UCS2-HW-V  Vertical version of UniJIS-UCS2-HW-H

Korean
KSC-EUC-H         KS X 1001:1992 character set, EUC-KR encoding
KSC-EUC-V         Vertical version of KSC-EUC-H
KSCms-UHC-H       Microsoft Code Page 949 (lfCharSet 0x81), KS X 1001:1992
                  character set plus 8,822 additional hangul, Unified
                  Hangul Code (UHC) encoding
KSCms-UHC-V       Vertical version of KSCms-UHC-H
KSCms-UHC-HW-H    Same as KSCms-UHC-H, but replaces proportional Latin
                  characters with half-width forms
KSCms-UHC-HW-V    Vertical version of KSCms-UHC-HW-H
KSCpc-EUC-H       Macintosh, KS X 1001:1992 character set with MacOS-KH
                  extensions, Script Manager Code 3
UniKS-UCS2-H      Unicode (UCS-2) encoding for the Adobe-Korea1 character
                  collection
UniKS-UCS2-V      Vertical version of UniKS-UCS2-H
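
One way to read the table is as a lookup from the encoding of the source
text to a predefined CMap name. The sketch below is Python, purely
illustrative, and not part of FOP. The pairing of Python codec names with
table rows is an assumption based on the descriptions, and the names
CMAP_FOR_CODEC, vertical and ucs2_hex_string are hypothetical. It also shows
how text could be written as a PDF hex string of big-endian 16-bit codes for
use with one of the Unicode (UCS-2) CMaps.

```python
# Illustrative pairing of Python codec names with predefined CMap names
# from Table 7.20. The pairing is an assumption based on the descriptions.
CMAP_FOR_CODEC = {
    "gb2312": "GB-EUC-H",
    "gbk": "GBK-EUC-H",
    "cp950": "ETen-B5-H",
    "cp932": "90ms-RKSJ-H",
    "euc_jp": "EUC-H",
    "euc_kr": "KSC-EUC-H",
    "cp949": "KSCms-UHC-H",
}


def vertical(cmap_name: str) -> str:
    """Return the vertical-writing variant of a horizontal CMap name."""
    if not cmap_name.endswith("H"):
        raise ValueError(f"not a horizontal CMap name: {cmap_name}")
    return cmap_name[:-1] + "V"


def ucs2_hex_string(text: str) -> str:
    """Encode text as a PDF hex string of big-endian 16-bit codes.

    Characters outside the Basic Multilingual Plane are rejected,
    because UCS-2 cannot represent them.
    """
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters above U+FFFF")
    return "<" + text.encode("utf-16-be").hex().upper() + ">"


if __name__ == "__main__":
    print(CMAP_FOR_CODEC["euc_kr"], vertical(CMAP_FOR_CODEC["euc_kr"]))
    print(ucs2_hex_string("한글"))
```

With a legacy-encoding CMap such as KSC-EUC-H the string bytes would stay in
the native encoding, whereas with a Uni*-UCS2-H CMap the text is converted to
Unicode first.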




Ross
--
contact Ross
company Apex Internet Software
email info@apexinternetsoftware.com
website http://www.apexinternetsoftware.com/



Re: cjk font support.

Posted by Carlos Villegas <ca...@uniscope.co.jp>.

Ross wrote:
> 
> 전성기 wrote:
> 
> > Hi.
> > I have some questions about FOP and CJK fonts.
> > How can I make FOP work with CJK fonts?
> 
> Can you provide a list of the character encodings and fonts that you
> use?
> 
> PDF supports Unicode, which can serve as the character set for all of
> the glyphs used.  So you can convert text from its native character
> encoding to Unicode, then write it to the PDF file with an embedded
> font, an embedded subset font, or a reference to an installed font.
> 
> For languages with single-byte character encodings, like most Western
> languages, you can convert the native characters to Unicode or use the
> regular font with a font mapping.
> 
> The Adobe PDF Specification has some lists of character encodings.
> Other code pages are available from the ISO, Microsoft, or IBM web
> sites, or, for the most part, from the Unicode web site.

I'm afraid it's not as simple as that. Even if you get CJK fonts set up
and are able to display some CJK characters, there is no support
in FOP for CJK typesetting. For example, a Japanese or Chinese paragraph
does not contain any spaces at all (not even wide spaces). There is no
concept of words as in Western languages. So the current line-breaking
procedure in FOP will put the whole paragraph on one line, since it
will be taken as a single word that cannot be broken. I have plans
for proper CJK typesetting, but that will take a while, to say the
least! However, if all you need is to display some characters inline,
then setting up the fonts is probably most of what is needed.
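
To make the line-breaking problem concrete, here is a minimal sketch (in
Python, not FOP's actual code) of a breaker that allows a break between any
two CJK characters while still keeping space-delimited Western words
together. The code-point ranges, the function names, and the use of a
character count as the line width are simplifying assumptions. Real CJK
typesetting also needs rules this sketch ignores, such as punctuation that
may not start or end a line.

```python
def is_cjk(ch: str) -> bool:
    """Rough test for CJK characters; the ranges are a simplification."""
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x30FF      # Hiragana and Katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
        or 0xAC00 <= cp <= 0xD7A3   # Hangul syllables
    )


def break_units(text: str) -> list[str]:
    """Split text into unbreakable units.

    Each CJK character stands alone, each whitespace character becomes a
    single space unit, and a run of other characters forms one word unit.
    """
    units, word = [], ""
    for ch in text:
        if is_cjk(ch) or ch.isspace():
            if word:
                units.append(word)
                word = ""
            units.append(ch if is_cjk(ch) else " ")
        else:
            word += ch
    if word:
        units.append(word)
    return units


def wrap(text: str, width: int) -> list[str]:
    """Greedy line filling, measuring width in characters (an assumption;
    a real formatter would use glyph advance widths)."""
    lines, line = [], ""
    for unit in break_units(text):
        if line and len(line) + len(unit) > width:
            lines.append(line.rstrip())
            line = ""
        if unit == " " and not line:
            continue  # do not start a line with a space
        line += unit
    if line.strip():
        lines.append(line.rstrip())
    return lines


if __name__ == "__main__":
    for out in wrap("日本語の文章", 3):
        print(out)
```

A breaker that only splits at spaces would return the Japanese sample as a
single over-long line, which is the behavior described above.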

Carlos Villegas

RE: cjk font support.

Posted by Khim <la...@patroids.com>.
Hi All,
 Does anyone know why the font does not show up in the PDF even though I
have successfully compiled a new FOP jar that contains the new font? Attached
is the PDF I created. I also noticed that when compiling the FO to PDF there
is a warning message saying: [1WARNING: unknown font  so defaulted
font to any]. Have I missed a step? Please help if you know the
answer. Thanks a million. Khim (^0^)


-----Original Message-----
From: Ross [mailto:ross@apexinternetsoftware.com]
Sent: Tuesday, December 19, 2000 3:19 AM
To: fop-dev@xml.apache.org
Subject: Re: cjk font support.


전성기 wrote:

> Hi.
> I have some questions about FOP and CJK fonts.
> How can I make FOP work with CJK fonts?

Can you provide a list of the character encodings and fonts that you
use?

PDF supports Unicode, which can serve as the character set for all of
the glyphs used.  So you can convert text from its native character
encoding to Unicode, then write it to the PDF file with an embedded
font, an embedded subset font, or a reference to an installed font.

For languages with single-byte character encodings, like most Western
languages, you can convert the native characters to Unicode or use the
regular font with a font mapping.

The Adobe PDF Specification has some lists of character encodings.
Other code pages are available from the ISO, Microsoft, or IBM web
sites, or, for the most part, from the Unicode web site.

If you have other questions you can ask me.

Ross



Re: cjk font support.

Posted by Ross <ro...@apexinternetsoftware.com>.
전성기 wrote:

> Hi.
> I have some questions about FOP and CJK fonts.
> How can I make FOP work with CJK fonts?

Can you provide a list of the character encodings and fonts that you
use?

PDF supports Unicode, which can serve as the character set for all of
the glyphs used.  So you can convert text from its native character
encoding to Unicode, then write it to the PDF file with an embedded
font, an embedded subset font, or a reference to an installed font.
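
As a rough illustration of the conversion step (Python standard codecs, not
FOP code), the sketch below decodes bytes in a native encoding to Unicode and
lists the resulting code points. The sample word, the choice of EUC-KR, and
the function names are assumptions made for the example.

```python
# Sketch: convert text from a native CJK encoding to Unicode.
# The sample text and the EUC-KR codec choice are assumptions.

def to_unicode(raw: bytes, native_encoding: str) -> str:
    """Decode bytes in a native encoding into a Unicode string."""
    return raw.decode(native_encoding)


def code_points(text: str) -> list[str]:
    """List the Unicode code points of a string as U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    native = "한글".encode("euc_kr")   # bytes as they might arrive in EUC-KR
    text = to_unicode(native, "euc_kr")
    print(text, code_points(text))
```

Decoding with the wrong codec usually does not fail loudly. It produces
unreadable characters instead, so the source encoding has to be known before
the conversion.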

For languages with single-byte character encodings, like most Western
languages, you can convert the native characters to Unicode or use the
regular font with a font mapping.

The Adobe PDF Specification has some lists of character encodings.
Other code pages are available from the ISO, Microsoft, or IBM web
sites, or, for the most part, from the Unicode web site.

If you have other questions you can ask me.

Ross