Posted to user@forrest.apache.org by Børre Gaup <bo...@skolelinux.no> on 2006/07/10 10:35:26 UTC

Non-latin1 characters in pdf?

Hello!

We use forrest as our framework for publishing our documentation 
(http://divvun.no).

One of the languages we publish in is Northern Sami. The characters č (c 
caron), ŋ (eng) and đ (d slash) show up as #-marks in the PDF documents. 

What causes this behavior, and is it possible to change this so the proper 
characters show up?

regards,
-- 
Børre Gaup

Re: Non-latin1 characters in pdf?

Posted by Børre Gaup <bo...@skolelinux.no>.
On Mon, July 10, 2006 at 15.30, Thorsten Scherler wrote:
> On Mon, 10-07-2006 at 13:37 +0200, Børre Gaup wrote:
> > On Mon, July 10, 2006 at 12.54, Børre Gaup wrote:
> > > On Mon, July 10, 2006 at 11.59, Thorsten Scherler wrote:
> > > > On Mon, 10-07-2006 at 11:17 +0200, Børre Gaup wrote:
> > > > > On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> > > > > > On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > > > > > > Hello!
> > > > > > >
> > > > > > > We use forrest as our framework for publishing our
> > > > > > > documentation (http://divvun.no).
> > > > > > >
> > > > > > > One of the languages we publish in is Northern Sami. The
> > > > > > > characters "č (c caron), ŋ (eng) and đ (d slash) show up as
> > > > > > > #-marks in the pdf documents.
> > > > > > >
> > > > > > > What causes this behavior, and is it possible to change this so
> > > > > > > the proper characters show up?
> > > > > >
> > > > > > Not sure, but let us find out what the problem is.
> > > > > >
> > > > > > How does your xml looks like? The most important information lies
> > > > > > in <?xml version="1.0" encoding="UTF-8"?>. Do you have it like
> > > > > > this?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > What is the result of the yourUrl.fo? I mean e.g. you have your
> > > > > > index.xml in Sami, you request http://localhost:8888/index.html,
> > > > > > do you see the Sami characters?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > What happen when you request
> > > > > > http://localhost:8888/index.fo?
> > > > >
> > > > > It gives me a xml document which also shows the problematic
> > > > > characters as they should.
> > > >
> > > > Ok, then the problem is in
> > > > http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html
> > > >
> > > > You may want to search the cocoon archives whether this is a known
> > > > problem and whether there exist a solution/workaround.
> > >
> > > Ok, I'll have a look. If I find a solution I'll post it here.
> >
> > Ok, I found the document
> > http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html.
> >
> > I followed the instructions on that page and inserted the line
> > <user-config>/Users/boerre/forrest/config.xml</user-config> into the file
> > $FORREST_HOME/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap,
> > resulting in this stanza:
> >
> >   <map:components>
> >     <map:serializers default="fo2pdf">
> >       <map:serializer   name="fo2pdf"
> >                         src="org.apache.cocoon.serialization.FOPSerializer"
> >                         mime-type="application/pdf"/>
> >
> > <user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
> >     </map:serializers>
> >   </map:components>
> >
> > When calling http://localhost:8888/index.html, I get this result:
> >
> > Internal Server Error
> > Message: null
> > Description: No details available.
> > Sender: org.apache.cocoon.servlet.CocoonServlet
> > Source: Cocoon Servlet
> > Request URI
> > index.html
> > cause
> > No attribute named "name" is associated with the configuration
> > element "user-config" at
> > file:/Users/boerre/Documents/forrest/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap:25:26
> > request-uri
> > /index.html
> > Apache Cocoon 2.2.0-dev
> >
> > What am I doing wrong?
>
> The tag has to be inside the serializer element.
>
> Above, you closed the element with ...mime-type="application/pdf"/>, but you need
>
> <map:serializer name="fo2pdf"
>                 src="org.apache.cocoon.serialization.FOPSerializer"
>                 mime-type="application/pdf">
>   <user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
> </map:serializer>
>
> Thanks for keeping us up to date.
>
> BTW if this works (please let us know) I reckon we should add this with
> a locationmap entry.
>
Ah, a typo on my side ...

After making the correction, I now have embedded fonts working, and I am able 
to see all the Sami letters. Thank you for your patience :)

regards,
-- 
Børre Gaup
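
For readers reproducing this fix: the thread does not show the config.xml
that <user-config> points to. A minimal sketch of such a file, assuming the
FOP 0.20.x user-configuration format referenced by the Cocoon PDF serializer
page, might look like the following. The font name "ExampleFont" and both
file paths are hypothetical placeholders, and the metrics file is assumed to
have been generated beforehand from the TrueType font.

```xml
<!-- Hypothetical FOP 0.20.x user configuration; names and paths are placeholders. -->
<configuration>
  <fonts>
    <!-- metrics-file: font metrics XML generated from the TrueType file (assumed to exist)
         embed-file:   the TrueType font to embed; it must contain č, ŋ and đ -->
    <font metrics-file="/path/to/fop-fonts/example.xml"
          kerning="yes"
          embed-file="/path/to/fop-fonts/example.ttf">
      <!-- One triplet per family/style/weight combination the FO output requests -->
      <font-triplet name="ExampleFont" style="normal" weight="normal"/>
      <font-triplet name="ExampleFont" style="normal" weight="bold"/>
    </font>
  </fonts>
</configuration>
```

The font-family requested in the generated FO has to match a declared
triplet name. Otherwise FOP most likely falls back to one of its built-in
fonts, which do not cover these characters, and prints # in their place.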

Re: Non-latin1 characters in pdf?

Posted by Thorsten Scherler <th...@apache.org>.
On Mon, 10-07-2006 at 13:37 +0200, Børre Gaup wrote:
> On Mon, July 10, 2006 at 12.54, Børre Gaup wrote:
> > On Mon, July 10, 2006 at 11.59, Thorsten Scherler wrote:
> > > On Mon, 10-07-2006 at 11:17 +0200, Børre Gaup wrote:
> > > > On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> > > > > On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > > > > > Hello!
> > > > > >
> > > > > > We use forrest as our framework for publishing our documentation
> > > > > > (http://divvun.no).
> > > > > >
> > > > > > One of the languages we publish in is Northern Sami. The characters
> > > > > > "č (c caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf
> > > > > > documents.
> > > > > >
> > > > > > What causes this behavior, and is it possible to change this so the
> > > > > > proper characters show up?
> > > > >
> > > > > Not sure, but let us find out what the problem is.
> > > > >
> > > > > How does your xml looks like? The most important information lies in
> > > > > <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?
> > > >
> > > > Yes.
> > > >
> > > > > What is the result of the yourUrl.fo? I mean e.g. you have your
> > > > > index.xml in Sami, you request http://localhost:8888/index.html, do
> > > > > you see the Sami characters?
> > > >
> > > > Yes.
> > > >
> > > > > What happen when you request
> > > > > http://localhost:8888/index.fo?
> > > >
> > > > It gives me a xml document which also shows the problematic characters
> > > > as they should.
> > >
> > > Ok, then the problem is in
> > > http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html
> > >
> > > You may want to search the cocoon archives whether this is a known
> > > problem and whether there exist a solution/workaround.
> >
> > Ok, I'll have a look. If I find a solution I'll post it here.
> 
> Ok, I found the document 
> http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html.
> 
> I followed the instructions on that page and inserted the line
> <user-config>/Users/boerre/forrest/config.xml</user-config> into the file 
> $FORREST_HOME/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap, 
> resulting in this stanza:
> 
>   <map:components>
>     <map:serializers default="fo2pdf">
>       <map:serializer   name="fo2pdf" 
>                         src="org.apache.cocoon.serialization.FOPSerializer"
>                         mime-type="application/pdf"/>
>             
> <user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
>     </map:serializers>
>   </map:components>
> 
> When calling http://localhost:8888/index.html, I get this result:
> 
> Internal Server Error
> Message: null
> Description: No details available.
> Sender: org.apache.cocoon.servlet.CocoonServlet
> Source: Cocoon Servlet
> Request URI
> index.html
> cause
> No attribute named "name" is associated with the configuration 
> element "user-config" at 
> file:/Users/boerre/Documents/forrest/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap:25:26
> request-uri
> /index.html
> Apache Cocoon 2.2.0-dev
> 
> What am I doing wrong?

The tag has to be inside the serializer element. 

Above, you closed the element with ...mime-type="application/pdf"/>, but you need 

<map:serializer name="fo2pdf"
                src="org.apache.cocoon.serialization.FOPSerializer"
                mime-type="application/pdf">
  <user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
</map:serializer>

Thanks for keeping us up to date. 

BTW if this works (please let us know) I reckon we should add this with
a locationmap entry.

regards,
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)
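
Put together, the corrected stanza from the advice above would presumably
read as follows. This is a sketch assembled from the two snippets in this
thread, not a copy of the poster's final file, and it is a fragment that
relies on the map namespace declared elsewhere in output.xmap. The original
error most likely arises because every direct child of <map:serializers> is
read as a component declaration, which requires a name attribute, and a
<user-config> element placed there has none.

```xml
<map:components>
  <map:serializers default="fo2pdf">
    <map:serializer name="fo2pdf"
                    src="org.apache.cocoon.serialization.FOPSerializer"
                    mime-type="application/pdf">
      <!-- user-config is a child of the serializer, not a sibling of it -->
      <user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
    </map:serializer>
  </map:serializers>
</map:components>
```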


Re: Non-latin1 characters in pdf?

Posted by Børre Gaup <bo...@skolelinux.no>.
On Mon, July 10, 2006 at 12.54, Børre Gaup wrote:
> On Mon, July 10, 2006 at 11.59, Thorsten Scherler wrote:
> > On Mon, 10-07-2006 at 11:17 +0200, Børre Gaup wrote:
> > > On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> > > > On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > > > > Hello!
> > > > >
> > > > > We use forrest as our framework for publishing our documentation
> > > > > (http://divvun.no).
> > > > >
> > > > > One of the languages we publish in is Northern Sami. The characters
> > > > > "č (c caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf
> > > > > documents.
> > > > >
> > > > > What causes this behavior, and is it possible to change this so the
> > > > > proper characters show up?
> > > >
> > > > Not sure, but let us find out what the problem is.
> > > >
> > > > How does your xml looks like? The most important information lies in
> > > > <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?
> > >
> > > Yes.
> > >
> > > > What is the result of the yourUrl.fo? I mean e.g. you have your
> > > > index.xml in Sami, you request http://localhost:8888/index.html, do
> > > > you see the Sami characters?
> > >
> > > Yes.
> > >
> > > > What happen when you request
> > > > http://localhost:8888/index.fo?
> > >
> > > It gives me a xml document which also shows the problematic characters
> > > as they should.
> >
> > Ok, then the problem is in
> > http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html
> >
> > You may want to search the cocoon archives whether this is a known
> > problem and whether there exist a solution/workaround.
>
> Ok, I'll have a look. If I find a solution I'll post it here.

Ok, I found the document 
http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html.

I followed the instructions on that page and inserted the line
<user-config>/Users/boerre/forrest/config.xml</user-config> into the file 
$FORREST_HOME/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap, 
resulting in this stanza:

  <map:components>
    <map:serializers default="fo2pdf">
      <map:serializer   name="fo2pdf" 
                        src="org.apache.cocoon.serialization.FOPSerializer"
                        mime-type="application/pdf"/>
            
<user-config>/Users/boerre/forrest/lib/fop-fonts/config.xml</user-config>
    </map:serializers>
  </map:components>

When calling http://localhost:8888/index.html, I get this result:

Internal Server Error
Message: null
Description: No details available.
Sender: org.apache.cocoon.servlet.CocoonServlet
Source: Cocoon Servlet
Request URI
index.html
cause
No attribute named "name" is associated with the configuration 
element "user-config" at 
file:/Users/boerre/Documents/forrest/build/plugins/org.apache.forrest.plugin.output.pdf/output.xmap:25:26
request-uri
/index.html
Apache Cocoon 2.2.0-dev

What am I doing wrong?
-- 
Børre Gaup

Re: Non-latin1 characters in pdf?

Posted by Børre Gaup <bo...@skolelinux.no>.
On Mon, July 10, 2006 at 11.59, Thorsten Scherler wrote:
> On Mon, 10-07-2006 at 11:17 +0200, Børre Gaup wrote:
> > On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> > > On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > > > Hello!
> > > >
> > > > We use forrest as our framework for publishing our documentation
> > > > (http://divvun.no).
> > > >
> > > > One of the languages we publish in is Northern Sami. The characters
> > > > "č (c caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf
> > > > documents.
> > > >
> > > > What causes this behavior, and is it possible to change this so the
> > > > proper characters show up?
> > >
> > > Not sure, but let us find out what the problem is.
> > >
> > > How does your xml looks like? The most important information lies in
> > > <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?
> >
> > Yes.
> >
> > > What is the result of the yourUrl.fo? I mean e.g. you have your
> > > index.xml in Sami, you request http://localhost:8888/index.html, do you
> > > see the Sami characters?
> >
> > Yes.
> >
> > > What happen when you request
> > > http://localhost:8888/index.fo?
> >
> > It gives me a xml document which also shows the problematic characters as
> > they should.
>
> Ok, then the problem is in
> http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html
>
> You may want to search the cocoon archives whether this is a known
> problem and whether there exist a solution/workaround.
>

Ok, I'll have a look. If I find a solution I'll post it here.
-- 
Børre Gaup

Re: Non-latin1 characters in pdf?

Posted by Thorsten Scherler <th...@apache.org>.
On Mon, 10-07-2006 at 11:17 +0200, Børre Gaup wrote:
> On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> > On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > > Hello!
> > >
> > > We use forrest as our framework for publishing our documentation
> > > (http://divvun.no).
> > >
> > > One of the languages we publish in is Northern Sami. The characters "č (c
> > > caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf documents.
> > >
> > > What causes this behavior, and is it possible to change this so the
> > > proper characters show up?
> >
> > Not sure, but let us find out what the problem is.
> >
> > How does your xml looks like? The most important information lies in
> > <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?
> >
> Yes.
> > What is the result of the yourUrl.fo? I mean e.g. you have your
> > index.xml in Sami, you request http://localhost:8888/index.html, do you
> > see the Sami characters? 
> Yes.
> 
> > What happen when you request 
> > http://localhost:8888/index.fo?
> >
> 
> It gives me a xml document which also shows the problematic characters as they 
> should.
> 

OK, since the .fo output shows the characters correctly, the problem is in
the PDF serializer:
http://cocoon.apache.org/2.1/userdocs/pdf-serializer.html

You may want to search the Cocoon archives to see whether this is a known
problem and whether a solution or workaround exists.

regards,

> Both documents attached.
> > salu2
> 
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)


Re: Non-latin1 characters in pdf?

Posted by Børre Gaup <bo...@skolelinux.no>.
On Mon, July 10, 2006 at 10.49, Thorsten Scherler wrote:
> On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> > Hello!
> >
> > We use forrest as our framework for publishing our documentation
> > (http://divvun.no).
> >
> > One of the languages we publish in is Northern Sami. The characters "č (c
> > caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf documents.
> >
> > What causes this behavior, and is it possible to change this so the
> > proper characters show up?
>
> Not sure, but let us find out what the problem is.
>
> What does your XML look like? The most important information is in the
> declaration <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?
>
Yes.
> What is the result of yourUrl.fo? For example, if your
> index.xml is in Sami and you request http://localhost:8888/index.html, do you
> see the Sami characters? 
Yes.

> What happens when you request 
> http://localhost:8888/index.fo?
>

It gives me an XML document that also shows the problematic characters as they 
should appear.

Both documents attached.
> salu2

-- 
Børre Gaup

Re: Non-latin1 characters in pdf?

Posted by Thorsten Scherler <th...@apache.org>.
On Mon, 10-07-2006 at 10:35 +0200, Børre Gaup wrote:
> Hello!
> 
> We use forrest as our framework for publishing our documentation 
> (http://divvun.no).
> 
> One of the languages we publish in is Northern Sami. The characters "č (c 
> caron), ŋ (eng) and đ (d slash) show up as #-marks in the pdf documents. 
> 
> What causes this behavior, and is it possible to change this so the proper 
> characters show up?

Not sure, but let us find out what the problem is. 

What does your XML look like? The most important information is in the
declaration <?xml version="1.0" encoding="UTF-8"?>. Do you have it like this?

What is the result of yourUrl.fo? For example, if your
index.xml is in Sami and you request http://localhost:8888/index.html, do you
see the Sami characters? What happens when you request
http://localhost:8888/index.fo?

regards,
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)