Posted to fop-users@xmlgraphics.apache.org by Paul Tremblay <ph...@iglou.com> on 2006/03/01 00:54:18 UTC

Re: RTF and table/column widths

On Tue, Feb 28, 2006 at 07:07:16PM +0100, Jeremias Maerki wrote:
> 
> Nope, according to the RTF spec, the output should be in "US-ASCII"
> (7-bit) for portability. UTF-8 is definitely not supported by RTF but I
> think it's possible to use various 8-bit character sets and Unicode
> escapes if the proper commands are generated. The Microsoft RTF spec
> lists what is possible.
> 
> 

I've written an rtf2xml program 

http://rtf2xml.sourceforge.net/

and I can state quite definitively that this is correct. RTF must be
7-bit encoded, but can easily handle Unicode by escapes. For example

\u197

represents the Unicode character &#197;

However, if the Unicode value is greater than 32767, you have to
subtract 65536 from it (the \u parameter is a signed 16-bit number),
so that

\u-1

becomes 

&#65535;

(I'm a bit fuzzy on this last point. I am looking at my code, which
is:  
        if uni_char < 0:
            uni_char +=  65536

I know the code is correct.
)
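
As a rough sketch of the escape rule described above (not the rtf2xml
code itself): the helper names rtf_escape and rtf_param_to_codepoint are
made up for illustration, and the "?" after each escape assumes the
default \uc1 setting, where one fallback character follows every \u
escape for readers that do not understand Unicode. Control characters
such as line breaks are not handled here.

```python
def rtf_escape(text):
    """Encode text as 7-bit RTF using \\uN escapes (illustrative helper)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Backslash and braces are RTF syntax, so they need escaping.
            out.append("\\" + ch if ch in "\\{}" else ch)
            continue
        if cp <= 0xFFFF:
            units = [cp]
        else:
            # Above U+FFFF: split into a UTF-16 surrogate pair.
            offset = cp - 0x10000
            units = [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
        for unit in units:
            # \u takes a signed 16-bit number, so large values go negative.
            if unit > 32767:
                unit -= 65536
            out.append("\\u%d?" % unit)
    return "".join(out)


def rtf_param_to_codepoint(uni_char):
    """Reverse step, same logic as the two-line snippet above."""
    if uni_char < 0:
        uni_char += 65536
    return uni_char


if __name__ == "__main__":
    print(rtf_escape("\u00c5"))        # \u197?
    print(rtf_escape("\uffff"))        # \u-1?
    print(rtf_param_to_codepoint(-1))  # 65535
```

The decoding direction is the easy one, which is why the two-line check
is enough when reading RTF; the subtraction only matters when writing it.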

This brings up another question I've had about the RTF portion of fop.
How much work is being put into this area? Having worked a lot with
RTF, I've become convinced it is so full of contradictions and such a
mess (such as the Unicode example above--what could be less intuitive?)
that I wonder if in the future support for this format should be
dropped altogether? I'm sure RTF support is important because of RTF's
universality, but it would seem that developers' time might be better
spent on developing an Open Office format? 

I realize that my suggestion might make me come across as ungrateful
for all the work of the fop team. I don't mean it to be so, and last
night was very pleased to find out that fop.91beta supports orphan and
widow controls. The lack of this support had forced me to use a
variant of TeX for a thesis, since the graduate school required no
orphans or widows.

Paul

-- 

************************
*Paul Tremblay         *
*phthenry@iglou.com    *
************************

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: RTF and table/column widths

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
On 01.03.2006 00:54:18 Paul Tremblay wrote:
> On Tue, Feb 28, 2006 at 07:07:16PM +0100, Jeremias Maerki wrote:
> > 
> > Nope, according to the RTF spec, the output should be in "US-ASCII"
> > (7-bit) for portability. UTF-8 is definitely not supported by RTF but I
> > think it's possible to use various 8-bit character sets and Unicode
> > escapes if the proper commands are generated. The Microsoft RTF spec
> > lists what is possible.
> > 
> > 
> 
> I've written an rtf2xml program 
> 
> http://rtf2xml.sourceforge.net/

Very interesting. Could make a nice base for a tool to migrate old RTF
templates used by an RTF generator I used to work with to something new.
It seems to be under the GPL but since I'd use it as a stand-alone tool
I guess that's fine. :-)

> and I can state quite definitively that this is correct. RTF must be
> 7-bit encoded, but can easily handle Unicode by escapes. For example
> 
> \u197
> 
> represents the Unicode character &#197;
> 
> However, if the Unicode value is greater than 32767, you have to
> subtract 65536 from it (the \u parameter is a signed 16-bit number),
> so that 
> 
> \u-1
> 
> becomes 
> 
> &#65535;
> 
> (I'm a bit fuzzy on this last point. I am looking at my code, which
> is:  
>         if uni_char < 0:
>             uni_char +=  65536
> 
> I know the code is correct.
> )
> 
> This brings up another question I've had about the RTF portion of fop.
> How much work is being put into this area?

Not very much, although it gets regular attention.

> Having worked a lot with
> RTF, I've become convinced it is so full of contradictions and such a
> mess (such as the Unicode example above--what could be less intuitive?)
> that I wonder if in the future support for this format should be
> dropped altogether? I'm sure RTF support is important because of RTF's
> universality, but it would seem that developers' time might be better
> spent on developing an Open Office format? 

OpenDocument, yes. I wish I had the time/opportunity to work on that.
But right now PDF and PS have a higher priority for me. The problem with
OpenDocument is that Microsoft firmly opposes this format and has instead
pushed its own into a standardization process. And they will probably
get away with it. WordML is horrible (as is RTF) but the customers give
you the money so they tell you what you have to do. Sigh.

> I realize that my suggestion might make me come across as ungrateful
> for all the work of the fop team. I don't mean it to be so, and last
> night was very pleased to find out that fop.91beta supports orphan and
> widow controls. The lack of this support had forced me to use a
> variant of TeX for a thesis, since the graduate school required no
> orphans or widows.

I know what you mean and I totally agree with all you said. It's just
not that simple sometimes. Still, if someone started an FOEventHandler
for OpenDocument that would be real cool!

Jeremias Maerki

