Posted to users@pdfbox.apache.org by Justin Fetherolf <ju...@radixmeta.com> on 2015/12/21 20:39:29 UTC

User entered text render tips

Hi, I'm looking to see if there is a preferred workflow for rendering text
within a specified area of the PDF page.

Currently, we have users that are allowed to enter comments to our
database, which will then get rendered in PDF form upon request.

My basic flow is as follows:

Get text from DB.
*Sub spaces in for tabs.
*Remove any unicode control characters.
Word by word, use PDFont#getStringWidth() to determine if my growing string
is wider than the area I want my text to be contained in.
If yes, draw the text and continue on.
If no, continue adding words until width is reached.

Aside from any feedback on my general process, I have a specific question:

The asterisks above indicate that this was a problem area for me. Since our
users tend to write these comments in something like Word, and then copy
and paste into our web page, we tend to get some unexpected characters that
end up making PDFont#getStringWidth() throw an IOException. It also seems
that any "line breaking" characters and tabs do the same. Has there been
any discussion on having that method default a width to 0 when it
encounters characters/glyphs that are non-renderable or line breaking?

Thanks for your time and any help you can provide.


Justin D. Fetherolf
Software Engineer
Radix Metasystems

Re: User entered text render tips

Posted by John Hewson <jo...@jahewson.com>.

Hi,

> On 21 Dec 2015, at 11:39, Justin Fetherolf <ju...@radixmeta.com> wrote:
> 
> Hi, I'm looking to see if there is a preferred workflow for rendering text
> within a specified area of the PDF page.
> 
> Currently, we have users that are allowed to enter comments to our
> database, which will then get rendered in PDF form upon request.
> 
> My basic flow is as follows:
> 
> Get text from DB.
> *Sub spaces in for tabs.
> *Remove any unicode control characters.
> Word by word, use PDFont#getStringWidth() to determine if my growing string
> is wider than the area I want my text to be contained in.
> If yes, draw the text and continue on.
> If no, continue adding words until width is reached.

Yep, that's the way to do it.
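
For illustration, here is a minimal sketch of that word-by-word loop. The class name WordWrap and the measurer parameter are hypothetical and not part of PDFBox. The measurer stands in for a call such as font.getStringWidth(s) / 1000f * fontSize (getStringWidth reports widths in thousandths of a text-space unit), wrapped so that the IOException becomes unchecked. The sketch assumes the text has already been filtered and that words are separated by spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of PDFBox: greedy word wrapping against a width limit.
final class WordWrap {
    private WordWrap() {}

    /**
     * Splits text into lines no wider than maxWidth, as reported by measurer.
     * A single word that is wider than maxWidth is placed on a line of its own
     * and will overflow; breaking inside words is out of scope for this sketch.
     */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> measurer) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split(" +")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is empty or all spaces
            }
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && measurer.applyAsDouble(candidate) > maxWidth) {
                // Adding this word would overflow: emit the line and start a new one.
                lines.add(current.toString());
                current.setLength(0);
                current.append(word);
            } else {
                current.setLength(0);
                current.append(candidate);
            }
        }
        if (current.length() > 0) {
            lines.add(current.toString()); // flush the last, partially filled line
        }
        return lines;
    }
}
```

Each returned line can then be drawn with showText, moving down by the leading between lines. Measuring the whole candidate line rather than summing per-word widths keeps the check consistent with what will actually be drawn.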

> Aside from any feedback on my general process, I have a specific question:
> 
> The asterisks above indicate that this was a problem area for me. Since our
> users tend to write these comments in something like Word, and then copy
> and paste into our web page, we tend to get some unexpected characters that
> end up making PDFont#getStringWidth() throw an IOException. It also seems
> that any "line breaking" characters and tabs do the same. Has there been
> any discussion on having that method default a width to 0 when it
> encounters characters/glyphs that are non-renderable or line breaking?

Yes, this has been discussed; the verdict was not to provide bogus character metrics for glyphs we don't have. A missing glyph isn't zero width, it's the width of the fallback "missing glyph" (the empty rectangle I'm sure you've seen), which isn't what you want anyway.

Just as you have to filter strings before showText, you need to perform the same filtering before getStringWidth(). I'd recommend a whitelist rather than a blacklist approach: keep only the characters you approve of and filter out everything else.
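
A minimal sketch of such a whitelist filter follows. The class name TextSanitizer is hypothetical, and the choice of printable ASCII as the approved set is an assumption made for the example; in practice the approved set would be whatever your chosen font can actually encode. Tabs are replaced with spaces, as in the original workflow, and everything else outside the whitelist is dropped.

```java
// Hypothetical helper, not part of PDFBox: keeps only approved characters.
final class TextSanitizer {
    private TextSanitizer() {}

    /** Assumed whitelist: printable ASCII (space through tilde). */
    static boolean isAllowed(int codePoint) {
        return codePoint >= 0x20 && codePoint <= 0x7E;
    }

    /**
     * Returns input with tabs turned into spaces and every code point
     * outside the whitelist removed. Line breaks are removed too, so a
     * caller that wants paragraphs should split on them before calling this.
     */
    static String sanitize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            if (cp == '\t') {
                out.append(' ');          // substitute a space for a tab
            } else if (isAllowed(cp)) {
                out.appendCodePoint(cp);  // approved character, keep it
            }
            // anything else (control characters, smart quotes, ...) is dropped
        });
        return out.toString();
    }
}
```

Running the same sanitize step on a string before both getStringWidth() and showText means the measured text and the drawn text are identical.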

-- John

> Thanks for your time and any help you can provide.
> 
> 
> Justin D. Fetherolf
> Software Engineer
> Radix Metasystems
