Posted to users@pdfbox.apache.org by Aaron Mulder <am...@gmail.com> on 2016/07/24 12:35:33 UTC

Text Field Appearance Streams

I am filling out a form on an existing PDF document.  The base
document has /NeedAppearances true and the result is that the text
looks different on every viewer.  For instance, on some the text is
offset vertically or the text is cut off when using a font that works
fine on a different viewer.

I gather the solution to this is to set /NeedAppearances to false and
provide an appearance stream for every field.  I don't know much about
appearance streams.

In looking at the output of the PDFBox examples, it seems that the
appearance stream actually writes the field's value as text, e.g. for
the form containing the text "Sample field" the appearance stream is:

/Tx BMC
q
1 1 198 48 re
W
n
BT
/Helv 12 Tf
2 20.692 Td
(Sample field) Tj
ET
Q
EMC

Is there a way to not include the specific text in the appearance
stream such that if the user changes the value in the field then the
appearance stream will still work?

Really all I want from the appearance stream is to specify the font
and bounds and offset such that the text is in exactly the same place
on every viewer and the same font size fits into the available space
on every viewer.
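
For readers who, like me, are new to appearance streams, here is a rough sketch of how the operators in the stream quoted earlier in this message fit together. It is plain Java and deliberately does not use the PDFBox API; the class name, method names, and parameters are all made up for illustration, and the clip rectangle origin of (1, 1) is simply copied from the sample above.

```java
import java.math.BigDecimal;

/** Illustrative only: assembles the operators of a single-line text field appearance. */
final class AppearanceStreamSketch {

    /** Escapes the characters that are special inside a PDF literal string. */
    static String escape(String text) {
        // Backslash must be handled first so the escapes added below are not doubled.
        return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)");
    }

    /** Formats a number without a trailing ".0" (198.0 becomes "198"). */
    static String num(double v) {
        return new BigDecimal(String.valueOf(v)).stripTrailingZeros().toPlainString();
    }

    /**
     * Builds the content-stream text. fontName is assumed to match an entry
     * in the field's font resources (for example "Helv").
     */
    static String build(String fontName, double fontSize, double clipW, double clipH,
                        double x, double y, String value) {
        return String.join("\n",
            "/Tx BMC",                                       // begin marked content for a text field
            "q",                                             // save graphics state
            "1 1 " + num(clipW) + " " + num(clipH) + " re",  // rectangle path
            "W",                                             // use the path for clipping
            "n",                                             // end the path without painting it
            "BT",                                            // begin text object
            "/" + fontName + " " + num(fontSize) + " Tf",    // select font and size
            num(x) + " " + num(y) + " Td",                   // move to the text start position
            "(" + escape(value) + ") Tj",                    // show the field value
            "ET",                                            // end text object
            "Q",                                             // restore graphics state
            "EMC") + "\n";                                   // end marked content
    }
}
```

Calling build("Helv", 12, 198, 48, 2, 20.692, "Sample field") reproduces the stream above line for line, which makes it easier to see that the value is just one operand among the others.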

And one more question -- the document has let's say 200 fields, and I
only populate maybe 50 of them.  If I specify appearance streams for
*only* the ones I populate, will that work?  If that's the case,
should /NeedAppearances be set to true or false?

Thanks,
       Aaron

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Text Field Appearance Streams

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> On 25.07.2016 at 04:06, Aaron Mulder <am...@gmail.com> wrote:
> 
> End of the day update:
> 
> If I remove the /NeedAppearances flag and set an appearance stream on
> a form field, then it seems to render the same across all browsers,
> and Preview, and Acrobat Reader.  Yay!
> 
> However, then all the text positioning and line breaks are on me.

That shouldn't be needed.

> For
> instance, some of the fields use Q 1 (centered).  Without an
> appearance stream, the text is magically centered.  If I specify an
> appearance stream, I have to manually use "Td" commands to position
> the insertion point such that the text will be centered.

Which version of PDFBox are you using? You should use 2.x for appearance streams to work properly. Could you upload a sample form to a public location, together with a small test that reproduces the issue, so that we can take a look?

The unit tests with the different alignments are all OK. 

> 
> Likewise, the multiline fields don't work.  If I just specify "Tj"
> with a huge string of text, it comes out on one line and the bulk of
> it is clipped.

Same as above.

> So it seems that I'd have to manually calculate the
> correct font size and all the line breaks to fit the text into the
> available rectangle and lay it out by hand in the appearance stream.
> I note that in the MultilineFields.pdf sample PDF, the existing text
> is actually positioned with *each word* using a separate Td and Tj
> command, which seems like overkill.

Yes, it could be simplified, but that replicates how Acrobat generates the appearance stream, which is why we do the same.

> 
> And the last issue, at first glance it prints OK with the code in the
> PrintPDF tool, except for printing I need to specify a paper tray and
> duplex method, and I haven't had luck in the past specifying those
> through the Java printing API (granted, it's been a couple years since
> I last tried).  I've been printing non-form PDFs with lpr and that
> works great without forms, but prints all the text fields blank when I
> print a PDF with form fields:
> 
> lpr -P hp_LaserJet_2430 -o InputSlot=Tray3_500 -o Duplex=DuplexNoTumble test.pdf
> 
> Still, this is probably the approach I'll attempt to go forward with.
> 
> 
> 
> For the record, I explored a few other cases:
> 
> 1. /NeedAppearances true but appearance streams set.  Appearance
> stream used by Safari/Preview/Firefox but ignored by Acrobat
> Reader/Chrome.  PrintPDF and PDFToImage act as if the forms are empty.
> 
> 2. /NeedAppearances true and font size of 0 in /DA and no appearance
> stream set.  Acrobat Reader and Chrome work OK, except that the
> selected font size is too small for small fields (like 6pt when I can
> get it to be 9pt with the same Rect in case #3 with Safari/Preview).
> Safari/Preview don't change the text size and just crop any text that
> doesn't fit.  Firefox is horrible.  PrintPDF and PDFToImage act as if
> the forms are empty.
> 
> 3. /NeedAppearances true and a custom-calculated font size in /DA and
> no appearance stream set.  Looks great in Safari/Preview, except needs
> to be vertically offset.  That offset causes other viewers to look
> bad.  Also font sizes that work in Safari/Preview are cropped on other
> viewers.  PrintPDF and PDFToImage act as if the forms are empty.
> 
> I could combine #2 (if I bump out all the field Rects 2 points in each
> direction to avoid the minuscule text) for non-Preview users and #3
> for Preview users, except neither works with printing.
> 
> Thanks,
>       Aaron
> 
> On Sun, Jul 24, 2016 at 1:17 PM, Aaron Mulder <am...@gmail.com> wrote:
>> I hadn't (I guess, obviously) found the page describing the tools yet.
>> Thanks, both of you!
>> 
>> Aaron
>> 
>> On Sun, Jul 24, 2016 at 11:07 AM, Maruan Sahyoun <sa...@fileaffairs.de> wrote:
>>> Hi,
>>> 
>>>> On 24.07.2016 at 17:05, Aaron Mulder <am...@gmail.com> wrote:
>>>> 
>>>> Thank you again.
>>>> 
>>>> I looked at MultilineFields.pdf in a text editor but it seems to be
>>>> linearized with all the streams compressed -- I can't make much sense
>>>> of it.  Is there some convenient tool to emit a "human-readable"
>>>> version with all the streams decompressed, without altering any of the
>>>> object IDs or anything?
>>> 
>>> The easiest way is to take a look with the PDFDebugger.
>>> 
>>> BR
>>> Maruan
>>> 
>>>> 
>>>> I recall once before using some kind of PDF analysis tool that dumped
>>>> out a lot of information, but it produced sort of a diagnostic log.
>>>> What I'd really like is a tool that produces a completely legitimate
>>>> PDF -- pretty much the same as the original file, just with no
>>>> compression.
>>>> 
>>>> Thanks,
>>>>    Aaron
>>>> 
>>>> On Sun, Jul 24, 2016 at 10:34 AM, Maruan Sahyoun <sa...@fileaffairs.de> wrote:
>>>>> Hi Aaron,
>>>>> 
>>>>>> On 24.07.2016 at 16:24, Aaron Mulder <am...@gmail.com> wrote:
>>>>>> 
>>>>>> OK, thanks.  The font is just the standard Helvetica so it should not
>>>>>> need to be embedded;
>>>>> 
>>>>> If you need to support languages other than Western ones, Arial (or Arial Unicode, for much larger coverage) might be a better option, as these fonts support more characters.
>>>>> 
>>>>>> I just need to specify the appropriate point size
>>>>>> in order for the text to fit in the available space.
>>>>> 
>>>>> If you specify a font size, that size should be used. If you'd like the text to be fitted to the form field, specify a font size of 0 (zero) for auto-scaling.
>>>>> 
>>>>>> 
>>>>>> Some of the fields are multiline text fields ("text areas").  For
>>>>>> those fields, do I need to manually line-break the text that I'm going
>>>>>> to put in those fields in order to lay out the text in the appearance
>>>>>> stream appropriately?
>>>>> 
>>>>> That shouldn't be necessary, but include line breaks where you'd like new paragraphs to start.
>>>>> 
>>>>> You can take a look at the AlignmentTest.pdf and MultilineFields.pdf documents in pdfbox/src/test/resources/org/apache/pdfbox/pdmodel/interactive/form/. The lower half is prefilled by Adobe Acrobat to show what a filled-out form should look like. If you run the unit tests you'll also see how PDFBox fills these fields (the upper half).
>>>>> 
>>>>>> I guess I am assuming that I will need to do
>>>>>> that, though I suppose I'll find out shortly.  :)
>>>>> 
>>>>> BR
>>>>> Maruan
>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>>    Aaron
>>>>>> 
>>>>>> On Sun, Jul 24, 2016 at 10:16 AM, Maruan Sahyoun <sa...@fileaffairs.de> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>>> On 24.07.2016 at 14:35, Aaron Mulder <am...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> I am filling out a form on an existing PDF document.  The base
>>>>>>>> document has /NeedAppearances true and the result is that the text
>>>>>>>> looks different on every viewer.  For instance, on some the text is
>>>>>>>> offset vertically or the text is cut off when using a font that works
>>>>>>>> fine on a different viewer.
>>>>>>>> 
>>>>>>>> I gather the solution to this is to set /NeedAppearances to false and
>>>>>>>> provide an appearance stream for every field.  I don't know much about
>>>>>>>> appearance streams.
>>>>>>>> 
>>>>>>>> In looking at the output of the PDFBox examples, it seems that the
>>>>>>>> appearance stream actually writes the field's value as text, e.g. for
>>>>>>>> the form containing the text "Sample field" the appearance stream is:
>>>>>>>> 
>>>>>>>> /Tx BMC
>>>>>>>> q
>>>>>>>> 1 1 198 48 re
>>>>>>>> W
>>>>>>>> n
>>>>>>>> BT
>>>>>>>> /Helv 12 Tf
>>>>>>>> 2 20.692 Td
>>>>>>>> (Sample field) Tj
>>>>>>>> ET
>>>>>>>> Q
>>>>>>>> EMC
>>>>>>>> 
>>>>>>>> Is there a way to not include the specific text in the appearance
>>>>>>>> stream such that if the user changes the value in the field then the
>>>>>>>> appearance stream will still work?
>>>>>>> 
>>>>>>> If the field has a value then this value will be part of the appearance stream. But that doesn't mean that the value cannot change. When the user enters new text, the viewer will recalculate the appearance stream and the new text will replace the old one.
>>>>>>> 
>>>>>>>> 
>>>>>>>> Really all I want from the appearance stream is to specify the font
>>>>>>>> and bounds and offset such that the text is in exactly the same place
>>>>>>>> on every viewer and the same font size fits into the available space
>>>>>>>> on every viewer.
>>>>>>> 
>>>>>>> The font (/Helv in your example) and font size (12 in your example) are part of the field's properties and are specified in the default appearance string. They will make it into the appearance stream. Unfortunately, settings such as bounds and offset are not part of the PDF specification but are implementation-specific, so you might get different results with different viewers.
>>>>>>> 
>>>>>>> One thing you should also ensure is that the fonts used for form filling are embedded in the PDF so that all viewers can use the same font.
>>>>>>> 
>>>>>>> 
>>>>>>>> 
>>>>>>>> And one more question -- the document has let's say 200 fields, and I
>>>>>>>> only populate maybe 50 of them.  If I specify appearance streams for
>>>>>>>> *only* the ones I populate, will that work?  If that's the case,
>>>>>>>> should /NeedAppearances be set to true or false?
>>>>>>> 
>>>>>>> If the appearance stream for the filled out fields is defined then you should set /NeedAppearances to false (or remove the key completely).
>>>>>>> 
>>>>>>> BR
>>>>>>> Maruan
>>>>>>> 
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>>    Aaron
>>>>>>>> 


Re: Text Field Appearance Streams

Posted by Aaron Mulder <am...@gmail.com>.
End of the day update:

If I remove the /NeedAppearances flag and set an appearance stream on
a form field, then it seems to render the same across all browsers,
and Preview, and Acrobat Reader.  Yay!

However, then all the text positioning and line breaks are on me.  For
instance, some of the fields use Q 1 (centered).  Without an
appearance stream, the text is magically centered.  If I specify an
appearance stream, I have to manually use "Td" commands to position
the insertion point such that the text will be centered.
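
To make the "by hand" part concrete, this is roughly the arithmetic I mean for the horizontal Td operand. It is only a sketch: the class, method names, and the padding parameter are my own inventions rather than anything from PDFBox, and the text width is assumed to come from the font's glyph metrics (in 1/1000 text space units, scaled by the font size). The /Q values themselves (0 left, 1 centered, 2 right) are the ones the field dictionary uses.

```java
/** Illustrative only: horizontal start position for a single line of field text. */
final class QuaddingSketch {

    /** Converts a summed glyph width in 1/1000 units into points at the given font size. */
    static double textWidth(double glyphUnits, double fontSize) {
        return glyphUnits * fontSize / 1000.0;
    }

    /**
     * Returns the x operand for Td.
     * q: 0 = left-justified, 1 = centered, 2 = right-justified.
     * padding is a hypothetical inset from the field border.
     */
    static double startX(int q, double fieldWidth, double textWidth, double padding) {
        switch (q) {
            case 1:
                return (fieldWidth - textWidth) / 2.0;      // equal space on both sides
            case 2:
                return fieldWidth - padding - textWidth;    // text ends at the right inset
            default:
                return padding;                             // text starts at the left inset
        }
    }
}
```

For a 200 point wide field and text that measures 66 points, startX(1, 200, 66, 2) gives 67, so the centered variant would use "67 <y> Td" where the left-aligned one uses "2 <y> Td".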

Likewise, the multiline fields don't work.  If I just specify "Tj"
with a huge string of text, it comes out on one line and the bulk of
it is clipped.  So it seems that I'd have to manually calculate the
correct font size and all the line breaks to fit the text into the
available rectangle and lay it out by hand in the appearance stream.
I note that in the MultilineFields.pdf sample PDF, the existing text
is actually positioned with *each word* using a separate Td and Tj
command, which seems like overkill.
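
The manual layout I have in mind is essentially a greedy word wrap, sketched below. This is an assumption about one workable approach, not a description of what PDFBox or Acrobat actually do; the class and method names are hypothetical, and the width function is passed in so that real font metrics can be plugged in later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Illustrative only: greedy line breaking for a multiline text field. */
final class WrapSketch {

    /**
     * Splits text into lines no wider than maxWidth, as measured by widthOf.
     * Explicit "\n" characters always start a new line (a new paragraph).
     * A single word wider than maxWidth is left on its own line and would be clipped.
     */
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> widthOf) {
        List<String> lines = new ArrayList<>();
        for (String paragraph : text.split("\n", -1)) {
            StringBuilder line = new StringBuilder();
            for (String word : paragraph.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;                               // empty paragraph
                }
                String candidate = line.length() == 0 ? word : line + " " + word;
                if (line.length() > 0 && widthOf.applyAsDouble(candidate) > maxWidth) {
                    lines.add(line.toString());             // current line is full
                    line = new StringBuilder(word);         // word starts the next line
                } else {
                    line = new StringBuilder(candidate);    // word still fits
                }
            }
            lines.add(line.toString());                     // flush the paragraph's last line
        }
        return lines;
    }
}
```

Each resulting line would then get its own Tj, with a Td (or T* after setting the leading) to move down between lines, which is less than one Td per word but still more than a single Tj.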

And the last issue, at first glance it prints OK with the code in the
PrintPDF tool, except for printing I need to specify a paper tray and
duplex method, and I haven't had luck in the past specifying those
through the Java printing API (granted, it's been a couple years since
I last tried).  I've been printing non-form PDFs with lpr and that
works great without forms, but prints all the text fields blank when I
print a PDF with form fields:

lpr -P hp_LaserJet_2430 -o InputSlot=Tray3_500 -o Duplex=DuplexNoTumble test.pdf

Still, this is probably the approach I'll attempt to go forward with.



For the record, I explored a few other cases:

1. /NeedAppearances true but appearance streams set.  Appearance
stream used by Safari/Preview/Firefox but ignored by Acrobat
Reader/Chrome.  PrintPDF and PDFToImage act as if the forms are empty.

2. /NeedAppearances true and font size of 0 in /DA and no appearance
stream set.  Acrobat Reader and Chrome work OK, except that the
selected font size is too small for small fields (like 6pt when I can
get it to be 9pt with the same Rect in case #3 with Safari/Preview).
Safari/Preview don't change the text size and just crop any text that
doesn't fit.  Firefox is horrible.  PrintPDF and PDFToImage act as if
the forms are empty.

3. /NeedAppearances true and a custom-calculated font size in /DA and
no appearance stream set.  Looks great in Safari/Preview, except needs
to be vertically offset.  That offset causes other viewers to look
bad.  Also font sizes that work in Safari/Preview are cropped on other
viewers.  PrintPDF and PDFToImage act as if the forms are empty.

I could combine #2 (if I bump out all the field Rects 2 points in each
direction to avoid the minuscule text) for non-Preview users and #3
for Preview users, except neither works with printing.
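
For case #3, the "custom-calculated font size" is along the lines of the sketch below. The heuristic is entirely my own assumption (line height taken to be roughly the font size, width taken from glyph metrics in 1/1000 units), and the names are hypothetical; viewers that auto-size a /DA size of 0 clearly use rules of their own, which is why the results differ.

```java
/** Illustrative only: picks a font size so one line of text fits a field. */
final class AutoSizeSketch {

    /**
     * glyphUnits:  summed glyph widths of the text in 1/1000 text space units.
     * availWidth:  usable width of the field in points.
     * availHeight: usable height of the field in points.
     * minSize and maxSize clamp the result to a sensible range.
     */
    static double fit(double glyphUnits, double availWidth, double availHeight,
                      double minSize, double maxSize) {
        double byWidth = availWidth * 1000.0 / glyphUnits;  // largest size that fits horizontally
        double byHeight = availHeight;                      // assumes line height is about the font size
        double size = Math.min(byWidth, byHeight);
        return Math.max(minSize, Math.min(maxSize, size));
    }
}
```

With a minimum of 4 and a maximum of 12, short text in a roomy field stays at 12, while text measuring 10000 units in a 60 point wide field comes out at 6.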

Thanks,
       Aaron



Re: Text Field Appearance Streams

Posted by Aaron Mulder <am...@gmail.com>.
I hadn't (I guess, obviously) found the page describing the tools yet.
Thanks, both of you!

Aaron



Re: Text Field Appearance Streams

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> On 24.07.2016 at 17:05, Aaron Mulder <am...@gmail.com> wrote:
> 
> Thank you again.
> 
> I looked at MultilineFields.pdf in a text editor but it seems to be
> linearized with all the streams compressed -- I can't make much sense
> of it.  Is there some convenient tool to emit a "human-readable"
> version with all the streams decompressed, without altering any of the
> object IDs or anything?

The easiest way is to take a look with the PDFDebugger.

BR
Maruan



Re: Text Field Appearance Streams

Posted by Tilman Hausherr <TH...@t-online.de>.
Am 24.07.2016 um 17:05 schrieb Aaron Mulder:
> Thank you again.
>
> I looked at MultilineFields.pdf in a text editor but it seems to be
> linearized with all the streams compressed -- I can't make much sense
> of it.  Is there some convenient tool to emit a "human-readable"
> version with all the streams decompressed, without altering any of the
> object IDs or anything?
>
> I recall once before using some kind of PDF analysis tool that dumped
> out a lot of information, but it produced sort of a diagnostic log.
> What I'd really like is a tool that produces a completely legitimate
> PDF -- pretty much the same as the original file, just with no
> compression.

To get an uncompressed PDF, use WriteDecodedDoc. To look at a PDF 
without producing an uncompressed PDF, use PDFDebugger.
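
For readers wondering what "decoding" a stream involves: a minimal sketch, assuming the compressed streams use the common FlateDecode filter (zlib/deflate). It uses only the JDK, not PDFBox, and the class name FlateSketch is hypothetical. It round-trips a small content stream to show that the bytes come back unchanged once inflated.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration only; this is not how WriteDecodedDoc is implemented. */
final class FlateSketch {

    /** Compresses raw bytes with zlib/deflate, the algorithm behind /FlateDecode. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates zlib/deflate data back into the original bytes. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or unsupported stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid deflate data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "BT /Helv 12 Tf (Sample field) Tj ET".getBytes(StandardCharsets.US_ASCII);
        byte[] restored = inflate(deflate(raw));
        // Prints the original operators again after the round trip.
        System.out.println(new String(restored, StandardCharsets.US_ASCII));
    }
}
```

A real PDF stream may chain several filters or use predictors, which this sketch does not handle; the two tools named above take care of that.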

Tilman




Re: Text Field Appearance Streams

Posted by Aaron Mulder <am...@gmail.com>.
Thank you again.

I looked at MultilineFields.pdf in a text editor but it seems to be
linearized with all the streams compressed -- I can't make much sense
of it.  Is there some convenient tool to emit a "human-readable"
version with all the streams decompressed, without altering any of the
object IDs or anything?

I recall once before using some kind of PDF analysis tool that dumped
out a lot of information, but it produced sort of a diagnostic log.
What I'd really like is a tool that produces a completely legitimate
PDF -- pretty much the same as the original file, just with no
compression.

Thanks,
     Aaron



Re: Text Field Appearance Streams

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi Aaron,

> Am 24.07.2016 um 16:24 schrieb Aaron Mulder <am...@gmail.com>:
> 
> OK, thanks.  The font is just the standard Helvetica so it should not
> need to be embedded;

If you need to support languages other than Western ones, Arial (or Arial Unicode, for much larger coverage) might be a better option, as these fonts support more characters.

> I just need to specify the appropriate point size
> in order for the text to fit in the available space.

If you specify a font size, that size should be used. If you'd like the text to be fitted to the form field, specify a font size of 0 (zero) for auto scaling.
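
As an illustration of the two cases, here is a minimal sketch of how the default appearance string could be put together, assuming the font resource is named Helv as in the stream quoted in this thread and that the text should be black ("0 g"). The class DefaultAppearanceSketch is hypothetical and is not a PDFBox API; the resulting string is what would be stored as the field's default appearance.

```java
import java.util.Locale;

/** Hypothetical helper (not part of PDFBox) that formats a default appearance string. */
final class DefaultAppearanceSketch {

    /**
     * @param fontResourceName font name as registered in the form's resources, e.g. "Helv"
     * @param fontSize         size in points; 0 asks the viewer or library to auto-size the text
     */
    static String build(String fontResourceName, float fontSize) {
        if (fontSize < 0) {
            throw new IllegalArgumentException("font size must not be negative");
        }
        // Whole sizes are written without decimals; Locale.ROOT keeps '.' as the decimal separator.
        String size = (fontSize == Math.rint(fontSize))
                ? String.valueOf((long) fontSize)
                : String.format(Locale.ROOT, "%.2f", fontSize);
        // "Tf" selects font and size, "0 g" sets a black fill colour (an assumption of this sketch).
        return "/" + fontResourceName + " " + size + " Tf 0 g";
    }

    public static void main(String[] args) {
        System.out.println(build("Helv", 12f)); // /Helv 12 Tf 0 g
        System.out.println(build("Helv", 0f));  // /Helv 0 Tf 0 g  (auto scaling)
    }
}
```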

> 
> Some of the fields are multiline text fields ("text areas").  For
> those fields, do I need to manually line-break the text that I'm going
> to put in those fields in order to lay out the text in the appearance
> stream appropriately?

That shouldn't be necessary, but include line breaks where you'd like to have new paragraphs.

You can take a look at the AlignmentTest.pdf and MultilineFields.pdf documents in pdfbox/src/test/resources/org/apache/pdfbox/pdmodel/interactive/form/. The lower half is prefilled by Adobe Acrobat to show what a filled-out form should look like. If you run the unit tests you'll also see how PDFBox fills these fields (the upper half).

> I guess I am assuming that I will need to do
> that, though I suppose I'll find out shortly.  :)

BR
Maruan



Re: Text Field Appearance Streams

Posted by Aaron Mulder <am...@gmail.com>.
OK, thanks.  The font is just the standard Helvetica so it should not
need to be embedded; I just need to specify the appropriate point size
in order for the text to fit in the available space.

Some of the fields are multiline text fields ("text areas").  For
those fields, do I need to manually line-break the text that I'm going
to put in those fields in order to lay out the text in the appearance
stream appropriately?  I guess I am assuming that I will need to do
that, though I suppose I'll find out shortly.  :)

Thanks,
      Aaron



Re: Text Field Appearance Streams

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

> Am 24.07.2016 um 14:35 schrieb Aaron Mulder <am...@gmail.com>:
> 
> I am filling out a form on an existing PDF document.  The base
> document has /NeedAppearances true and the result is that the text
> looks different on every viewer.  For instance, on some the text is
> offset vertically or the text is cut off when using a font that works
> fine on a different viewer.
> 
> I gather the solution to this is to set /NeedAppearances to false and
> provide an appearance stream for every field.  I don't know much about
> appearance streams.
> 
> In looking at the output of the PDFBox examples, it seems that the
> appearance stream actually writes the field's value as text, e.g. for
> the form containing the text "Sample field" the appearance stream is:
> 
> /Tx BMC
> q
> 1 1 198 48 re
> W
> n
> BT
> /Helv 12 Tf
> 2 20.692 Td
> (Sample field) Tj
> ET
> Q
> EMC
> 
> Is there a way to not include the specific text in the appearance
> stream such that if the user changes the value in the field then the
> appearance stream will still work?

If the field has a value then this value will be part of the appearance stream. But that doesn't mean that the value cannot change. When the user enters new text, the viewer will recalculate the appearance stream and the new text will replace the old one.
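
To make the first point concrete, here is a minimal sketch of how such a stream embeds the value. It reuses the clip rectangle, font and text offset from the stream quoted above as fixed assumptions; a real generator derives them from the widget rectangle and the default appearance. The class AppearanceStreamSketch is hypothetical and is not PDFBox code.

```java
/** Hypothetical illustration of how a field value ends up inside the appearance stream. */
final class AppearanceStreamSketch {

    /** Escapes the characters that are special inside a PDF literal string. */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\\' || c == '(' || c == ')') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds the content stream for one value; geometry and font are copied from the quoted example. */
    static String content(String value) {
        return String.join("\n",
                "/Tx BMC",                      // start of the marked content for the text field
                "q",
                "1 1 198 48 re",                // clip rectangle (assumed, from the example)
                "W",
                "n",
                "BT",
                "/Helv 12 Tf",                  // font and size from the default appearance
                "2 20.692 Td",                  // text offset (assumed, from the example)
                "(" + escape(value) + ") Tj",   // the field value itself
                "ET",
                "Q",
                "EMC");
    }

    public static void main(String[] args) {
        System.out.println(content("Sample field"));
    }
}
```

Calling content with a different value changes only the Tj line, which is the part a viewer regenerates when the user edits the field. The escaping covers plain ASCII text only and ignores font encoding.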

> 
> Really all I want from the appearance stream is to specify the font
> and bounds and offset such that the text is in exactly the same place
> on every viewer and the same font size fits into the available space
> on every viewer.

The font (/Helv in your example) and font size (12 in your example) are part of the field's properties and are specified in the default appearance string. They make it into the appearance stream. Unfortunately, settings such as bounds and offset are not part of the PDF specification but are implementation-specific, so you might get different results with different viewers.

One thing you should also ensure is that the fonts used for form filling are embedded in the PDF so that all viewers can use the same font.


> 
> And one more question -- the document has let's say 200 fields, and I
> only populate maybe 50 of them.  If I specify appearance streams for
> *only* the ones I populate, will that work?  If that's the case,
> should /NeedAppearances be set to true or false?

If appearance streams are defined for the filled-out fields, you should set /NeedAppearances to false (or remove the key completely).

BR
Maruan

> 
> Thanks,
>       Aaron