Posted to users@pdfbox.apache.org by Kevin Ternes <KT...@thegeneral.com> on 2016/04/28 19:39:25 UTC
Editing Text again
So I have a bunch of source PDFs that I use PDFBox 2.0.0 to fill out and sometimes edit.
Specifically, for certain business cases I remove or update the text "(signed by Named Insured)".
I edit using a method similar to the one over on Stack Overflow, https://stackoverflow.com/questions/35420609/pdfbox-2-0-rc3-find-and-replace-text
However, if a PDF has been edited in _Acrobat_, for example changing the text to "(Signed by Named Insured.)" with a capital S and an added period, the method can no longer find the target text, even when I make the corresponding changes in my method call.
Using PDFDebugger, I see that this:
0.699 0.676 0.639 0.747 k
/TT1 8 Tf
0.539 -10.877 Td
(\(signed by Named Insured\)) Tj
0.698 0.675 0.639 0.74 k
/TT1 9.96 Tf
-0.87 -27.115 Td
has been changed to this:
0.699 0.676 0.639 0.747 k
/TT1 8 Tf
0.539 -10.877 Td
(\() Tj
/C2_2 8 Tf
(\0006) Tj
/TT1 8 Tf
1 0 0 1 113.02 381.017 Tm
(igned by Named Insured) Tj
/C2_2 8 Tf
87.164 0 Td
(\000\021) Tj
/TT1 8 Tf
(\)) Tj
0.698 0.675 0.639 0.74 k
/TT1 9.96 Tf
-96.573 -27.115 Td
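And it is obvious why the method will no longer work: the phrase is now split across five Tj operators, and the "S" and the "." are drawn with a second font, /C2_2, using two-byte codes instead of literal characters.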
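To make the failure concrete, here is a minimal sketch in plain Java (no PDFBox involved) that extracts the literal-string operands of the Tj operators from the two snippets above and checks whether any single operand contains the search phrase. The class and method names are hypothetical, and the regex-based parsing is a simplification assumed for illustration: it handles only simple literal strings with backslash escapes, not hex strings, TJ arrays, or nested unescaped parentheses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TjOperandDemo {
    // Content before the Acrobat edit (from the PDFDebugger dump above).
    static final String ORIGINAL = "(\\(signed by Named Insured\\)) Tj";

    // Content after the Acrobat edit (same dump, color operators omitted).
    static final String EDITED =
          "(\\() Tj\n/C2_2 8 Tf\n(\\0006) Tj\n/TT1 8 Tf\n"
        + "1 0 0 1 113.02 381.017 Tm\n(igned by Named Insured) Tj\n"
        + "/C2_2 8 Tf\n87.164 0 Td\n(\\000\\021) Tj\n/TT1 8 Tf\n(\\)) Tj";

    // "( body ) Tj", where body is made of backslash escapes or plain characters.
    private static final Pattern TJ =
        Pattern.compile("\\(((?:\\\\.|[^\\\\()])*)\\)\\s*Tj");

    /** Returns the unescaped operand of every Tj operator in the snippet. */
    static List<String> tjOperands(String content) {
        List<String> operands = new ArrayList<>();
        Matcher m = TJ.matcher(content);
        while (m.find()) {
            operands.add(unescape(m.group(1)));
        }
        return operands;
    }

    /** Resolves backslash escapes: one to three octal digits, or the next character as-is. */
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 == s.length()) {
                sb.append(c);
                continue;
            }
            char next = s.charAt(++i);
            if (next >= '0' && next <= '7') {
                int value = next - '0';
                int digits = 1;
                while (digits < 3 && i + 1 < s.length()
                        && s.charAt(i + 1) >= '0' && s.charAt(i + 1) <= '7') {
                    value = value * 8 + (s.charAt(++i) - '0');
                    digits++;
                }
                sb.append((char) value);
            } else {
                sb.append(next); // covers \( \) and \\
            }
        }
        return sb.toString();
    }

    /** True if some single Tj operand contains the phrase. */
    static boolean anyOperandContains(String content, String phrase) {
        for (String operand : tjOperands(content)) {
            if (operand.contains(phrase)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(anyOperandContains(ORIGINAL, "(signed by Named Insured)"));
        System.out.println(anyOperandContains(EDITED, "(Signed by Named Insured.)"));
        System.out.println(tjOperands(EDITED).size() + " separate Tj operands after the edit");
    }
}
```

Even joining the five operands does not help here, because the operands shown with /C2_2 are two-byte codes (0x0036 and 0x0011) rather than the characters "S" and ".". Turning such codes back into text requires the font's own encoding information, so a byte-level search of the content stream cannot match them.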
Has anyone any suggestions on how to programmatically deal with this?
Or is there a setting in Acrobat that I can use to tell it to stop doing this crap?!
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Editing Text again
Posted by Tilman Hausherr <TH...@t-online.de>.
On 28.04.2016 at 19:39, Kevin Ternes wrote:
> So I have a bunch of source PDFs that I use PDFBox 2.0.0 to fill out and sometimes edit.
> Specifically, for certain business cases I remove or update the text "(signed by Named Insured)".
> I edit using a method similar to the one over on Stack Overflow, https://stackoverflow.com/questions/35420609/pdfbox-2-0-rc3-find-and-replace-text
See https://pdfbox.apache.org/2.0/migration.html, section "Why was the ReplaceText example removed?"
What you could do instead is draw a blank (background-colored) rectangle over the old text and put your new text on top. However, the old text would still be in the content stream and would still show up in text extraction.
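A minimal sketch of that cover-and-overwrite approach, assuming the PDFBox 2.0.x API. The class name, method name, and parameters are hypothetical; the rectangle position and size must be worked out per document, and Helvetica is only a stand-in for whatever font the form really uses.

```java
import java.awt.Color;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;

final class CoverAndRewrite {
    /**
     * Paints a white rectangle over the given area and writes new text on top.
     * Coordinates are in PDF user space (origin at the bottom left of the page).
     */
    static void coverAndWrite(PDDocument doc, PDPage page,
                              float x, float y, float width, float height,
                              String newText, float fontSize) throws IOException {
        // Arguments: appendContent = true, compress = true, resetContext = true.
        // Resetting the context keeps earlier transformations in the page from
        // shifting the overlay.
        try (PDPageContentStream cs =
                 new PDPageContentStream(doc, page, true, true, true)) {
            // Hide the old text behind an opaque white box.
            cs.setNonStrokingColor(Color.WHITE);
            cs.addRect(x, y, width, height);
            cs.fill();

            // Draw the replacement text slightly above the bottom edge of the box.
            cs.setNonStrokingColor(Color.BLACK);
            cs.beginText();
            cs.setFont(PDType1Font.HELVETICA, fontSize);
            cs.newLineAtOffset(x, y + 2);
            cs.showText(newText);
            cs.endText();
        }
    }
}
```

This only hides the old text visually. As noted, it stays in the page content, so anything that extracts text will see both the old and the new wording.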
Tilman