Posted to users@pdfbox.apache.org by A H <od...@gmail.com> on 2020/07/04 23:08:52 UTC

Can you help me? I need to combine 2 PDF files and order them based on a text included on each page of the PDF files. Is it possible to do this using PDFBox?

Hello,

Can you help me? I need to combine 2 PDF files and order them based on a
text included on each page of the PDF files. Is it possible to do this
using PDFBox?

In each page of the PDF files there is an order number. It is the numerical
value to the right of the letter P (please see the PDF samples attached).
That number appears in the same position on all pages, both in order forms
and in checklists. The order number is usually purely numeric; only very
rarely does it end with a letter.

For each order form there is at least one checklist. Checklists and order
forms can have more than one page, although normally they have only one.
When there is more than one page, the additional pages always carry the
order number in the same position, as shown in the example PDFs.

Can you help me with an example of how to combine the two PDF files using
the order number as the sorting key? I have some experience in C# and Java,
but I have never manipulated PDF files using PDFBox.

Thank you,
-- 
Aldo

Re: Can you help me? I need to combine 2 PDF files and order them based on a text included on each page of the PDF files. Is it possible to do this using PDFBox?

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

Didn't you ask this on Stack Overflow?

I suggested there that you run ExtractText. Because your files reached me
through moderation, I was able to do that myself, and the text doesn't
extract. Even in Adobe Reader, when I try to copy text from the order
sample, I get "􀀒􀀓􀀔􀀕􀀔􀀖􀀗". The checklist file behaves the same way.

So you can't do this unless the person who gave you these files generates
them properly, so that the text can be extracted.

Once that is done, use the Splitter class, or create a new PDDocument and
call addPage() to add pages from the old documents. Don't close any of the
documents until all results have been saved, because the copied pages share
resources with their source documents.
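
The ordering step itself can be sketched independently of PDFBox. The
following is only a minimal sketch under stated assumptions: the text of
each page has already been extracted as a String (for example with
PDFTextStripper, limited to one page at a time via setStartPage() and
setEndPage()), and the order number looks like "P", optional whitespace,
digits, and at most one trailing letter. The class PageOrderPlanner, its
PageRef type and the regular expression are hypothetical names and guesses
made for illustration, not part of PDFBox and not taken from the sample
files.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of PDFBox): plans the merged page order. */
final class PageOrderPlanner {

    // Assumed format: the letter P, optional whitespace, digits, and at most
    // one trailing letter. Adjust this to the real layout of the forms.
    private static final Pattern ORDER_NUMBER =
            Pattern.compile("\\bP\\s*(\\d+)([A-Za-z]?)\\b");

    /** A page of one source file together with its parsed order number. */
    static final class PageRef {
        final int source;     // 0 = order forms file, 1 = checklists file
        final int pageIndex;  // zero-based page index within that file
        final long number;    // numeric part of the order number
        final String suffix;  // optional trailing letter, "" if absent

        PageRef(int source, int pageIndex, long number, String suffix) {
            this.source = source;
            this.pageIndex = pageIndex;
            this.number = number;
            this.suffix = suffix;
        }
    }

    /** Finds the first order number in the extracted text of one page. */
    static PageRef parse(int source, int pageIndex, String pageText) {
        Matcher m = ORDER_NUMBER.matcher(pageText);
        if (!m.find()) {
            throw new IllegalArgumentException(
                    "No order number on page " + pageIndex + " of source " + source);
        }
        return new PageRef(source, pageIndex, Long.parseLong(m.group(1)), m.group(2));
    }

    /**
     * Returns all pages of both files sorted by order number. Within one
     * order, order form pages come before checklist pages, and pages of the
     * same file keep their original relative order.
     */
    static List<PageRef> plan(List<String> orderFormPages, List<String> checklistPages) {
        List<PageRef> refs = new ArrayList<>();
        for (int i = 0; i < orderFormPages.size(); i++) {
            refs.add(parse(0, i, orderFormPages.get(i)));
        }
        for (int i = 0; i < checklistPages.size(); i++) {
            refs.add(parse(1, i, checklistPages.get(i)));
        }
        refs.sort(Comparator.comparingLong((PageRef r) -> r.number)
                .thenComparing((PageRef r) -> r.suffix)
                .thenComparingInt((PageRef r) -> r.source)
                .thenComparingInt((PageRef r) -> r.pageIndex));
        return refs;
    }
}
```

With such a plan in hand, each entry would correspond to one addPage() call
on the new PDDocument, passing getPage(pageIndex) of the matching source
document, followed by a single save and only then closing all documents.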

Tilman
