Posted to users@pdfbox.apache.org by Ru...@in.ey.com on 2011/11/04 06:36:41 UTC

Fw: Functionality in PDFBOX

Hi,
I need a clarification. I have a requirement to merge PDF documents, which I 
managed to do with PDFBox. My question is: can we merge the PDF files based 
on a bookmark, or at the beginning of the file?
I also know we can split documents, but can the split be based on 
bookmarks, or take some field value into consideration?
And can bookmarks be added at a relative position?
I have looked through various issues in Jira and searched the website, 
samples, etc., but could not find anything specific to these tasks. Your 
suggestions would be of great help.
We have some requirements that depend on this and need to decide at the 
earliest. Thanks.


Regards, 
Rubesh 
----- Forwarded by Rubesh M Xavier/AABS/GSS/ErnstYoung/IN on 11/04/2011 
11:01 AM -----

From: "Jeremias Maerki (Resolved) (JIRA)" <ji...@apache.org>
To: rubesh.xavier@in.ey.com
Date: 11/03/2011 07:36 PM
Subject: [jira] [Resolved] (PDFBOX-1158) Functionality in PDFBOX




     [ https://issues.apache.org/jira/browse/PDFBOX-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremias Maerki resolved PDFBOX-1158.
-------------------------------------

    Resolution: Invalid

Please don't use JIRA to ask questions. Questions should be sent to 
users@pdfbox.apache.org.

1a. Apache PDFBox cannot directly create PostScript from PDF. But it 
supports painting PDF pages to Graphics2D objects. That means you can use 
the print function to print a PDF to a PostScript printer driver (set to 
output to a file if needed). Or you can use PSDocumentGraphics2D from 
Apache XML Graphics Commons to create PostScript files but the former is 
easier. http://pdfbox.apache.org/commandlineutilities/PrintPDF.html can 
serve as a starting point.

1b. Apache PDFBox cannot create PDF from PostScript because it lacks a 
complete PostScript interpreter to start with. You may need to look at 
GhostScript for that functionality (available under the GPL or a 
commercial license). GhostScript can do PDF->PS, too.

2. See http://pdfbox.apache.org/commandlineutilities/PDFMerger.html

Any follow-up questions to users@pdfbox.apache.org, please.
 
> Functionality in PDFBOX
> -----------------------
>
>                 Key: PDFBOX-1158
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1158
>             Project: PDFBox
>          Issue Type: Test
>            Reporter: Rubesh MX
>              Labels: Feature
>
> Hi, I want to know if the following features are possible with PDFBox:
> 1. Convert PDF to PostScript and vice versa.
> 2. Append a document at the beginning of the file, and append a document 
> as per a bookmark.
> Could you please confirm? I could not find any details on the website or 
> in the samples.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA 
administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

 




The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it.   It may contain confidential or legally privileged information.   If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt.

Re: Functionality in PDFBOX

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi Rubesh,

It's possible to merge complete PDFs or only selected pages. I think you have two ways of doing this:

a) Merge everything into your 'master' document. To influence the order of pages you have to rebuild the page tree accordingly (I think the merge samples provided only append to the end), or
b) Merge both your 'master' and child documents into a new document. This way you can import selected pages as you go, based on your criteria.

I think b) might be easier to accomplish.
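
As a rough illustration of option b), the page order of the new document can be planned before any page is copied. The sketch below is plain Java and does not call PDFBox; the class MergePlan and the method insertBefore are hypothetical names, and the list elements stand in for whatever page objects you take from the source documents. It assumes the whole child document goes in at a single 0-based position.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: plans the page order of a new document (option b). */
final class MergePlan {
    private MergePlan() {}

    /**
     * Returns the page order obtained by inserting all child pages before
     * the master page at index insertAt (0-based).
     * insertAt == 0 puts the child first; insertAt == master.size() appends it.
     */
    static <P> List<P> insertBefore(List<P> master, List<P> child, int insertAt) {
        if (insertAt < 0 || insertAt > master.size()) {
            throw new IndexOutOfBoundsException("insertAt out of range: " + insertAt);
        }
        List<P> merged = new ArrayList<>(master.size() + child.size());
        merged.addAll(master.subList(0, insertAt));             // master pages before the insertion point
        merged.addAll(child);                                   // the whole child document
        merged.addAll(master.subList(insertAt, master.size())); // remaining master pages
        return merged;
    }
}
```

With the planned order in hand, the pages would be added to the new document one by one in that sequence; which PDFBox call does the copying depends on the version you use, so check its javadoc.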

To locate the pages, you will need to find out either where the bookmark points to or where the selected field is located. PDFBox has the necessary code to enable you to do this.

So in short, it's possible with PDFBox.

Regards

Maruan



Re: Functionality in PDFBOX

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Hi Rubesh

On 04.11.2011 06:36:41 Rubesh.Xavier wrote:
> Hi,
> I want a clarification. I have a requirement to merge the pdf documents, I 
> managed to do that via PDFBox, but the question is can we merge the pdf 
> files based on the bookmark or at the beginning of the file?

I don't know what you mean by "based on the bookmark". Would you please
explain in different words? If you want to insert your PDF at the
beginning, rather than at the end, then you simply have to switch the
order in which you merge your PDFs.

> Also I know we can split the documents, but can this be done again based 
> on bookmarks or taking into consideration some field value?

Of course, you can traverse the bookmark tree (called document outline
in PDF) and find the destination page.

- From the PDDocument, get to the catalog using getDocumentCatalog()
- From there get to the document outline using getDocumentOutline()
- From there you can navigate through the tree with
getFirstChild()/getLastChild() and
PDOutlineItem.getNextSibling()/getPreviousSibling()
- On PDOutlineItem, you can call findDestinationPage(PDDocument) to find
the page you're looking for.
- Then split according to the extracted page indices.

(The PrintBookmarks example might help you here)
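
The last step above, splitting according to the extracted page indices, could be sketched as follows. This is plain Java and does not call PDFBox: it assumes you have already walked the outline and collected the 0-based index of each bookmark's destination page (how you map a destination page to its index depends on your PDFBox version). BookmarkSplit and ranges are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper: turns bookmark page indices into page ranges to split on. */
final class BookmarkSplit {
    private BookmarkSplit() {}

    /**
     * Returns half-open ranges {start, end} of 0-based page indices.
     * Each bookmark page starts a new part; pages before the first
     * bookmark form a leading part of their own.
     */
    static List<int[]> ranges(List<Integer> bookmarkPages, int pageCount) {
        if (pageCount <= 0) {
            throw new IllegalArgumentException("pageCount must be positive");
        }
        TreeSet<Integer> starts = new TreeSet<>();
        starts.add(0); // the first part always begins at the first page
        for (int page : bookmarkPages) {
            if (page < 0 || page >= pageCount) {
                throw new IllegalArgumentException("bookmark page out of range: " + page);
            }
            starts.add(page); // TreeSet keeps the starts sorted and drops duplicates
        }
        List<int[]> result = new ArrayList<>();
        int previous = -1;
        for (int start : starts) {
            if (previous >= 0) {
                result.add(new int[] {previous, start});
            }
            previous = start;
        }
        result.add(new int[] {previous, pageCount}); // last part runs to the end
        return result;
    }
}
```

Each range then describes one output document: copy the pages from start (inclusive) to end (exclusive) into a new document and save it.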

If by "field value" you mean fields from an AcroForm, I guess that can
be done, too, but I imagine this could be a bit more complicated.

> Also add bookmarks at relative position?

You simply have to keep track of the pages you add and update the
document outline accordingly. Take a look at the sources or javadoc of
the package org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline
and at the CreateBookmarks example.
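
To illustrate the bookkeeping only: if bookmarks are tracked as title plus 0-based target page, inserting pages shifts every target at or after the insertion point. The sketch below is plain Java and does not call PDFBox; BookmarkShift and afterInsert are hypothetical names, and the map is an assumed stand-in for whatever structure you use before writing the real outline items.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: recomputes bookmark target pages after an insertion. */
final class BookmarkShift {
    private BookmarkShift() {}

    /**
     * Returns a copy of the bookmarks (title -> 0-based page index) as they
     * should point after insertedPages pages were inserted before insertAt.
     */
    static Map<String, Integer> afterInsert(Map<String, Integer> bookmarks,
                                            int insertAt, int insertedPages) {
        if (insertAt < 0 || insertedPages < 0) {
            throw new IllegalArgumentException("insertAt and insertedPages must not be negative");
        }
        Map<String, Integer> shifted = new LinkedHashMap<>(); // keeps the original bookmark order
        for (Map.Entry<String, Integer> entry : bookmarks.entrySet()) {
            int page = entry.getValue();
            // Targets at or after the insertion point move down by the inserted page count.
            shifted.put(entry.getKey(), page >= insertAt ? page + insertedPages : page);
        }
        return shifted;
    }
}
```

A bookmark for the newly inserted pages would then simply point at insertAt, and the updated indices are what you write back into the document outline.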


Jeremias Maerki