Posted to users@pdfbox.apache.org by karthick g <ik...@gmail.com> on 2017/04/03 06:52:17 UTC
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Hi team,
Here are the Dropbox links for the problematic PDF file.
PDF File
---------------
https://www.dropbox.com/s/hzkzfowl29b62fy/corrupt.pdf?dl=0
Screen Shot when loading the PDF file
------------------------------------------------------
https://www.dropbox.com/s/h9x5wad9dgwxnxt/pdf-error.png?dl=0
Screen Shot when pressing the CTRL key
-------------------------------------------------------------
https://www.dropbox.com/s/qpkztacdfx6hwep/ctrl-key-error-pdf.png?dl=0
Please let me know if you need more information.
Regards,
karthick G
On Fri, Mar 31, 2017 at 10:51 AM, karthick g <ik...@gmail.com> wrote:
> Hi Team,
>
> Apologies for sending the previous mail to the developer list. Please guide me.
>
> * There was a problem when loading a PDF file into Adobe Acrobat.
> * Acrobat showed the error "There was a problem reading this document (18)".
> * When the same file is run through PDFBox, it works fine.
> * Is there a way to catch this document (18) error in PDFBox while parsing?
>
> Regards,
> karthick
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
Yes, your file is corrupt. PDFBox does catch it, though only when the page content is processed, as this log shows:
Warning [PDFStreamEngine] Unexpected object type: org.apache.pdfbox.cos.COSDictionary
java.io.IOException: Unexpected object type: org.apache.pdfbox.cos.COSDictionary
    at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:62)
    at org.apache.pdfbox.pdmodel.PDResources.getXObject(PDResources.java:409)
    at org.apache.pdfbox.contentstream.operator.graphics.DrawObject.process(DrawObject.java:53)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:853)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:509)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:158)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:221)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:147)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:69)
    at org.apache.pdfbox.debugger.pagepane.PagePane$RenderWorker.doInBackground(PagePane.java:290)
    at org.apache.pdfbox.debugger.pagepane.PagePane$RenderWorker.doInBackground(PagePane.java:259)
    at javax.swing.SwingWorker$1.call(SwingWorker.java:295)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at javax.swing.SwingWorker.run(SwingWorker.java:334)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
The problem is at Root/Pages/Kids/[0]/Resources/XObject/Im1, at
position 504.907 663.76, probably a missing logo at the top right. This
entry should be a stream (an image or form XObject), but it contains
only a plain dictionary with one item, "nums". PDFBox somehow survives
this.
However, there must be another problem: I was able to fix the problem
described above, but still can't view the file with Adobe Reader.
As for whether one could catch it: sadly, so many people have asked us
to display all sorts of corrupt files, with the argument "but it is
displayed by Adobe!", that we are very lenient.
Tilman
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Posted by Tilman Hausherr <TH...@t-online.de>.
I ran the PDFDebugger command-line application on your PDF file. It has
a log window (click on the bottom right) where the warning appears.
Alternatively, render the file programmatically.
Tilman
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Posted by karthick g <ik...@gmail.com>.
Hi,
Thanks for your support. Based on your suggestion, I have now moved to
the latest PDFBox version, 2.0.5.
I still cannot get the error while parsing. I have even set the parser's
lenient option to false in order to catch the error, but the file parses
fine and no error is reported for the malformed PDF. Could you send a
code snippet or explain how you got the error?
Regards,
karthick G
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Posted by Tilman Hausherr <TH...@t-online.de>.
Am 06.04.2017 um 10:51 schrieb karthick g:
> Hi,
>
> Thanks for your support. I am running PDFBox 1.8.2.
>
> I parsed the file with FileInputStream object.
>
> *FileInputStream fs; try { fs = new
> FileInputStream("/home/karthik/5.56.0/corrupt.pdf");
> PDFParser pdfParser = new PDFParser(fs); pdfParser.parse();*
>
>
> Am not getting Unexpected object type exception as you mentioned. I parsed
> in many ways I can not able to produce the exception as you got. It is
> parsing fine for me, that is my problem.
> Please let me know if you need more information
You are using an outdated version. 1.8.2 is from June 2013. Contact the
website where you got it and tell them that we're now at 1.8.13 / 2.0.5.
The current version, 2.0.5, does throw an exception internally, but it
does not propagate to the caller; PDFBox recovers, so you would only see
it in the log output.
What you seem to want is a "strict" PDFBox version, i.e. one that fails
on everything not fully compliant with the PDF specification. For this,
you would have to fork the project and remove everything that is
lenient, which would require a thorough code review.
Tilman
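The reply above notes that PDFBox 2.0.5 reports this error only through logging. One possible way to act on that from code is to attach a log handler and treat any collected warning as a sign of a damaged file. The following is a minimal sketch, not PDFBox's own mechanism. It assumes that PDFBox's logging (Apache Commons Logging) ends up in java.util.logging, which is expected when no other backend such as Log4j is configured, and that the loggers are named after the PDFBox classes. The WarningCollector class and its attachTo helper are hypothetical names, and the warning in main is simulated so that the example runs without PDFBox on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Collects WARNING-or-higher log records so a caller can treat them as failures. */
class WarningCollector extends Handler {
    private final List<String> messages = new ArrayList<String>();

    @Override
    public synchronized void publish(LogRecord record) {
        // Keep only warnings and errors; ignore informational output.
        if (record.getLevel().intValue() >= Level.WARNING.intValue()) {
            messages.add(record.getMessage());
        }
    }

    @Override
    public void flush() { }

    @Override
    public void close() { }

    /** Returns a copy of the messages collected so far. */
    public synchronized List<String> getMessages() {
        return new ArrayList<String>(messages);
    }

    /** Hypothetical helper: attaches a new collector to the given logger. */
    static WarningCollector attachTo(Logger logger) {
        WarningCollector collector = new WarningCollector();
        logger.addHandler(collector);
        return collector;
    }

    public static void main(String[] args) {
        // Assumption: PDFBox loggers sit below "org.apache.pdfbox" in the
        // java.util.logging hierarchy, so a handler here sees all of them.
        Logger pdfboxRoot = Logger.getLogger("org.apache.pdfbox");
        WarningCollector collector = attachTo(pdfboxRoot);

        // Stand-in for rendering a page with PDFBox: simulate the warning
        // quoted in this thread instead of calling the library.
        Logger.getLogger("org.apache.pdfbox.contentstream.PDFStreamEngine")
              .warning("Unexpected object type: org.apache.pdfbox.cos.COSDictionary");

        if (!collector.getMessages().isEmpty()) {
            System.out.println("Treating file as corrupt: " + collector.getMessages());
        }
        pdfboxRoot.removeHandler(collector);
    }
}
```

In a real program the simulated warning would be replaced by the rendering call, with the collector attached before rendering starts and removed afterwards. Warnings are not always fatal, so whether every warning should count as corruption is a policy decision for the caller.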
Re: There was a problem when loading a pdf file to Adobe Acrobat. document(18)
Posted by karthick g <ik...@gmail.com>.
Hi,
Thanks for your support. I am running PDFBox 1.8.2.
I parsed the file with a FileInputStream object:
    FileInputStream fs;
    try {
        fs = new FileInputStream("/home/karthik/5.56.0/corrupt.pdf");
        PDFParser pdfParser = new PDFParser(fs);
        pdfParser.parse();
I am not getting the "Unexpected object type" exception you mentioned. I
have parsed the file in several ways but cannot reproduce the exception
you got. It parses fine for me, and that is my problem.
Please let me know if you need more information.
Regards,
Karthick G