Posted to users@pdfbox.apache.org by Ga...@sungard.com on 2015/03/06 21:05:48 UTC
PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
Hello,
I am getting a PDFParser error, caused by: org.apache.pdfbox.exceptions.WrappedIOException
The complete stack trace is at the following link:
( http://apaste.info/DRD )
I am trying to import a 4 GB PDF into Solr using Tika. I was able to import files up to 500 MB.
Please suggest a workaround, if there is one.
Thanks
G
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
Posted by John Hewson <jo...@jahewson.com>.
> On 6 Mar 2015, at 12:05, Ganesh.Yadav@sungard.com wrote:
>
> Hello,
> I am getting PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
> Complete stack trace is on the following link.
> ( http://apaste.info/DRD )
>
> I am trying to import 4GB Long PDF using Tika into Solr. I was able to import up to 500MB.
Just checking - you gave java at least 4GB of heap, right?
— John
> Please suggest if there is any workaround.
>
> Thanks
> G
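
John's heap question can be checked from inside the JVM. The following is a minimal sketch, not part of the original thread: the class name HeapCheck and the helper heapCovers are made up for illustration, and comparing the heap limit with the file size is only a rough lower bound, since a parser usually needs more memory than the raw file size.

```java
import java.io.File;

// Hypothetical helper, not part of PDFBox or Tika.
class HeapCheck {

    /** Rough check: is the heap limit at least as large as the file? */
    static boolean heapCovers(long maxHeapBytes, long fileBytes) {
        return maxHeapBytes >= fileBytes;
    }

    public static void main(String[] args) {
        // maxMemory() reports the limit set with -Xmx (or the JVM default).
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (maxHeap >> 20) + " MB");

        if (args.length > 0) {
            long fileBytes = new File(args[0]).length();
            System.out.println("File size: " + (fileBytes >> 20) + " MB");
            System.out.println(heapCovers(maxHeap, fileBytes)
                    ? "Heap limit is at least the file size."
                    : "Heap limit is smaller than the file; consider raising -Xmx.");
        }
    }
}
```

The check has to run in the same JVM, with the same options, as the process that does the parsing for the number to be meaningful.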
Re: PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
Posted by Tilman Hausherr <TH...@t-online.de>.
Sorry, wrong links, use these:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/1.8.9-SNAPSHOT/
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/
Tilman
On 07.03.2015 at 14:21, Tilman Hausherr wrote:
> The best would be to test whether that file can be handled by newer
> versions of PDFBox (1.8.9 and 2.0)
>
> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/1.8.9-SNAPSHOT/
>
> https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.0-SNAPSHOT/
>
>
> Download the jar files and, for each one:
>
> - run java -jar <jarfile> ExtractText <yourfile>
> - see what happens
> - tell us the result
>
> Your paste indicates a problem in RandomAccessBuffer.java.
>
> Tilman
Re: PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
Posted by Tilman Hausherr <TH...@t-online.de>.
The best would be to test whether that file can be handled by newer
versions of PDFBox (1.8.9 and 2.0)
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/1.8.9-SNAPSHOT/
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox/2.0.0-SNAPSHOT/
Download the jar files and, for each one:
- run java -jar <jarfile> ExtractText <yourfile>
- see what happens
- tell us the result
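
The test run above can also be scripted. This is a minimal sketch under assumptions: the class ExtractTextCommand and its build method are hypothetical, and the "6g" heap value is only an example that has to fit the machine. The ExtractText command comes from the message itself, and -Xmx is the standard JVM option for the maximum heap size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper around the command line given in the message.
class ExtractTextCommand {

    /** Builds: java -Xmx<maxHeap> -jar <jarFile> ExtractText <pdfFile> */
    static List<String> build(String jarFile, String pdfFile, String maxHeap) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx" + maxHeap);   // maximum heap for the child JVM
        cmd.add("-jar");
        cmd.add(jarFile);
        cmd.add("ExtractText");
        cmd.add(pdfFile);
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: ExtractTextCommand <jarfile> <yourfile>");
            return;
        }
        List<String> cmd = build(args[0], args[1], "6g"); // assumed heap size
        System.out.println(String.join(" ", cmd));
        // Run the command and pass its output and errors straight through.
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        System.exit(p.waitFor());
    }
}
```

Running it once per jar file makes it easy to compare how the two versions behave on the same input.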
Your paste indicates a problem in RandomAccessBuffer.java.
Tilman
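
The thread does not say why RandomAccessBuffer fails, so the following is only an illustration of a general Java constraint that any in-memory buffer for a 4 GB file has to work around: array lengths and indices are int values, and 4 GB does not fit in an int. The class and method names are hypothetical, and nothing here reproduces PDFBox code.

```java
// Hypothetical illustration of int limits; not PDFBox code.
class FourGigabyteLimits {

    static final long FOUR_GB = 4L * 1024 * 1024 * 1024; // 4294967296 bytes
    static final long FIVE_HUNDRED_MB = 500L * 1024 * 1024;

    /** A single Java array cannot hold more than Integer.MAX_VALUE elements. */
    static boolean fitsInSingleArray(long sizeBytes) {
        return sizeBytes <= Integer.MAX_VALUE;
    }

    /** Converts an offset to int, failing loudly instead of wrapping around. */
    static int toIntOffset(long offset) {
        return Math.toIntExact(offset); // throws ArithmeticException on overflow
    }

    public static void main(String[] args) {
        System.out.println("500 MB fits in one array: " + fitsInSingleArray(FIVE_HUNDRED_MB));
        System.out.println("4 GB fits in one array:   " + fitsInSingleArray(FOUR_GB));
        // A plain cast silently drops the high bits of the offset.
        System.out.println("(int) 4 GB offset = " + (int) FOUR_GB);
    }
}
```

Whether this limit is what the stack trace shows would need to be confirmed against the trace and the PDFBox version in use.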
On 06.03.2015 at 21:05, Ganesh.Yadav@sungard.com wrote:
> Hello,
> I am getting PDFParser Error Caused by: org.apache.pdfbox.exceptions.WrappedIOException
> Complete stack trace is on the following link.
> ( http://apaste.info/DRD )
>
> I am trying to import 4GB Long PDF using Tika into Solr. I was able to import up to 500MB.
>
>
> Please suggest if there is any workaround.
>
> Thanks
> G