Posted to dev@pdfbox.apache.org by "Tom Samplonius (JIRA)" <ji...@apache.org> on 2013/07/09 01:47:48 UTC

[jira] [Commented] (PDFBOX-1647) Not able to parse PDF files using PHP and Linux

    [ https://issues.apache.org/jira/browse/PDFBOX-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13702638#comment-13702638 ] 

Tom Samplonius commented on PDFBOX-1647:
----------------------------------------

Executing external programs such as PDFBox from within PHP on a shared web host will never be reliable.  There are many things that can go wrong:  shared providers often use suEXEC, which heavily restricts what the executed program can do; some shared hosts completely disable executing files; the shared host may not have Java installed; the shared host may have applied memory limits that prevent the JVM, and therefore PDFBox, from starting.  Plus, your shared host probably won't provide SSH access to the server either, so you will be completely blind if things aren't working.  And shared hosts may or may not give you access to the PHP logs.  

You should use a VPS provider instead.
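
If you want to confirm which of the problems above applies before moving hosts, a small PHP script can check most of them without shell access.  The following is only a rough sketch: the function names (exec_is_available, run_and_capture, diagnose) are made up for illustration and are not part of PDFBox or of the PHP wrapper in the report, and it assumes a Linux host where exec() runs commands through a POSIX shell.

```php
<?php
// Hypothetical diagnostic helpers; none of these names come from PDFBox
// or from the PHP wrapper used in the report.

// True if exec() exists and is not listed in the disable_functions ini setting.
function exec_is_available()
{
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));
    return function_exists('exec') && !in_array('exec', $disabled, true);
}

// Run a command with stderr merged into stdout, so error messages are
// visible even when the host gives no access to logs.
function run_and_capture($command)
{
    $output = array();
    $status = -1;
    exec($command . ' 2>&1', $output, $status);
    return array('status' => $status, 'output' => implode("\n", $output));
}

// Collect the checks into one report.
function diagnose()
{
    if (!exec_is_available()) {
        return array('exec' => false);
    }
    $java = run_and_capture('java -version');
    // "ulimit -v" is a shell builtin; it reports the virtual memory limit
    // (in kilobytes, or "unlimited") of the shell that exec() spawns.
    $limit = run_and_capture('ulimit -v');
    return array(
        'exec'        => true,
        'java_found'  => $java['status'] === 0,
        'java_output' => $java['output'],
        'ulimit_v'    => $limit['output'],
    );
}

print_r(diagnose());
```

If 'exec' comes back false, or 'java_found' is false, no amount of changes to the PHP code will help on that host.  A small 'ulimit_v' value would suggest that the JVM cannot reserve enough memory to start.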
                
> Not able to parse PDF files using PHP and Linux
> -----------------------------------------------
>
>                 Key: PDFBOX-1647
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1647
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.2
>         Environment: PHP and Linux Platform
>            Reporter: Rajiv Maity
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Hi, I am trying to extract text using pdfbox-app-1.8.2.jar. It works on a Windows XP PC, but when I run the same program on a Go Daddy Linux server, it does not work, i.e. it does not extract the text from the PDF. Below is the program that I am using to extract text from PDF files.
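>     // Load the PHP wrapper and point it at the PDFBox application jar.
>     require_once 'pdf_lib/pdf_box/PDFBox/ExtractText.php';
>     $jar = "pdf_lib/pdf_box/lib/pdfbox-app-1.8.2.jar";
>     $pdf_box = new PDFBox\PDFBox($jar);
>     $extract_text = new PDFBox\ExtractText($pdf_box);
>     // $pdfLocation holds the path to the PDF; it is set elsewhere, not in this snippet.
>     $extract_text->parse($pdfLocation);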

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira