Posted to users@pdfbox.apache.org by Max Gravitt <mg...@me.com> on 2010/06/30 03:38:59 UTC

Using PDFBox with Google Apps Engine

Hi,

I'm using PDFBox locally with no problems, but I can't get it to work on GAE.  After searching around, it appears that GAE restricts the use of the AWT Rectangle class and, of course, writing temp files to the file system.  Is there a version of PDFBox that works with GAE?  All I need is to extract the text of a PDF file, without regard to page or position.  Is this possible?

thanks!
Max

Re: Using PDFBox with Google Apps Engine

Posted by Max Gravitt <mg...@me.com>.
Ok, great!  Can you send me the new jar file or the changes you made to those two files?

I'm not set up to recompile PDFBox, but I can do that if you only have the source files and not a jar.  For example, I'm not sure what changes you made to Rectangle.

Thanks!
Max

Max W. Gravitt
919.673.8460 (txt msg welcome)

On Jun 30, 2010, at 10:20 AM, Fabrizio Accatino <fh...@gmail.com> wrote:

> You have to modify the source code of PdfBox.  I have done that. I had the
> same requirement: text extraction from pdf on GAE.
> 
> From my blog:
> http://fhtino.blogspot.com/2010/04/pdfbox-text-extration-gae.html
> 
>   fabrizio

Re: Using PDFBox with Google Apps Engine

Posted by Fabrizio Accatino <fh...@gmail.com>.
You have to modify the source code of PDFBox.  I have done that. I had the
same requirement: text extraction from PDF on GAE.

From my blog:
http://fhtino.blogspot.com/2010/04/pdfbox-text-extration-gae.html

   fabrizio
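
For readers wondering, as Max does, what a change "to Rectangle" might look like: the sketch below is NOT Fabrizio's actual patch (that is described on his blog) and is not taken from the PDFBox sources. It only illustrates, as an assumption, the general idea of replacing an AWT-backed rectangle with a plain Java class that needs nothing from java.awt, which GAE restricted. The class name PlainRectangle and all of its methods are hypothetical.

```java
// Hypothetical AWT-free rectangle. Not part of PDFBox and not Fabrizio's
// patch; it only shows that simple geometry needs no java.awt classes.
final class PlainRectangle {
    private final float lowerLeftX;
    private final float lowerLeftY;
    private final float upperRightX;
    private final float upperRightY;

    public PlainRectangle(float lowerLeftX, float lowerLeftY,
                          float upperRightX, float upperRightY) {
        // Reject inverted corners so width and height are never negative.
        if (upperRightX < lowerLeftX || upperRightY < lowerLeftY) {
            throw new IllegalArgumentException("upper-right corner must not be below or left of lower-left corner");
        }
        this.lowerLeftX = lowerLeftX;
        this.lowerLeftY = lowerLeftY;
        this.upperRightX = upperRightX;
        this.upperRightY = upperRightY;
    }

    public float getWidth() {
        return upperRightX - lowerLeftX;
    }

    public float getHeight() {
        return upperRightY - lowerLeftY;
    }

    // True when the point lies inside the rectangle or on its border.
    public boolean contains(float x, float y) {
        return x >= lowerLeftX && x <= upperRightX
            && y >= lowerLeftY && y <= upperRightY;
    }

    @Override
    public String toString() {
        return "[" + lowerLeftX + "," + lowerLeftY + ","
            + upperRightX + "," + upperRightY + "]";
    }
}
```

For example, new PlainRectangle(0f, 0f, 612f, 792f) describes a US Letter page in points, and contains(100f, 100f) on it returns true. Whether a class like this is enough depends on which AWT types the text extraction code path actually touches, so treat it as a starting point only.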


