Posted to users@pdfbox.apache.org by "Toeroek, Laszlo (EXT)" <la...@siemens.com> on 2010/12/08 11:30:05 UTC

pdf2text under windows xp

Hi

I've been trying to extract text from a PDF under Windows XP. I downloaded pdfbox-app-1.3.1.jar (http://www.apache.org/dyn/closer.cgi/pdfbox/1.3.1/pdfbox-app-1.3.1.jar) and typed:

java -jar pdfbox-app-x.y.z.jar org.apache.pdfbox.ExtractText mypdf.pdf mytext.txt

But I got "usage: java pdfbox-app-x.y.z.jar <command> <args..>".

So I tried running it again without the -jar option, but I got:

"Exception in thread "main" java.lang.NoClassDefFoundError: pdfbox-app-1/3/1/jar"

What am I doing wrong?

Kind regards,

Las

Mit freundlichem Gruß / Best regards / Baráti üdvözlettel,
---------------------------------------------------
László Török
Software Developer
evosoft GmbH

Service provider to Siemens AG
I IA AS RD HMI

Siemensstr. 2, 90766 Fürth
Phone: +49 911 978 - 2251
eMail: Laszlo.Torok@evosoft.com<ma...@evosoft.com>
          Laszlo.Toeroek.ext@siemens.com<ma...@siemens.com>

www.evosoft.com<http://www.evosoft.com/>



Re: pdf2text under windows xp

Posted by Jukka Zitting <jz...@adobe.com>.
Hi,

On 08/12/10 11:30, Toeroek, Laszlo (EXT) wrote:
> I've been trying to do text extraction under windows xp. I downloaded
> pdfbox-app-1.3.1.jar and typed:
>
> java -jar pdfbox-app-x.y.z.jar org.apache.pdfbox.ExtractText
>     mypdf.pdf mytext.txt

Use the following command:

     java -jar pdfbox-app-1.3.1.jar ExtractText ...

This will start up a generic PDFBox command line tool that uses the 
first command line argument (in this case "ExtractText") to select which 
utility tool to run.
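
To illustrate the idea, here is a minimal sketch of a main class that picks a tool by its first argument. It is not PDFBox's actual implementation: the class name ToolDispatcher, the registry, the "OtherTool" entry and the handler bodies are all made up for the example. It does show why a short name such as "ExtractText" is accepted while a fully qualified class name falls through to the usage message.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher, for illustration only (not taken from PDFBox).
public class ToolDispatcher {

    // Registry mapping a short command name to a handler.
    // The handlers here only describe what they would run.
    private static final Map<String, Function<String[], String>> TOOLS =
            new LinkedHashMap<>();

    static {
        TOOLS.put("ExtractText", args -> "ExtractText " + Arrays.toString(args));
        TOOLS.put("OtherTool", args -> "OtherTool " + Arrays.toString(args)); // made-up name
    }

    // The first argument selects the tool; the rest is passed on to it.
    static String dispatch(String[] argv) {
        if (argv.length == 0 || !TOOLS.containsKey(argv[0])) {
            return "usage: java -jar app.jar <command> <args..>";
        }
        String[] rest = Arrays.copyOfRange(argv, 1, argv.length);
        return TOOLS.get(argv[0]).apply(rest);
    }

    public static void main(String[] argv) {
        System.out.println(dispatch(argv));
    }
}
```

With this sketch, "ExtractText mypdf.pdf mytext.txt" reaches the ExtractText handler, whereas "org.apache.pdfbox.ExtractText mypdf.pdf mytext.txt" is not a registered command name and only prints the usage line.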

You can also invoke the ExtractText or other utility classes directly 
like this:

     java -cp pdfbox-app-1.3.1.jar org.apache.pdfbox.ExtractText ...

Note that now we're using the pdfbox-app jar as just a normal classpath 
component instead of as a runnable jar, so you need to explicitly 
specify the full name of the main class.
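
This also explains the NoClassDefFoundError you saw: without -jar or -cp, the java launcher takes its first non-option argument as the name of a class to run, and the JVM's internal form of a class name uses slashes instead of dots. A small sketch of that (the class ClassNameDemo and its helper methods are made up for the illustration):

```java
// Illustration only: shows how a jar file name looks when it is
// mistaken for a class name.
public class ClassNameDemo {

    // The JVM's internal form of a class name replaces '.' with '/'.
    static String asInternalName(String className) {
        return className.replace('.', '/');
    }

    // Returns true if a class with this name can be loaded.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String mistaken = "pdfbox-app-1.3.1.jar";
        System.out.println(asInternalName(mistaken)); // pdfbox-app-1/3/1/jar
        System.out.println(isLoadable(mistaken));     // false: no such class
    }
}
```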

BR,

Jukka Zitting