Posted to users@pdfbox.apache.org by I I <go...@gmail.com> on 2010/03/02 05:16:43 UTC

Need your help: How to use pdfbox

Hello,

I want to read PDF contents, or convert a PDF to text, using Java. I
downloaded PDFBox and unzipped it to a folder, then ran ant from the
command prompt. The build finished with a few warnings. I then ran the
command shown below from the src/main/java folder, because there was no
ExtractText.class file yet.


C:\pdfbox-1.0.0\src\main\java>javac org\apache\pdfbox\ExtractText.java Jan.pdf Jan.txt
org\apache\pdfbox\ExtractText.java:27: package pdmodel does not exist
import pdmodel.PDDocument;
^
org\apache\pdfbox\ExtractText.java:28: package pdmodel.encryption does not exist
import pdmodel.encryption.AccessPermission;
^
org\apache\pdfbox\ExtractText.java:29: package pdmodel.encryption does not exist
import pdmodel.encryption.StandardDecryptionMaterial;
^
org\apache\pdfbox\ExtractText.java:30: package util does not exist
import util.PDFText2HTML;
^
org\apache\pdfbox\ExtractText.java:31: package util does not exist
import util.PDFTextStripper;
^
error: Class names, 'Jan.pdf,Jan.txt', are only accepted if annotation processing is explicitly requested
org\apache\pdfbox\ExtractText.java:162: cannot find symbol
symbol : class PDDocument
location: class org.apache.pdfbox.ExtractText
PDDocument document = null;
^
org\apache\pdfbox\ExtractText.java:170: cannot find symbol
symbol : variable PDDocument
location: class org.apache.pdfbox.ExtractText
document = PDDocument.load(url, force);
^
org\apache\pdfbox\ExtractText.java:179: cannot find symbol
symbol : variable PDDocument
location: class org.apache.pdfbox.ExtractText
document = PDDocument.load(pdfFile, force);
^
org\apache\pdfbox\ExtractText.java:189: cannot find symbol
symbol : class StandardDecryptionMaterial
location: class org.apache.pdfbox.ExtractText
StandardDecryptionMaterial sdm = new StandardDecryptionMaterial( password );
^
org\apache\pdfbox\ExtractText.java:189: cannot find symbol
symbol : class StandardDecryptionMaterial
location: class org.apache.pdfbox.ExtractText
StandardDecryptionMaterial sdm = new StandardDecryptionMaterial( password );
^
org\apache\pdfbox\ExtractText.java:191: cannot find symbol
symbol : class AccessPermission
location: class org.apache.pdfbox.ExtractText
AccessPermission ap = document.getCurrentAccessPermission();
^
org\apache\pdfbox\ExtractText.java:223: cannot find symbol
symbol : class PDFTextStripper
location: class org.apache.pdfbox.ExtractText
PDFTextStripper stripper = null;
^
org\apache\pdfbox\ExtractText.java:226: cannot find symbol
symbol : class PDFText2HTML
location: class org.apache.pdfbox.ExtractText
stripper = new PDFText2HTML(encoding);
^
org\apache\pdfbox\ExtractText.java:230: cannot find symbol
symbol : class PDFTextStripper
location: class org.apache.pdfbox.ExtractText
stripper = new PDFTextStripper(encoding);
^
15 errors

C:\pdfbox-1.0.0\src\main\java>




Then I ran the following:

C:\pdfbox-1.0.0\src\main\java>java org\apache\pdfbox\ExtractText Jan.pdf Jan.txt

Exception in thread "main" java.lang.NoClassDefFoundError: org\apache\pdfbox\ExtractText
Caused by: java.lang.ClassNotFoundException: org\apache\pdfbox\ExtractText
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Could not find the main class: org\apache\pdfbox\ExtractText. Program will exit.

C:\pdfbox-1.0.0\src\main\java>
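
A note on the launcher error above: the java command takes a dot-separated
class name, not a file path, so a name written with backslashes is probably
looked up literally and not found (and here the class had not been compiled
in any case). The sketch below is only an illustration of that naming rule.
ClassNameDemo, toBinaryName and isLoadable are hypothetical names made up for
this example, not part of PDFBox.

```java
// Illustration only: how a source/class file path relates to the
// dot-separated name that the java launcher and Class.forName expect.
final class ClassNameDemo {

    // Hypothetical helper: strip a .java/.class suffix and turn path
    // separators into dots.
    static String toBinaryName(String path) {
        String p = path;
        if (p.endsWith(".java")) {
            p = p.substring(0, p.length() - ".java".length());
        } else if (p.endsWith(".class")) {
            p = p.substring(0, p.length() - ".class".length());
        }
        return p.replace('\\', '.').replace('/', '.');
    }

    // Returns true if the class loader can resolve the given name.
    static boolean isLoadable(String name) {
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Path form, as typed at the prompt, converted to the dotted form.
        System.out.println(toBinaryName("org\\apache\\pdfbox\\ExtractText.java"));
        // A JDK class stands in for ExtractText so the demo runs anywhere.
        System.out.println("java\\lang\\String loadable: " + isLoadable("java\\lang\\String"));
        System.out.println("java.lang.String loadable: " + isLoadable("java.lang.String"));
    }
}
```

By the same rule, the launcher would be given org.apache.pdfbox.ExtractText,
with the compiled classes on the classpath. Where the ant build puts those
classes is not shown in this post, so that location is left as an assumption.
Likewise, javac accepts only source files, which is why Jan.pdf and Jan.txt
were rejected as class names.
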
Could you please help me extract text from a PDF file? Your help will be
highly appreciated.

Thanks