Posted to users@pdfbox.apache.org by eric <zh...@163.com> on 2011/10/12 05:22:33 UTC

read forms fail and PDAcroForm is null

I tried both 1.6 and 1.7, and the problem is the same: PDFBox can't read any form from the test file.

        PDAcroForm acroForm = catalog.getAcroForm(); // returns null

The acroForm is null, but the test PDF file does have a form. Can anybody help me?

I am posting the source file and the test file.
///////////////////////////////
package Pdfbox.test;

import java.io.IOException;

import java.util.Iterator;
import java.util.List;

import org.apache.pdfbox.exceptions.CryptographyException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
import org.apache.pdfbox.pdmodel.interactive.form.PDField;

public class ReadForms {

    private static final String pdfFile = "/root/test_readforms.pdf";// Book1.pdf";

    public ReadForms() {

    }

    public static void processField(PDField field, String sLevel, String sParent)
            throws IOException {
        List kids = field.getKids();
        if (kids != null) {
            Iterator kidsIter = kids.iterator();
            if (!sParent.equals(field.getPartialName())) {
                sParent = sParent + "." + field.getPartialName();
            }
            System.out.println(sLevel + sParent);
            // System.out.println(sParent + " is of type " +
            // field.getClass().getName());
            while (kidsIter.hasNext()) {
                Object pdfObj = kidsIter.next();
                if (pdfObj instanceof PDField) {
                    PDField kid = (PDField) pdfObj;
                    processField(kid, "|  " + sLevel, sParent);
                }
            }
        } else {
            String outputString = sLevel + sParent + "."
                    + field.getPartialName() + " = " + field.getValue()
                    + ",  type=" + field.getClass().getName();

            System.out.println(outputString);
        }
    }

    /**
     * This will read a PDF file and print out the form elements.
     *
     * @param args
     *            command line arguments (unused; the file is set in pdfFile)
     *
     * @throws IOException
     *             If there is an error reading the PDF document.
     * @throws CryptographyException
     *             If there is an error decrypting the document.
     */
    public static void main(String[] args) throws IOException,
            CryptographyException {

        // Load the PDF document
        PDDocument doc = PDDocument.load(pdfFile);
        try {
            // Extract the catalog
            PDDocumentCatalog catalog = doc.getDocumentCatalog();

            // Retrieve the AcroForm; this is null if the document has none
            PDAcroForm acroForm = catalog.getAcroForm();
            if (acroForm == null) {
                System.out.println("No AcroForm found in " + pdfFile);
                return;
            }

            // Retrieve all top-level fields
            List listField = acroForm.getFields();
            Iterator it = listField.iterator();

            // Loop on each field
            while (it.hasNext()) {
                PDField field = (PDField) it.next();
                processField(field, "|--", field.getPartialName());
            }
        } finally {
            // Always release the document, even if reading the fields fails
            doc.close();
        }
    }
}


Re: read forms fail and PDAcroForm is null

Posted by eric <zh...@163.com>.
Hi,
I am confused about the form in the PDF. I just created a Word file, inserted a 3x3 form in OpenOffice, and exported it to PDF format, but the resulting PDF can't be read through PDDocumentCatalog.
I also tested the example classes
org.apache.pdfbox.examples.fdf.SetField and org.apache.pdfbox.examples.fdf.PrintFields,
and the result is the same:
Exception in thread "main" java.lang.NullPointerException
    at org.apache.pdfbox.examples.fdf.SetField.setField(SetField.java:53)
    at org.apache.pdfbox.examples.fdf.SetField.setField(SetField.java:95)
    at org.apache.pdfbox.examples.fdf.SetField.main(SetField.java:78)
because the acroForm is null.


Can someone give me a test PDF file that does contain an AcroForm?


Thanks


At 2011-10-12 14:27:12,"Kévin Sailly" <ke...@gmail.com> wrote:
>Hello,
>
>
>The source code of PDDocumentCatalog.getAcroForm() (here from PDFBox 1.4.0)
>is pretty simple:
>
>            /**
>             * This will get the documents acroform. This will return null if
>             * no acroform is part of the document.
>             *
>             * @return The documents acroform.
>             */
>            public PDAcroForm getAcroForm() {
>                if (acroForm == null) {
>                    COSDictionary acroFormDic = (COSDictionary) root
>                            .getDictionaryObject(COSName.ACRO_FORM);
>                    if (acroFormDic != null) {
>                        acroForm = new PDAcroForm(document, acroFormDic);
>                    }
>                }
>                return acroForm;
>            }
>
>It looks like your document does not contain an AcroForm, only
>something that looks like a form.
>
>Try looking deeper into your document and parsing all of its dictionary objects.
>
>Regards,
>Kevin

Re: read forms fail and PDAcroForm is null

Posted by Kévin Sailly <ke...@gmail.com>.
Hello,


The source code of PDDocumentCatalog.getAcroForm() (here from PDFBox 1.4.0)
is pretty simple:

            /**
             * This will get the documents acroform. This will return null if
             * no acroform is part of the document.
             *
             * @return The documents acroform.
             */
            public PDAcroForm getAcroForm() {
                if (acroForm == null) {
                    COSDictionary acroFormDic = (COSDictionary) root
                            .getDictionaryObject(COSName.ACRO_FORM);
                    if (acroFormDic != null) {
                        acroForm = new PDAcroForm(document, acroFormDic);
                    }
                }
                return acroForm;
            }

It looks like your document does not contain an AcroForm, only
something that looks like a form.

Try looking deeper into your document and parsing all of its dictionary objects.

Regards,
Kevin
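
A quick way to act on the advice above without PDFBox is to scan the raw file for the /AcroForm key that getAcroForm() looks up in the catalog. The sketch below is only a rough heuristic, not part of PDFBox: the class name AcroFormProbe and its methods are made up for illustration, and a key stored inside a compressed object stream would be missed. It also looks for /Widget, the annotation subtype used by form field widgets.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper: crude check for PDF name tokens in the raw file bytes. */
class AcroFormProbe {

    /** True if b ends a PDF name (whitespace or a delimiter character). */
    static boolean endsName(byte b) {
        return b == 0 || b == '\t' || b == '\n' || b == '\f' || b == '\r'
                || b == ' ' || "()<>[]{}/%".indexOf((char) b) >= 0;
    }

    /** True if token occurs in data as a complete name, e.g. "/AcroForm". */
    static boolean containsName(byte[] data, String token) {
        byte[] needle = token.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i + needle.length <= data.length; i++) {
            int j = 0;
            while (j < needle.length && data[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                int end = i + needle.length;
                // Reject longer names that merely start with the token
                if (end == data.length || endsName(data[end])) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AcroFormProbe <file.pdf>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("/AcroForm key found: " + containsName(data, "/AcroForm"));
        System.out.println("/Widget subtype found: " + containsName(data, "/Widget"));
    }
}
```

If both checks report false for an uncompressed file, the document most likely contains no interactive form at all, which would match getAcroForm() returning null.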


