You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pdfbox.apache.org by "Javier García Sánchez (JIRA)" <ji...@apache.org> on 2014/08/12 16:39:11 UTC

[jira] [Created] (PDFBOX-2270) PDField.getFullyQualifiedName() returns name adding suffix '.null'

Javier García Sánchez created PDFBOX-2270:
---------------------------------------------

             Summary: PDField.getFullyQualifiedName() returns name adding suffix '.null'
                 Key: PDFBOX-2270
                 URL: https://issues.apache.org/jira/browse/PDFBOX-2270
             Project: PDFBox
          Issue Type: Bug
          Components: AcroForm
    Affects Versions: 1.8.6, 1.8.0, 1.7.1, 1.5.0
         Environment: JSE1.6
            Reporter: Javier García Sánchez


We have several pdf files where each one contains one pdf form with their own fields. We need to read all pdf fields and list them into a txt file.
 
The problem comes when a pdf form has duplicated field names, so the field.getFullyQualifiedName() returns the name of the field wrong, adding '.null' at the final of field's names.

-->Situation:
1) PDf file containing a pdf form
2) The pdf form contains lot of fields, some of their field's names are duplicated, like for example 'Applicant.city'.
3) When I try to list all of field's names, duplicate field's names comes with a suffix '.null' --> this only happends on duplicated field's names.
----------------------------------------------------------------------------------------------

-->Example:
1) PDF Form with 4 fields whos names are: 'Applicant.name', 'Applicant.phone', 'Applicant.ssn', 'Applicant.name'.

2)After running the code shown bellow, the result list is: 'Applicant.name.null', 'Applicant.phone', 'Applicant.ssn', 'Applicant.name.null'.
----------------------------------------------------------------------------------------------

-->Attach the code for listing all pdf form field's names:

 public static Set<String> printFields( PDDocument doc ) throws IOException {
        PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        List fields = acroForm.getFields();
        Iterator fieldsIter = fields.iterator();
        
        Set<String> fieldSet = new HashSet<String>();
        
        while ( fieldsIter.hasNext() ){
            PDField field = (PDField)fieldsIter.next();
            // String fieldFullName = processField(field);
            fieldSet.addAll( processField( field ) );
        }
        
        return fieldSet;
    }
    
    
    private static Set<String> processField( PDField field ) throws IOException {
        List kids = field.getKids();
        Set<String> result = new HashSet<String>();
        if( kids != null ){
            Iterator kidsIter = kids.iterator();
            
            while ( kidsIter.hasNext() ){
                Object pdfObj = kidsIter.next();
                if( pdfObj instanceof PDField ){
                    PDField kid = (PDField)pdfObj;
                    result.addAll( processField( kid ) );
                }
            }
        }else{
            
            //System.out.println( "field.getFullyQualifiedName(): " + field.getFullyQualifiedName() );
            
            result.add( field.getFullyQualifiedName() );
        }
        
        return result;
        
    }
--------------------------------------------------------------------------------
field.getFullyQualifiedName()  is returning duplicated field's names with a prefix '.null'.

Thanks in advance.




--
This message was sent by Atlassian JIRA
(v6.2#6252)