Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2018/12/30 20:16:00 UTC

[jira] [Commented] (PDFBOX-4003) Can't retrieve number tree from structure tree

    [ https://issues.apache.org/jira/browse/PDFBOX-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731066#comment-16731066 ] 

Tilman Hausherr commented on PDFBOX-4003:
-----------------------------------------

I think it can be done by using a class with two constructors, like this:
{code:java}
    // Wraps a parent tree value, which can be either an array or a dictionary.
    public static class PDParentTreeValue implements COSObjectable
    {
        private final COSObjectable obj;

        // One constructor per value type, so that either kind can be wrapped.
        public PDParentTreeValue(COSArray obj)
        {
            this.obj = obj;
        }

        public PDParentTreeValue(COSDictionary obj)
        {
            this.obj = obj;
        }

        @Override
        public COSBase getCOSObject()
        {
            return obj.getCOSObject();
        }

        @Override
        public String toString()
        {
            return obj.toString();
        }
    }{code}
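The stack trace in the issue description suggests that {{PDNumberTreeNode.convertCOSToPD}} looks up a constructor with {{getDeclaredConstructor}}, passing the runtime class of the value. That lookup needs an exact parameter-type match, which is presumably why one constructor per value type helps. A minimal sketch of this behaviour, using hypothetical stand-in classes ({{Base}}, {{Arr}}, {{Dict}}, {{Wrapper}}) rather than the real PDFBox types:

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-ins for COSBase, COSArray and COSDictionary; not the PDFBox classes.
class Base {}
class Arr extends Base {}
class Dict extends Base {}

// Same idea as the two-constructor wrapper: one exact constructor per runtime type.
class Wrapper {
    final Base wrapped;
    Wrapper(Arr a) { this.wrapped = a; }
    Wrapper(Dict d) { this.wrapped = d; }
}

class ReflectionLookupDemo {
    // Looks up a constructor by the argument's runtime class and invokes it.
    // Reflection failures are rethrown unchecked to keep the sketch short.
    static <T> T create(Class<T> type, Base arg) {
        try {
            Constructor<T> ctor = type.getDeclaredConstructor(arg.getClass());
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        // Both runtime types have an exactly matching constructor.
        System.out.println(create(Wrapper.class, new Arr()).wrapped.getClass().getSimpleName());
        System.out.println(create(Wrapper.class, new Dict()).wrapped.getClass().getSimpleName());
        try {
            // Base only has a no-argument constructor, so the lookup fails.
            create(Base.class, new Arr());
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getCause());
        }
    }
}
```

The last call fails with a {{NoSuchMethodException}} as its cause, the same kind of failure shown in the quoted stack trace.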
With this wrapper class, the following code works:
{code:java}
        PDDocument doc = PDDocument.load(new URL("https://issues.apache.org/jira/secure/attachment/12896900/GeneralForbearance.pdf").openStream());
        COSDictionary parentTreeDict = doc.getDocumentCatalog().getStructureTreeRoot().getCOSObject().getCOSDictionary(COSName.PARENT_TREE);
        PDNumberTreeNode numberTreeNode = new PDNumberTreeNode(parentTreeDict, PDParentTreeValue.class);
        Object value = numberTreeNode.getValue(0);
        System.out.println("value: " + value);{code}
I'm going to use this when working on merging the parent tree. The current code only works when there's a NUMS array, not when there are KIDS.
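
For context: a number tree node holds its entries either directly in a Nums array or indirectly through child nodes in a Kids array. The following is only a rough sketch of a lookup and a flattening pass that cope with both layouts. It uses a hypothetical simplified node class ({{NumberTreeSketch}}, not {{PDNumberTreeNode}}), ignores the Limits entries that a real tree carries, and assumes that flattening is a useful first step before merging:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simplified model of a number tree node; not PDFBox's PDNumberTreeNode.
class NumberTreeSketch {
    final Map<Integer, Object> nums = new TreeMap<>();     // direct entries (Nums)
    final List<NumberTreeSketch> kids = new ArrayList<>(); // child nodes (Kids)

    // Builds a node that stores a single entry directly.
    static NumberTreeSketch leaf(int key, Object value) {
        NumberTreeSketch node = new NumberTreeSketch();
        node.nums.put(key, value);
        return node;
    }

    // Builds a node that only refers to child nodes.
    static NumberTreeSketch parent(NumberTreeSketch... children) {
        NumberTreeSketch node = new NumberTreeSketch();
        Collections.addAll(node.kids, children);
        return node;
    }

    // Checks this node's own entries first, then descends into the kids.
    Object getValue(int key) {
        if (nums.containsKey(key)) {
            return nums.get(key);
        }
        for (NumberTreeSketch kid : kids) {
            Object value = kid.getValue(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    // Flattens the whole subtree into one map sorted by key.
    Map<Integer, Object> collect() {
        Map<Integer, Object> all = new TreeMap<>(nums);
        for (NumberTreeSketch kid : kids) {
            all.putAll(kid.collect());
        }
        return all;
    }

    public static void main(String[] args) {
        NumberTreeSketch root = parent(leaf(0, "array value"), parent(leaf(7, "dictionary value")));
        System.out.println(root.getValue(7));
        System.out.println(root.collect().keySet());
    }
}
```

Code that reads only the Nums map of the root would find nothing in a tree like the one built in {{main}}, because all entries sit below Kids.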

> Can't retrieve number tree from structure tree
> ----------------------------------------------
>
>                 Key: PDFBOX-4003
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4003
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.8
>            Reporter: Tilman Hausherr
>            Priority: Major
>              Labels: StructureTree
>             Fix For: 2.0.14, 3.0.0 PDFBox
>
>         Attachments: GeneralForbearance.pdf
>
>
> {code}
> PDDocument doc = PDDocument.load(new File("GeneralForbearance.pdf"));
> Object value = doc.getDocumentCatalog().getStructureTreeRoot().getParentTree().getValue(0);
> {code}
> The code above always fails when used on a PDF with a structure tree:
> {code}
> Exception in thread "main" java.io.IOException: Error while trying to create value in number tree:org.apache.pdfbox.cos.COSBase.<init>(org.apache.pdfbox.cos.COSArray)
> 	at org.apache.pdfbox.pdmodel.common.PDNumberTreeNode.convertCOSToPD(PDNumberTreeNode.java:212)
> 	at org.apache.pdfbox.pdmodel.common.PDNumberTreeNode.getNumbers(PDNumberTreeNode.java:185)
> 	at org.apache.pdfbox.pdmodel.common.PDNumberTreeNode.getValue(PDNumberTreeNode.java:139)
> 	at pdfboxpageimageextraction.MergeTest.main(MergeTest.java:29)
> Caused by: java.lang.NoSuchMethodException: org.apache.pdfbox.cos.COSBase.<init>(org.apache.pdfbox.cos.COSArray)
> 	at java.lang.Class.getConstructor0(Class.java:3082)
> 	at java.lang.Class.getDeclaredConstructor(Class.java:2178)
> 	at org.apache.pdfbox.pdmodel.common.PDNumberTreeNode.convertCOSToPD(PDNumberTreeNode.java:206)
> 	... 3 more
> {code}
> I suspect that it is related to {{PDNumberTreeNode}} having been created with a {{COSBase}} class parameter in {{getParentTree()}}.
> {{COSBase}} doesn't have a constructor that takes a parameter.
> The structure tree's number tree has mixed contents: its values can be arrays or dictionaries.
> What we need is a PD wrapper that can hold either.


