Posted to dev@pdfbox.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2018/11/07 18:38:00 UTC

[jira] [Created] (PDFBOX-4370) Jempbox's ResourceEvent crazily slow to initialize

Tim Allison created PDFBOX-4370:
-----------------------------------

             Summary: Jempbox's ResourceEvent crazily slow to initialize
                 Key: PDFBOX-4370
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4370
             Project: PDFBox
          Issue Type: Task
          Components: JempBox
    Affects Versions: 1.8.16
            Reporter: Tim Allison
         Attachments: slow.zip

In our new batch of regression files for Tika, one of the new PDFs caused a timeout.  It is not an infinite loop, but processing does take several minutes. This may not be fixable.

Admittedly, the XMP is large, and there are quite a few events.

This is the code that triggers the problem.
{noformat}
            XMPMetadata xmp = XMPMetadata.load(is);
            XMPSchemaMediaManagement mmSchema = xmp.getMediaManagementSchema();
            mmSchema.getHistory();
{noformat}

The slow part _seems_ to be setting the namespace attribute on the parent element each time a new ResourceEvent is created.  When I comment out the following in ResourceEvent's initializer, processing becomes quite fast (1 second).

{noformat}
            parent.setAttributeNS( 
                XMPSchema.NS_NAMESPACE, 
                "xmlns:stEvt", 
                NAMESPACE );
{noformat}
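
One possible direction, sketched here with the JDK's own DOM classes only: declare the prefix once per parent element and skip the call when the declaration is already present. This is an illustration, not a tested fix for Jempbox. The class name, the helper methods, and the two namespace constants are assumptions made for this sketch (the constants are meant to mirror XMPSchema.NS_NAMESPACE and ResourceEvent's NAMESPACE), and whether the guard actually removes the slowdown on the attached file has not been measured.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Jempbox.
public class NamespaceDeclOnce {
    // Assumed values, intended to match XMPSchema.NS_NAMESPACE and ResourceEvent.NAMESPACE.
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";
    static final String STEVT_NS = "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#";

    // Adds the xmlns declaration only if the parent does not already carry it.
    // Returns true when the attribute was written, false when it was skipped.
    static boolean declareOnce(Element parent, String qualifiedName, String namespaceUri) {
        String localName = qualifiedName.substring(qualifiedName.indexOf(':') + 1);
        // getAttributeNS returns an empty string when the attribute is absent.
        if (namespaceUri.equals(parent.getAttributeNS(XMLNS_NS, localName))) {
            return false;
        }
        parent.setAttributeNS(XMLNS_NS, qualifiedName, namespaceUri);
        return true;
    }

    // Builds a stand-alone element to play the role of the event's parent node.
    static Element newParent() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();
        Element parent = doc.createElementNS(
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "rdf:li");
        doc.appendChild(parent);
        return parent;
    }

    public static void main(String[] args) throws Exception {
        Element parent = newParent();
        // The first call writes the declaration, the second one is skipped.
        System.out.println(declareOnce(parent, "xmlns:stEvt", STEVT_NS));
        System.out.println(declareOnce(parent, "xmlns:stEvt", STEVT_NS));
    }
}
```

If the cost turns out to be in the DOM implementation's handling of setAttributeNS itself rather than in repeated declarations, a guard like this would not help, so it should be timed against slow.zip before drawing conclusions.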


