Posted to commits@tika.apache.org by ta...@apache.org on 2022/08/05 14:04:14 UTC

[tika] branch main updated: TIKA-3832 -- defend against an infinite loop in handling bookmarks in PDFs.

This is an automated email from the ASF dual-hosted git repository.

tallison pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git


The following commit(s) were added to refs/heads/main by this push:
     new dcea49b41 TIKA-3832 -- defend against an infinite loop in handling bookmarks in PDFs.
dcea49b41 is described below

commit dcea49b41ae8dad79497d645c72b4d1b297f983b
Author: tallison <ta...@apache.org>
AuthorDate: Fri Aug 5 10:04:08 2022 -0400

    TIKA-3832 -- defend against an infinite loop in handling bookmarks in PDFs.
---
 .../src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java    | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java b/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
index 13ccd70b8..9de6e0daf 100644
--- a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
+++ b/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/AbstractPDF2XHTML.java
@@ -972,6 +972,9 @@ class AbstractPDF2XHTML extends PDFTextStripper {
             }
             xhtml.startElement("ul");
             while (current != null) {
+                if (seen.contains(current)) {
+                    break;
+                }
                 seen.add(current);
                 xhtml.startElement("li");
                 xhtml.characters(current.getTitle());
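
Note on the guard added above: the loop walks a chain of sibling bookmarks, and a malformed PDF can link a sibling back to an earlier item, so the walk never reaches null. Checking the seen set before adding to it stops the walk at the first repeat. The following is a minimal standalone sketch of that pattern, not the Tika code: OutlineNode, OutlineWalkSketch and siblingTitles are hypothetical names, and the identity-based set is an assumption (the diff does not show how seen is declared).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an outline (bookmark) item; not the PDFBox type.
class OutlineNode {
    final String title;
    OutlineNode next; // next sibling; a malformed file may point this back at an earlier node

    OutlineNode(String title) {
        this.title = title;
    }
}

class OutlineWalkSketch {
    // Collects sibling titles in order, stopping at the first node that repeats.
    static List<String> siblingTitles(OutlineNode first) {
        // Assumption: nodes are compared by identity, not by equals().
        Set<OutlineNode> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<String> titles = new ArrayList<>();
        OutlineNode current = first;
        while (current != null) {
            if (seen.contains(current)) {
                break; // already visited: the sibling chain loops back on itself
            }
            seen.add(current);
            titles.add(current.title);
            current = current.next;
        }
        return titles;
    }

    public static void main(String[] args) {
        OutlineNode a = new OutlineNode("A");
        OutlineNode b = new OutlineNode("B");
        a.next = b;
        b.next = a; // deliberately cyclic
        System.out.println(siblingTitles(a)); // prints [A, B]
    }
}
```

Without the contains check, the cyclic chain in main would loop forever; with it, each item is emitted once and the walk ends at the first revisit.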