Posted to dev@tika.apache.org by "Kevin Oberlag (JIRA)" <ji...@apache.org> on 2017/03/06 22:13:32 UTC

[jira] [Created] (TIKA-2290) PDFParser 'ocr' properties cannot be set via headers when using Tika JAXRS

Kevin Oberlag created TIKA-2290:
-----------------------------------

             Summary: PDFParser 'ocr' properties cannot be set via headers when using Tika JAXRS
                 Key: TIKA-2290
                 URL: https://issues.apache.org/jira/browse/TIKA-2290
             Project: Tika
          Issue Type: Bug
          Components: ocr, parser
    Affects Versions: 1.14, 1.13
            Reporter: Kevin Oberlag


I have created a Stack Overflow question on this topic [here | http://stackoverflow.com/questions/42602834/x-tika-pdfocrstrategy-is-an-invalid-x-tika-ocr-header-error], but I'll reiterate the main issue.

I am trying to use Tika JAXRS and add headers to set PDFParser properties, specifically the ocrStrategy property. However, when I add the X-Tika-PDFocrStrategy header, I get an error stating that it is an invalid X-Tika-OCR header.
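
For reference, a request of the kind described above could be built roughly as follows. This is only a sketch: the server address http://localhost:9998/tika, the header value "ocr_only", and the class and method names are assumptions for illustration and are not taken from this report. It uses the java.net.http API (Java 11+) and only constructs the request; nothing is sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class TikaOcrHeaderRequest {
    // Assumed address of a locally running Tika JAXRS server (not from the report).
    static final String TIKA_ENDPOINT = "http://localhost:9998/tika";

    // Builds a PUT request carrying the PDF bytes and the header under discussion.
    static HttpRequest buildRequest(byte[] pdfBytes, String ocrStrategy) {
        return HttpRequest.newBuilder(URI.create(TIKA_ENDPOINT))
                .header("Accept", "text/plain")
                .header("Content-Type", "application/pdf")
                // The header that triggers the "invalid X-Tika-OCR header" error.
                .header("X-Tika-PDFocrStrategy", ocrStrategy)
                .PUT(HttpRequest.BodyPublishers.ofByteArray(pdfBytes))
                .build();
    }

    public static void main(String[] args) {
        // "ocr_only" is an assumed example value for the ocrStrategy property.
        HttpRequest request = buildRequest(new byte[0], "ocr_only");
        System.out.println(request.method() + " " + request.uri());
        request.headers().map().forEach((name, values) ->
                System.out.println(name + ": " + values));
    }
}
```

Sending the built request with java.net.http.HttpClient against a running server should reproduce the error, assuming the server behaves as described here.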

After looking into the source code, I believe the issue might be with the 'fillParseContext' method in the TikaResource.java file.

The if statement first checks whether a header key starts with the OCR header prefix. Since the PDFParser property name contains 'ocr', that check appears to match the PDF header as well, so the code tries to find a property named 'ocrStrategy' in the OCRParser class, which doesn't exist.
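
The suspected behavior can be modeled with a small sketch. This is not the TikaResource source: the class, the enum, and in particular the loose matching rule in looseDispatch are hypothetical, written only to show how an OCR check that runs first and matches too broadly would capture a PDF header, and how a strict prefix comparison would route it correctly.

```java
import java.util.Locale;

public class HeaderDispatchSketch {
    // Prefixes modeled on the header names mentioned in this report.
    static final String OCR_PREFIX = "X-Tika-OCR";
    static final String PDF_PREFIX = "X-Tika-PDF";

    enum Target { OCR_CONFIG, PDF_CONFIG, NONE }

    // Hypothetical loose rule: any X-Tika header mentioning "ocr" is treated
    // as an OCR header, and this check runs before the PDF check.
    static Target looseDispatch(String header) {
        String lower = header.toLowerCase(Locale.ROOT);
        if (lower.startsWith("x-tika-") && lower.contains("ocr")) {
            return Target.OCR_CONFIG;
        }
        if (header.startsWith(PDF_PREFIX)) {
            return Target.PDF_CONFIG;
        }
        return Target.NONE;
    }

    // Strict rule: each header is routed by its exact, case-sensitive prefix.
    static Target strictDispatch(String header) {
        if (header.startsWith(OCR_PREFIX)) {
            return Target.OCR_CONFIG;
        }
        if (header.startsWith(PDF_PREFIX)) {
            return Target.PDF_CONFIG;
        }
        return Target.NONE;
    }

    // Strips the prefix to obtain the property name, e.g. "ocrStrategy".
    static String propertyName(String header, String prefix) {
        return header.substring(prefix.length());
    }

    public static void main(String[] args) {
        String header = "X-Tika-PDFocrStrategy";
        System.out.println("loose:  " + looseDispatch(header));
        System.out.println("strict: " + strictDispatch(header));
        System.out.println("property: " + propertyName(header, PDF_PREFIX));
    }
}
```

Under the loose rule the PDF header is handed to the OCR configuration, where no 'ocrStrategy' property exists; under the strict rule it reaches the PDF configuration with the property name 'ocrStrategy'.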


