Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/10/26 17:37:00 UTC
[jira] [Resolved] (TIKA-3582) Tika does not respect a configuration value passed over a HTTP Header
[ https://issues.apache.org/jira/browse/TIKA-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Allison resolved TIKA-3582.
-------------------------------
Fix Version/s: 2.1.1
Assignee: Tim Allison
Resolution: Fixed
I added a per-parse timeout override to the Tesseract parser. Eventually, we can get around to deprecating the current methods.
Users can now send {{X-Tika-Timeout-Millis}} as a request header to /rmeta and /tika. This value cannot be greater than the {{taskTimeoutMillis}} configured for tika-server. I made this choice out of concern for security: people hosting the server may not want clients to set whatever timeout they feel is reasonable.
So, for now, the server should be configured with the largest {{taskTimeoutMillis}} desired, and clients should specify a smaller limit per request.
I'm not wedded to this decision. Please reopen if this doesn't make sense.
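As a rough illustration of the rule described above, here is a minimal client-side sketch in Java. It only builds a PUT request for /tika carrying the {{X-Tika-Timeout-Millis}} header and refuses values larger than the server's limit. The class name, method name, and the 120000 ms server limit are assumptions made for the example, and the guard runs on the client. How tika-server itself responds to an oversized value is not stated in this message.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: the class and method names are hypothetical, not part of Tika.
class TikaTimeoutRequest {

    // Header name taken from the resolution comment.
    static final String TIMEOUT_HEADER = "X-Tika-Timeout-Millis";

    // Builds (but does not send) a PUT request to <baseUrl>/tika with a per-request timeout.
    // serverTaskTimeoutMillis is the caller's knowledge of the server's taskTimeoutMillis;
    // checking it here is a client-side assumption, not documented server behavior.
    static HttpRequest build(String baseUrl, byte[] body,
                             long timeoutMillis, long serverTaskTimeoutMillis) {
        if (timeoutMillis <= 0) {
            throw new IllegalArgumentException("timeout must be positive: " + timeoutMillis);
        }
        if (timeoutMillis > serverTaskTimeoutMillis) {
            throw new IllegalArgumentException(
                    "requested timeout " + timeoutMillis
                    + " ms exceeds server taskTimeoutMillis " + serverTaskTimeoutMillis + " ms");
        }
        return HttpRequest.newBuilder(URI.create(baseUrl + "/tika"))
                .header(TIMEOUT_HEADER, Long.toString(timeoutMillis))
                .PUT(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder body; a real client would send the file's bytes.
        byte[] body = "placeholder".getBytes(StandardCharsets.UTF_8);
        // 120000 ms is an assumed server-side limit for this example.
        HttpRequest request = build("http://localhost:9998", body, 60_000L, 120_000L);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

Sending the request with java.net.http.HttpClient is left out so that the sketch runs without a live server.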
> Tika does not respect a configuration value passed over a HTTP Header
> ---------------------------------------------------------------------
>
> Key: TIKA-3582
> URL: https://issues.apache.org/jira/browse/TIKA-3582
> Project: Tika
> Issue Type: Bug
> Components: server
> Affects Versions: 2.1.0
> Reporter: dataminer.accolade
> Assignee: Tim Allison
> Priority: Major
> Fix For: 2.1.1
>
> Attachments: sampleimage.png
>
>
>
> I think the value of TikaServerConfig.TaskTimeoutMillis should be overridden for the current request by the *X-Tika-OCRTimeoutSeconds* header. The following request takes more than 120 seconds.
> *curl -vvv -X PUT -T sampleimage.png http://localhost:9998/tika --header "X-Tika-OCRTimeoutSeconds: 600"*
>
> Tesseract is configured with tessdata_best models.