Posted to dev@tika.apache.org by "Julian Reschke (Jira)" <ji...@apache.org> on 2021/03/15 15:35:00 UTC

[jira] [Commented] (TIKA-3320) TikaServer Header Name is Case-sensitive

    [ https://issues.apache.org/jira/browse/TIKA-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301714#comment-17301714 ] 

Julian Reschke commented on TIKA-3320:
--------------------------------------

(side note: RFC 2616 is really really obsolete)

Yes, field names in HTTP are case-insensitive. Relying on upper/lowercase will not work with many components. In particular, HTTP/2 always lower-cases field names on the wire.
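To illustrate the point, here is a minimal sketch (not TikaServer code) of storing request headers so that lookups ignore case. The class and method names are hypothetical; only the JDK's TreeMap and String.CASE_INSENSITIVE_ORDER are used.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
final class HeaderLookupSketch {

    // Copies the raw headers into a map whose keys compare case-insensitively.
    // If two raw keys differ only by case, one of them overwrites the other.
    static Map<String, String> caseInsensitiveHeaders(Map<String, String> raw) {
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.putAll(raw);
        return headers;
    }
}
```

With this, a header that arrived as "x-tika-pdfocrstrategy" (for example over HTTP/2) is still found when looked up as "X-Tika-PDFOcrStrategy".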

> TikaServer Header Name is Case-sensitive
> ----------------------------------------
>
>                 Key: TIKA-3320
>                 URL: https://issues.apache.org/jira/browse/TIKA-3320
>             Project: Tika
>          Issue Type: Bug
>          Components: core, server
>    Affects Versions: 1.25
>            Reporter: Subhajit Das
>            Priority: Minor
>
> It seems that TikaServer 1.25 treats header names such as “X-Tika-PDFOcrStrategy” as case-sensitive.
> This causes problems in systems where request headers are automatically lowercased.
>  
> According to [https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2]
> "Field names are case-insensitive"
>  
> The issue is due to the following:
> TikaServer first performs a case-sensitive startsWith check for "X-Tika-PDF" or "X-Tika-OCR". It then calls getDeclaredField on the respective config class to obtain the field, and invokes the corresponding setter method.
> The same logic is present in newer TikaServer versions.
>  
> Possible solution:
> Use a case-insensitive startsWith check. Instead of passing the header-derived name straight to getDeclaredField, look through the config class's declared fields and match the name ignoring case (assuming no two fields differ only by case). Then derive the setter from the actual field name and invoke it.
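
The proposed fix could be sketched roughly as follows. This is not the actual TikaServer code: DemoPdfConfig is a hypothetical stand-in for a real config class, the helper names are made up, and only String-typed fields are handled to keep the example short.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical sketch of the proposed fix; not taken from TikaServer.
final class CaseInsensitiveHeaderSketch {

    // Stand-in for a config class such as a PDF parser config (assumption).
    public static class DemoPdfConfig {
        private String ocrStrategy = "no_ocr";
        public void setOcrStrategy(String value) { this.ocrStrategy = value; }
        public String getOcrStrategy() { return ocrStrategy; }
    }

    static final String PDF_PREFIX = "X-Tika-PDF";

    // Case-insensitive replacement for String.startsWith.
    static boolean startsWithIgnoreCase(String s, String prefix) {
        return s.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    // Finds a declared field whose name matches ignoring case, or null.
    // Assumes no two fields of the class differ only by case.
    static Field findFieldIgnoreCase(Class<?> cls, String name) {
        for (Field f : cls.getDeclaredFields()) {
            if (f.getName().equalsIgnoreCase(name)) {
                return f;
            }
        }
        return null;
    }

    // Applies one header to the config; returns false if the header does not apply.
    static boolean apply(Object config, String headerName, String value) {
        if (!startsWithIgnoreCase(headerName, PDF_PREFIX)) {
            return false;
        }
        String suffix = headerName.substring(PDF_PREFIX.length());
        Field field = findFieldIgnoreCase(config.getClass(), suffix);
        if (field == null) {
            return false;
        }
        // Derive the setter from the real field name, not from the header.
        String n = field.getName();
        String setter = "set" + n.substring(0, 1).toUpperCase(Locale.ROOT) + n.substring(1);
        try {
            Method m = config.getClass().getMethod(setter, field.getType());
            m.invoke(config, value); // String-typed fields only in this sketch
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not set " + n, e);
        }
    }
}
```

Calling apply(config, "x-tika-pdfocrstrategy", "ocr_only") on a DemoPdfConfig then sets ocrStrategy exactly as the mixed-case header name would. A real implementation would also need to convert the header value to the field's type.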



--
This message was sent by Atlassian Jira
(v8.3.4#803005)