Posted to dev@tika.apache.org by "Jamshid Afshar (JIRA)" <ji...@apache.org> on 2015/05/12 04:11:00 UTC

[jira] [Created] (TIKA-1627) Authentication for fileUrl

Jamshid Afshar created TIKA-1627:
------------------------------------

             Summary: Authentication for fileUrl
                 Key: TIKA-1627
                 URL: https://issues.apache.org/jira/browse/TIKA-1627
             Project: Tika
          Issue Type: Improvement
          Components: server
    Affects Versions: 1.9
            Reporter: Jamshid Afshar


The fileUrl feature in 1.9-SNAPSHOT is great! Are there plans for letting the client provide auth credentials for the request to the remote source (fileUrl)? It seems Tika would need to support HTTP Basic/Digest username:password first, then HTTPS certificates, and maybe AWS S3 access+secret keys. That said, I guess S3 auth already works today if the client provides a signed URL.

I tried the (old and deprecated?) HTTP URL syntax containing username:password, but the credentials are apparently ignored. Tika gets a 401 from the remote source, and that causes Tika to respond to its own client with a 500 error.
{noformat}
$ curl -H "fileUrl: http://user:password@example.com/foo.jpg" -H "Accept: application/json" -X PUT http://localhost:9998/meta
HTTP/1.1 500 Server Error
{noformat}
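
For illustration only, here is a rough sketch of what honoring the user:password part of a fileUrl could look like. The class and method names are hypothetical and are not part of Tika; this is just an assumption about one way the server could turn the URL's user-info into an HTTP Basic Authorization header before fetching. Digest could not be handled this way, since it needs the server's challenge first.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Tika: derives a Basic auth header
// from the user-info embedded in a fileUrl.
final class FileUrlBasicAuth {

    /**
     * Returns the value for an "Authorization" header built from the
     * user-info part of the URL, or null when the URL has none.
     */
    static String authorizationHeader(String fileUrl) {
        // getUserInfo() returns the percent-decoded "user:password" part, or null.
        String userInfo = URI.create(fileUrl).getUserInfo();
        if (userInfo == null) {
            return null;
        }
        String encoded = Base64.getEncoder()
                .encodeToString(userInfo.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

With the URL from the curl command above, authorizationHeader would return "Basic dXNlcjpwYXNzd29yZA==", which the server could attach to its request for the remote file.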

I think it's fine to require credentials to be provided in each request, but others might want them configurable on the server, probably by domain or domain + path.
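
To make the "by domain or domain + path" idea concrete, here is a minimal sketch of a server-side lookup, assuming credentials are stored as user:password strings and that the most specific path wins. FileUrlCredentials and its methods are made up for this example; nothing like it exists in Tika as far as I know.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical server-side credential table keyed by host and path prefix.
final class FileUrlCredentials {

    private static final class Rule {
        final String host;
        final String pathPrefix;
        final String userInfo;

        Rule(String host, String pathPrefix, String userInfo) {
            this.host = host;
            this.pathPrefix = pathPrefix;
            this.userInfo = userInfo;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /**
     * Registers credentials. A pathPrefix of "" covers the whole domain;
     * "/private" (no trailing slash) covers that path and everything below it.
     */
    void add(String host, String pathPrefix, String userInfo) {
        rules.add(new Rule(host, pathPrefix, userInfo));
    }

    /** Returns "user:password" from the most specific matching rule, or null. */
    String lookup(String fileUrl) {
        URI uri = URI.create(fileUrl);
        String host = uri.getHost();
        String path = uri.getPath() == null ? "" : uri.getPath();
        Rule best = null;
        for (Rule r : rules) {
            // The host must match exactly so "example.com" never matches
            // a look-alike such as "example.com.evil.test".
            if (host == null || !r.host.equalsIgnoreCase(host)) {
                continue;
            }
            // Match on whole path segments only.
            boolean pathMatches = r.pathPrefix.isEmpty()
                    || path.equals(r.pathPrefix)
                    || path.startsWith(r.pathPrefix + "/");
            if (pathMatches
                    && (best == null || r.pathPrefix.length() > best.pathPrefix.length())) {
                best = r;
            }
        }
        return best == null ? null : best.userInfo;
    }
}
```

Matching the host exactly and the path on segment boundaries is a deliberate choice here, so that credentials registered for one site are never sent to a different host that merely shares a prefix.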

A weird alternative might be for Tika to act like a proxy: pass through any Authorization: or Cookie: header from the request, and forward any 401/403 response from the remote source (fileUrl) to the Tika client. I wonder if that might make an OAuth handler for the remote source possible/easier.
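
A rough sketch of that pass-through idea, under stated assumptions: the class and method names are hypothetical, headers are modeled as a simple name-to-value map, and the fallback to 500 for other upstream errors is my own guess based on the behavior seen for the 401 above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical pass-through rules, not part of Tika.
final class FileUrlPassThrough {

    // Client headers that would be copied onto the request for the fileUrl.
    static final List<String> FORWARDED = Arrays.asList("Authorization", "Cookie");

    /** Picks the headers to forward; header names are matched case-insensitively. */
    static Map<String, String> headersToForward(Map<String, String> clientHeaders) {
        Map<String, String> lookup = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        lookup.putAll(clientHeaders);
        Map<String, String> out = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String name : FORWARDED) {
            String value = lookup.get(name);
            if (value != null) {
                out.put(name, value);
            }
        }
        return out;
    }

    /** Maps the remote source's status to the status returned to the Tika client. */
    static int statusForClient(int upstreamStatus) {
        if (upstreamStatus == 401 || upstreamStatus == 403) {
            return upstreamStatus; // let the client see the auth failure
        }
        // Assumption: any other upstream error still surfaces as a 500.
        return upstreamStatus >= 400 ? 500 : 200;
    }
}
```

Forwarding only a fixed list of headers keeps things like fileUrl and Accept from leaking to the remote source, and returning the 401/403 unchanged would let the client react to the auth failure instead of seeing a generic server error.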

Sorry if this isn't the right place to suggest this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)