Posted to user@tika.apache.org by Eric Pugh <ep...@opensourceconnections.com> on 2020/02/04 16:25:41 UTC
Anyone can share an example of Java code POSTing a file to Tika-Server?
Shockingly, I can’t find a nice example of Java code being used to do an HTTP POST of a file to Tika Server.
The CURL command that I want to convert to Java code is:
curl -T /myfile.pdf http://localhost:9998/rmeta --header "X-Tika-OCRLanguage: eng" --header "X-Tika-PDFOcrStrategy: ocr_and_text_extraction" --header "X-Tika-OCRoutputType: hocr"
I’m not having much luck, so was wondering if anyone in the community had an example?
Eric
_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | My Free/Busy <http://tinyurl.com/eric-cal>
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
This e-mail and all contents, including attachments, is considered to be Company Confidential unless explicitly stated otherwise, regardless of whether attachments are marked as such.
Re: Anyone can share an example of Java code POSTing a file to Tika-Server?
Posted by John Patrick <nh...@gmail.com>.
This contains a POST example plus others, mainly with string content rather than binary content:
https://openjdk.java.net/groups/net/httpclient/recipes.html
This contains examples of reading files:
https://docs.oracle.com/javase/tutorial/essential/io/file.html
I’ll look at doing a pull request to the Tika documentation to include a basic example.
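For readers landing on this thread, here is a minimal sketch of the curl command from the original question, written against the JDK's built-in java.net.http.HttpClient (Java 11 or later), the same client the recipes page covers. Note that curl -T uploads with HTTP PUT rather than POST, so the sketch uses PUT. The URL, file path, and header values are copied from the curl command; the class name TikaServerPutSketch and the helper buildRequest are made up for illustration and are not part of Tika.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

// Hypothetical class name, used only for this sketch.
class TikaServerPutSketch {

    // Builds a PUT request whose body is the raw bytes of the file,
    // with the three X-Tika-* headers from the curl command.
    // ofFile throws FileNotFoundException if the file does not exist.
    static HttpRequest buildRequest(URI endpoint, Path file) throws IOException {
        return HttpRequest.newBuilder(endpoint)
                .header("X-Tika-OCRLanguage", "eng")
                .header("X-Tika-PDFOcrStrategy", "ocr_and_text_extraction")
                .header("X-Tika-OCRoutputType", "hocr")
                .PUT(HttpRequest.BodyPublishers.ofFile(file))
                .build();
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Endpoint and file path taken from the curl command in the question.
        HttpRequest request = buildRequest(
                URI.create("http://localhost:9998/rmeta"),
                Path.of("/myfile.pdf"));

        // Send synchronously and read the whole response body as a string.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println("HTTP status: " + response.statusCode());
        System.out.println(response.body());
    }
}
```

Running main assumes a Tika Server is already listening on localhost:9998. OCR on a large PDF can take a while, so setting a timeout on the request builder may be worth considering.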
Re: Anyone can share an example of Java code POSTing a file to Tika-Server?
Posted by Tim Allison <ta...@apache.org>.
I updated the example to include putting the bytes (tika-server classic)
and sending the filepath in the header:
https://github.com/tballison/tika-addons/blob/master/tika-eval-solrj/src/main/java/org/tallison/tikaeval/example/TikaServerClient.java#L142
Re: Anyone can share an example of Java code POSTing a file to Tika-Server?
Posted by Tim Allison <ta...@apache.org>.
Oops...that example requires you to use the non-secure settings and ships
the file url to tika server to read from a fileshare. I'll add an example
with putting bytes later today.
Re: Anyone can share an example of Java code POSTing a file to Tika-Server?
Posted by Tim Allison <ta...@apache.org>.
Try:
https://github.com/tballison/tika-addons/blob/master/tika-eval-solrj/src/main/java/org/tallison/tikaeval/example/TikaServerClient.java#L148