Posted to dev@tika.apache.org by "Naama Hophstatder (Jira)" <ji...@apache.org> on 2022/03/02 07:14:00 UTC

[jira] [Created] (TIKA-3684) Extract text returns the text multiple times

Naama Hophstatder created TIKA-3684:
---------------------------------------

             Summary: Extract text returns the text multiple times
                 Key: TIKA-3684
                 URL: https://issues.apache.org/jira/browse/TIKA-3684
             Project: Tika
          Issue Type: Bug
          Components: docker
    Affects Versions: 2.1.0
            Reporter: Naama Hophstatder
         Attachments: example.docx

We run the Tika Docker container as a Linux service. When we extract text from a Word document, e.g.:

curl -T example.docx http://localhost:9998/tika --header "Accept: text/plain"

the response contains the document text 3 times.

Note: We also run Tika server v1.14, and that version returns the text only once, as expected.
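
To make the duplication easier to confirm, here is a minimal sketch in Python that sends the same PUT request as the curl command above and counts how often the extracted text repeats. The helper names (extract_text, repeat_count) are hypothetical and not part of Tika. The check assumes the repeated copies are line-for-line identical once blank lines and surrounding whitespace are dropped, which may not hold for every document.

```python
import sys
import urllib.request


def extract_text(path, url="http://localhost:9998/tika"):
    """PUT a file to the Tika server and return the plain-text response.

    Mirrors: curl -T <path> <url> --header "Accept: text/plain"
    """
    with open(path, "rb") as f:
        data = f.read()
    req = urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": "text/plain"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def repeat_count(text):
    """Return how many times the smallest repeating block of lines occurs.

    Returns 1 when the non-blank lines do not consist of exact repeats.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n = len(lines)
    # Try every block length that could tile the whole output at least twice.
    for unit in range(1, n // 2 + 1):
        if n % unit == 0 and lines == lines[:unit] * (n // unit):
            return n // unit
    return 1


if __name__ == "__main__":
    # Usage: python check_repeats.py example.docx
    print(repeat_count(extract_text(sys.argv[1])))
```

Running this against both the 2.1.0 container and the v1.14 server with the attached example.docx should show whether the two versions really differ in how many copies of the text they return.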



--
This message was sent by Atlassian Jira
(v8.20.1#820001)