Posted to oak-issues@jackrabbit.apache.org by "Paul Chibulcuteanu (JIRA)" <ji...@apache.org> on 2017/06/21 13:48:00 UTC

[jira] [Updated] (OAK-6377) Text extraction with oak-run and tika requires fake string in the command to work

     [ https://issues.apache.org/jira/browse/OAK-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Chibulcuteanu updated OAK-6377:
------------------------------------
    Description: 
According to the [text-extraction documentation| https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md#step-3---perform-the-text-extraction] there is currently no need to set a segmentstore for the extract command.

{code}
    java -cp tika-app-1.15.jar:oak-run.jar \
    org.apache.jackrabbit.oak.run.Main tika \
    --data-file binary-stats.csv \
    --store-path ./store  \
    --fds-path /path/to/datastore  --extract
{code}

However, the command parser still expects a string argument, so the command above does not work as documented. The workaround is to append a fake string at the end of the command.
e.g.:
{code}
java -cp .......... --extract fakestore
{code}

  was:
According to the [text-extraction documentation| https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md#step-3---perform-the-text-extraction] there is currently no need to set a segmentstore for the extract command.

{code}
    java -cp tika-app-1.15.jar:oak-run.jar \
    org.apache.jackrabbit.oak.run.Main tika \
    --data-file binary-stats.csv \
    --store-path ./store  \
    --fds-path /path/to/datastore  extract
{code}

The command parser expects a string option so the workaround for this would be to provide a fake string at the end. 
e.g:
{code}
java -cp .......... --extract fakestore
{code}


> Text extraction with oak-run and tika requires fake string in the command to work
> ---------------------------------------------------------------------------------
>
>                 Key: OAK-6377
>                 URL: https://issues.apache.org/jira/browse/OAK-6377
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene, run
>    Affects Versions: 1.8, 1.7.2
>            Reporter: Paul Chibulcuteanu
>            Priority: Minor
>
> According to the [text-extraction documentation| https://github.com/apache/jackrabbit-oak/blob/trunk/oak-doc/src/site/markdown/query/pre-extract-text.md#step-3---perform-the-text-extraction] there is currently no need to set a segmentstore for the extract command.
> {code}
>     java -cp tika-app-1.15.jar:oak-run.jar \
>     org.apache.jackrabbit.oak.run.Main tika \
>     --data-file binary-stats.csv \
>     --store-path ./store  \
>     --fds-path /path/to/datastore  --extract
> {code}
> However, the command parser still expects a string argument, so the command above does not work as documented. The workaround is to append a fake string at the end of the command.
> e.g.:
> {code}
> java -cp .......... --extract fakestore
> {code}
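
For reference, the workaround can be sketched as one full command by combining the documented invocation with the trailing fake string. The jar names and paths are the ones quoted in the report. The PLACEHOLDER variable is only an illustrative name, and the assumption that any non-empty string would do is inferred from the report's wording, not confirmed against the parser.

```shell
#!/bin/sh
# Trailing string the command parser expects. "fakestore" is the value used
# in the report; that any other string works equally well is an assumption.
PLACEHOLDER="fakestore"

# Documented extract command (jar names and paths as quoted in the report),
# with the placeholder appended as the last argument.
set -- java -cp tika-app-1.15.jar:oak-run.jar \
    org.apache.jackrabbit.oak.run.Main tika \
    --data-file binary-stats.csv \
    --store-path ./store \
    --fds-path /path/to/datastore --extract \
    "$PLACEHOLDER"

# Print the assembled command for review; replace `echo` with `exec` to run it.
echo "$@"
```

The script only prints the command, so it can be reviewed before anything is executed against a real datastore.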



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)