Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/05/15 14:49:04 UTC

[jira] [Commented] (TIKA-2360) Handle SentimentParser resource failure more robustly

    [ https://issues.apache.org/jira/browse/TIKA-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010631#comment-16010631 ] 

Tim Allison commented on TIKA-2360:
-----------------------------------

My preference would be not to include the SentimentParser by default:

1) network calls that are not currently robustly handled

2) the .sent glob in mime detection, which could cause problems for users who happen to have files with that suffix. Yes, I can't imagine users have a bunch of Apple II files kicking around, but this is a mildly worrisome way of triggering the SentimentParser

3) while very cool, it is a fundamentally different thing than a parser.  It enriches already extracted UTF-8 text, kind of like the phone number handler, etc.  I realize NER does exactly the same thing...I know...
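
As a rough illustration of what "robustly handled" in point 1 could mean, here is a minimal Java sketch, not the SentimentParser's actual code. It assumes a remote resource is fetched by URL, and it bounds the call with timeouts and treats any I/O failure as "resource unavailable" rather than letting it propagate. The class name ResourceFetcher and the method fetch are hypothetical and do not exist in Tika.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Tika.
public class ResourceFetcher {

    /**
     * Tries to read the resource at the given URL.
     * Returns Optional.empty() on any I/O failure (unreachable host,
     * proxy block, timeout, missing file) instead of throwing.
     * Uses InputStream.readAllBytes, so this assumes Java 9 or later.
     */
    public static Optional<byte[]> fetch(URL url, int timeoutMillis) {
        try {
            URLConnection conn = url.openConnection();
            // Bound both connection setup and reads so a dead network
            // cannot hang the caller indefinitely.
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try (InputStream in = conn.getInputStream()) {
                return Optional.of(in.readAllBytes());
            }
        } catch (IOException e) {
            // Caller decides what to do without the resource,
            // e.g. skip the parser or skip the test.
            return Optional.empty();
        }
    }
}
```

A caller could then skip sentiment analysis (or mark a test as skipped) when the result is empty, rather than failing outright.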



> Handle SentimentParser resource failure more robustly
> -----------------------------------------------------
>
>                 Key: TIKA-2360
>                 URL: https://issues.apache.org/jira/browse/TIKA-2360
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Blocker
>
> The SentimentParser tests currently require a network call to GitHub.  For those who work behind a proxy or who would prefer that Tika not make unexpected network calls, can we please turn this off by default?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)