Posted to dev@lucene.apache.org by "Koji Sekiguchi (JIRA)" <ji...@apache.org> on 2011/05/12 06:09:47 UTC

[jira] [Created] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

uima: add an ability to skip runtime error in AnalysisEngine
------------------------------------------------------------

                 Key: SOLR-2512
                 URL: https://issues.apache.org/jira/browse/SOLR-2512
             Project: Solr
          Issue Type: Improvement
    Affects Versions: 3.1
            Reporter: Koji Sekiguchi
            Priority: Minor
             Fix For: 3.2, 4.0


Currently, if AnalysisEngine throws an exception while processing a text, the whole document add fails. Because online NLP services are error-prone, users should be able to choose whether Solr skips the text processing for that document (while still indexing the source text) or throws a runtime exception so that Solr stops adding documents entirely.
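
As a rough illustration of the choice described above, here is a minimal sketch. It is not the code from any attached patch: the class and method names are hypothetical, the analysis step is modeled as a plain function instead of a UIMA AnalysisEngine, and the flag name ignoreErrors is borrowed from later in this thread.

```java
import java.util.function.UnaryOperator;

/** Hypothetical sketch of a skip-or-fail switch; not Solr's actual code. */
final class AnalysisErrorPolicy {

    /**
     * Runs the analysis step on the text. On failure it either rethrows
     * (ignoreErrors == false) or returns the source text unchanged
     * (ignoreErrors == true), so the document can still be indexed.
     */
    static String analyzeOrSkip(String text, UnaryOperator<String> analysis, boolean ignoreErrors) {
        try {
            return analysis.apply(text);
        } catch (RuntimeException e) {
            if (!ignoreErrors) {
                // Propagate the failure so the caller can stop adding documents.
                throw new RuntimeException("text processing failed", e);
            }
            // A real processor would log a warning here before continuing.
            return text;
        }
    }
}
```

With ignoreErrors set to true, a failing analysis leaves the source text untouched; with false, the exception reaches the caller.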

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032339#comment-13032339 ] 

Koji Sekiguchi commented on SOLR-2512:
--------------------------------------

Hi Tommaso, thank you for updating the patch!

In my patch, I log the first 100 chars of the target text in the error message, because an online NLP service I'm using is error-prone when I post a large text. But you are using SolrInputDocument in the updated patch. I'd prefer my approach to logging the whole Solr document.

I think that users who set ignoreErrors=true want to know that an error occurred, but don't want to see the whole document in the error message.
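
The truncation idea above could be sketched as follows. This is only an illustration under assumptions: the helper name, the constant, and the trailing "..." marker are hypothetical and not taken from the patch.

```java
/** Hypothetical helper; the names are illustrative, not from the patch. */
final class LogExcerpt {
    static final int MAX_CHARS = 100;

    /** Returns at most the first MAX_CHARS characters of text, marking any truncation. */
    static String excerpt(String text) {
        if (text == null) {
            return "";
        }
        if (text.length() <= MAX_CHARS) {
            return text;
        }
        // Keep only the head of a large text so the log line stays short.
        return text.substring(0, MAX_CHARS) + "...";
    }
}
```

An error message would then embed LogExcerpt.excerpt(text) instead of the full document.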


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032265#comment-13032265 ] 

Tommaso Teofili commented on SOLR-2512:
---------------------------------------

Hello Koji, in the first implementations (see SOLR-2129) the UIMAUpdateProcessor ignored errors in UIMA pipelines, and I thought it was good to take control of what was happening when any exception was thrown. However, I get your point, and my opinion is that this behavior should be configurable with a parameter like <bool name="ignore-errors">true</bool>.


[jira] [Updated] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2512:
---------------------------------

    Attachment: SOLR-2512.patch

A draft patch is attached. It doesn't include the switch.


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033060#comment-13033060 ] 

Tommaso Teofili commented on SOLR-2512:
---------------------------------------

+1


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033058#comment-13033058 ] 

Koji Sekiguchi commented on SOLR-2512:
--------------------------------------

I'll commit soon.


[jira] [Issue Comment Edited] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032265#comment-13032265 ] 

Tommaso Teofili edited comment on SOLR-2512 at 5/12/11 8:10 AM:
----------------------------------------------------------------

Hello Koji, in the first implementations (see SOLR-2129) the UIMAUpdateProcessor ignored errors in UIMA pipelines, but others and I thought it was better to take control of what happens when an exception is thrown rather than ignore it. However, I get your point, and my opinion is that this behavior should be configurable with a parameter like <bool name="ignoreErrors">true</bool>.

      was (Author: teofili):
    Hello Koji, in the first implementations (see SOLR-2129) the UIMAUpdateProcessor was ignoring errors on  UIMA pipelines and I thought it was good to take control of what was happening and if any exception was thrown, however I get your point and my opinion is that that behavior should be configurable with a parameter like <bool name="ignoreErrors">true</bool>.
  

[jira] [Updated] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2512:
---------------------------------

    Attachment: SOLR-2512.patch

A new patch. I added a test case for the flag true|false.

About logging the uniqueKey: yes, I could get the uniqueKey, but it cannot be taken from cmd without the schema. So I understand the idea in your patch.


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032357#comment-13032357 ] 

Tommaso Teofili commented on SOLR-2512:
---------------------------------------

bq. I think that users who set ignoreErrors=true want to know that an error occurred, but don't want to see the whole document in the error message.

You're right, Koji. Considering your comment, I wonder whether it would be better to log the unique id, so that one can easily find and debug the document that caused the error without having to see its text in the log.


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032447#comment-13032447 ] 

Koji Sekiguchi commented on SOLR-2512:
--------------------------------------

If no one objects, I'll commit tomorrow.


[jira] [Updated] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated SOLR-2512:
----------------------------------

    Attachment: SOLR-2512.patch

First patch, which adds a configuration parameter <bool name="ignoreErrors"> to decide whether errors in UIMA pipeline execution should prevent document indexing.


[jira] [Issue Comment Edited] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032265#comment-13032265 ] 

Tommaso Teofili edited comment on SOLR-2512 at 5/12/11 8:07 AM:
----------------------------------------------------------------

Hello Koji, in the first implementations (see SOLR-2129) the UIMAUpdateProcessor ignored errors in UIMA pipelines, and I thought it was good to take control of what was happening when any exception was thrown. However, I get your point, and my opinion is that this behavior should be configurable with a parameter like <bool name="ignoreErrors">true</bool>.

      was (Author: teofili):
    Hello Koji, in the first implementations (see SOLR-2129) the UIMAUpdateProcessor was ignoring errors on  UIMA pipelines and I thought it was good to take control of what was happening and if any exception was thrown, however I get your point and my opinion is that that behavior should be configurable with a parameter like <bool name="ignore-errors">true</bool>.
  

[jira] [Updated] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2512:
---------------------------------

    Attachment: SOLR-2512.patch

Updated patch attached. I understand the requirement for logging the uniqueKey. In this patch, I introduced an optional parameter, logField:

{code}
<bool name="ignoreErrors">true</bool>
<!-- This is optional. It is used for logging when text processing fails. Usually set to the uniqueKey field name. -->
<str name="logField">id</str>
{code}

It is effective regardless of the ignoreErrors setting (see the patch).
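
To show how a logField setting might be used when a document fails, here is a minimal sketch. It makes several assumptions: a plain Map stands in for the Solr input document, the class and method names are hypothetical, and the fallback to a 100-char text excerpt when logField is absent is a guess based on the earlier comments, not a description of the committed patch.

```java
import java.util.Map;

/** Hypothetical sketch: a Map stands in for a Solr input document. */
final class FailureLabel {

    /**
     * Picks what to show in the log for a failed document: the value of
     * logField when it is configured and present, otherwise a short excerpt
     * of the text (an assumed fallback).
     */
    static String describe(Map<String, Object> doc, String logField, String text) {
        if (logField != null && doc.get(logField) != null) {
            return logField + "=" + doc.get(logField);
        }
        String excerpt = text.length() > 100 ? text.substring(0, 100) + "..." : text;
        return "text=\"" + excerpt + "\"";
    }
}
```

With logField set to id, the label for a failing document would read id=<value>, which identifies the document without putting its text in the log.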


[jira] [Assigned] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi reassigned SOLR-2512:
------------------------------------

    Assignee: Koji Sekiguchi


[jira] [Resolved] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi resolved SOLR-2512.
----------------------------------

    Resolution: Fixed

trunk: Committed revision 1102785.
3x: Committed revision 1102789.


[jira] [Updated] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Koji Sekiguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2512:
---------------------------------

    Attachment: SOLR-2512.patch

README.txt updated.


[jira] [Commented] (SOLR-2512) uima: add an ability to skip runtime error in AnalysisEngine

Posted by "Tommaso Teofili (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032483#comment-13032483 ] 

Tommaso Teofili commented on SOLR-2512:
---------------------------------------

One more thing I'd change is using StringBuilder with append() instead of String concatenation ("some string" + "another string") inside the catch block of the UIMAUpdateRequestProcessor.processAdd() method (I did so in my patch), since it's more efficient.

Still, I'm not sure logging the first 100 chars of text is a good idea, but you're right that we would have to carry the schema information to know which field is the uniqueKey, and this would introduce unnecessary coupling between the two classes.
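
The StringBuilder suggestion above might look roughly like this. It is a sketch only: the class name, method name, and message wording are made up for illustration and do not come from either patch.

```java
/** Hypothetical message builder; the wording and names are illustrative only. */
final class ErrorMessage {

    /** Assembles a log message, appending the optional parts only when present. */
    static String build(String reason, String docLabel) {
        StringBuilder sb = new StringBuilder("text processing failed");
        if (reason != null) {
            sb.append(": ").append(reason);
        }
        if (docLabel != null) {
            sb.append(" [").append(docLabel).append(']');
        }
        return sb.toString();
    }
}
```

A builder is most useful when parts are appended conditionally or in a loop, as here; for a single concatenation expression, Java compilers already generate comparable code.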

