You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Lance Norskog (JIRA)" <ji...@apache.org> on 2011/01/18 04:56:46 UTC

[jira] Commented: (SOLR-445) XmlUpdateRequestHandler bad documents mid batch aborts rest of batch

    [ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982997#action_12982997 ] 

Lance Norskog commented on SOLR-445:
------------------------------------

bq. 2> From the original post, rolling this back will be tricky. Very tricky. The autocommit feature makes it indeterminate what's been committed to the index, so I don't know how to even approach rolling back everything.
Don't allow autocommits during an update. Simple. Or, rather, all update requests block at the beginning during an autocommit. If an update request has too many documents, don't do so many documents in an update.



> XmlUpdateRequestHandler bad documents mid batch aborts rest of batch
> --------------------------------------------------------------------
>
>                 Key: SOLR-445
>                 URL: https://issues.apache.org/jira/browse/SOLR-445
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Will Johnson
>            Assignee: Erick Erickson
>             Fix For: Next
>
>         Attachments: SOLR-445.patch, solr-445.xml
>
>
> Has anyone run into the problem of handling bad documents / failures mid batch.  Ie:
> <add>
>   <doc>
>     <field name="id">1</field>
>   </doc>
>   <doc>
>     <field name="id">2</field>
>     <field name="myDateField">I_AM_A_BAD_DATE</field>
>   </doc>
>   <doc>
>     <field name="id">3</field>
>   </doc>
> </add>
> Right now solr adds the first doc and then aborts.  It would seem like it should either fail the entire batch or log a message/return a code and then continue on to add doc 3.  Option 1 would seem to be much harder to accomplish and possibly require more memory while Option 2 would require more information to come back from the API.  I'm about to dig into this but I thought I'd ask to see if anyone had any suggestions, thoughts or comments.    

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org