Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2014/08/24 06:35:11 UTC

[jira] [Resolved] (SOLR-6385) Strange behavior on indexing document with wrong date format

     [ https://issues.apache.org/jira/browse/SOLR-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson resolved SOLR-6385.
----------------------------------

    Resolution: Duplicate

Discussed on the mailing list. Plus, the whole question of what to do when one document in a batch is bad, or how to deal with errors in asynchronous requests, is tracked in other JIRAs.
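
For readers who hit the same batch problem, one possible client-side workaround is sketched below. It is only an illustration, not something this issue or Solr provides: BatchSender and BatchFallback are hypothetical names, and BatchSender merely stands in for a client call such as solrj's add(Collection). The sketch tries the whole batch first and, if that throws, resends the documents one at a time and collects the ones that are rejected individually.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a client call such as add(Collection); not a solrj type. */
interface BatchSender<D> {
    void send(Collection<D> docs) throws Exception;
}

/** Hypothetical helper: batch first, then per-document resend on failure. */
final class BatchFallback {
    private BatchFallback() {}

    /**
     * Sends the whole batch; if that throws, resends each document on its own.
     * Returns the documents that were rejected when sent individually.
     */
    static <D> List<D> sendWithFallback(List<D> docs, BatchSender<D> sender) {
        List<D> rejected = new ArrayList<>();
        try {
            sender.send(docs);
            return rejected; // whole batch accepted, nothing rejected
        } catch (Exception batchFailure) {
            // Batch failed somewhere; fall through and retry document by document.
        }
        for (D doc : docs) {
            try {
                sender.send(Collections.singletonList(doc));
            } catch (Exception e) {
                rejected.add(doc); // remember the document the server refused
            }
        }
        return rejected;
    }
}
```

Because part of a failed batch may already have been indexed (as the report below describes), resending every document assumes that re-adding a document with the same unique key simply overwrites the earlier copy.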

> Strange behavior on indexing document with wrong date format
> ------------------------------------------------------------
>
>                 Key: SOLR-6385
>                 URL: https://issues.apache.org/jira/browse/SOLR-6385
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 4.7.2
>         Environment: Solr server in Windows 7, solrj
>            Reporter: Denis Shishlyannikoc
>            Priority: Critical
>
> Hello.
> I have only recently started working with Solr and do not have much experience with it yet, so some of the problems I describe here may be due to a lack of knowledge.
> Excuse me for that.
> Problems that I saw:
> 1) I use solrj to index a collection of SolrInputDocuments.
> To do this I call the add(Collection) method of a CloudSolrServer object.
> As an experiment I tried to index one of the documents with an invalid date:
> I took the valid Solr date value of one of these SolrInputDocuments and changed the "T" symbol in it to "K".
> (this date is defined in schema.xml as 
> <field name="mydate" type="tdate" indexed="true" stored="true" multiValued="false" />	)
> Solr failed to index the collection and returned a SolrServerException.
> In addition, only part of the documents in this collection were indexed correctly: the document with the problematic date failed to be indexed, together with several valid (from all points of view) SolrInputDocuments of this collection.
> It looks like Solr went through the documents in the collection, indexing them one by one, threw an exception on the document with the problematic date, and then did not index any of the valid documents that came after it.
> 2) After the failure described in 1), Solr kept the problematic document in some queue and tried to reindex it again (roughly one attempt every 3-5 minutes; I did not measure the exact interval), showing the same (failed to parse date) exception in the logs! After a Solr server restart the issue is gone: no more attempts to reindex the problematic document.
> Questions to be answered:
> 1) What is the default behavior of Solr when indexing fields with problematic values?
> For example, for a date field: I expect Solr to index a null date (instead of not indexing the whole document), write a warning to the logs, and return some indication of the problem in the UpdateResponse.
> Maybe Solr's behavior on invalid field values should be configurable (defined in some XML element in the schema).
> 2) While indexing a collection of documents, should Solr index all valid documents (and not return on the first problem, as happens now)?
> If I index a collection of documents, I expect Solr to index all valid (from all points of view) documents and return an indexing status in the UpdateResponse for all problematic documents that were not indexed.
> 3) Why does Solr try to reindex the problematic document? This looks like a bug that can create useless load on the server.
> If this behavior is by design, how can I force Solr to stop reindexing such problematic documents (without restarting the Solr server)?
> Where can I read about it?
> Thank you.
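
The malformed date in the report ("T" replaced by "K") could be caught on the client before the batch is sent. The following is a minimal sketch under stated assumptions: DateCheck and its methods are hypothetical names, and java.time.Instant.parse is used only as an approximation of the ISO-8601 UTC form (for example 2014-08-24T06:35:11Z) that Solr date fields expect. It may not match Solr's own parser in every case.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side pre-check for date strings; not part of solrj. */
final class DateCheck {
    private DateCheck() {}

    /** Returns true if the value parses as an ISO-8601 UTC instant. */
    static boolean looksLikeSolrDate(String value) {
        if (value == null) {
            return false; // Instant.parse would throw NullPointerException
        }
        try {
            Instant.parse(value);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** Returns the values that fail the check, in their original order. */
    static List<String> invalidDates(List<String> values) {
        List<String> invalid = new ArrayList<>();
        for (String v : values) {
            if (!looksLikeSolrDate(v)) {
                invalid.add(v);
            }
        }
        return invalid;
    }
}
```

Documents whose date value fails this check could be logged and left out of the collection passed to add(Collection), so that one bad value does not abort the rest of the batch.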


