Posted to solr-user@lucene.apache.org by je...@bnf.fr on 2013/09/23 14:32:27 UTC

[DIH] Logging skipped documents

Hello,

I have a question: I am indexing documents, and a small portion of them are
skipped (I am in onError="skip" mode).
I am trying to get a list of them in order to analyse what is wrong with
these documents.
Is there a way to get the list of skipped documents, along with some more
information? (My onError="skip" is on an XPathEntityProcessor; the name of
the file being processed would be enough.)


Regards,
-----------------------------------------------
Jérôme Dupont
Bibliothèque Nationale de France
Département des Systèmes d'Information
Tour T3 - Quai François Mauriac
75706 Paris Cedex 13
telephone: 33 (0)1 53 79 45 40
e-mail: jerome.dupont@bnf.fr
-----------------------------------------------




Re: [DIH] Logging skipped documents

Posted by Stefan Matheis <ma...@gmail.com>.
Jérôme

I just had a quick look at the source of XPathEntityProcessor: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/XPathEntityProcessor.java?view=markup#l324 . It looks like there is a LOG.warn(msg, e); statement on line 331, where msg should include the URL of the document that was being processed?

Otherwise, if that is not where the exception occurs, you might be able to add LOG statements yourself and recompile Solr from source to make that work?

-Stefan  
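
For illustration only, here is a minimal standalone sketch of the "log and skip" pattern discussed above: on a parse error, write a warning that names the offending document, then carry on with the next one. This is not the Solr or DataImportHandler code. The class name SkipLoggingSketch, the method parseAll, and the sample file names are all hypothetical, and it uses only the JDK's XML parser and java.util.logging rather than Solr's own logger.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical illustration; not part of Solr.
public class SkipLoggingSketch {
    private static final Logger LOG =
            Logger.getLogger(SkipLoggingSketch.class.getName());

    /**
     * Tries to parse each XML source (name -> content). A source that
     * fails to parse is logged with its name and skipped.
     * Returns the names of the skipped sources, in input order.
     */
    static List<String> parseAll(Map<String, String> sources) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        List<String> skipped = new ArrayList<>();
        for (Map.Entry<String, String> entry : sources.entrySet()) {
            try {
                // A fresh builder per document avoids reusing parser state
                // after a failed parse.
                DocumentBuilder builder = factory.newDocumentBuilder();
                byte[] bytes = entry.getValue().getBytes(StandardCharsets.UTF_8);
                builder.parse(new ByteArrayInputStream(bytes));
            } catch (Exception e) {
                // The message carries the document name, so the log alone
                // is enough to build the list of skipped documents.
                LOG.log(Level.WARNING,
                        "Skipping document, failed to parse: " + entry.getKey(), e);
                skipped.add(entry.getKey());
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        Map<String, String> sources = new LinkedHashMap<>();
        sources.put("good.xml", "<doc><title>ok</title></doc>");
        sources.put("broken.xml", "<doc><title>unclosed</doc>");
        System.out.println("Skipped: " + parseAll(sources));
    }
}
```

With a warning of this kind in place, the skipped documents could be listed by filtering the log for the warning text. Whether the existing LOG.warn call in XPathEntityProcessor already gives enough detail would need to be checked against your own logs.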

