Posted to user@nutch.apache.org by kamaci <fu...@gmail.com> on 2013/08/21 19:19:27 UTC

Display Document Count Added To Solr Server

Currently there is no way to see how many documents have been added to the
Solr server. The number of documents added should be visible while indexing
is still running (as a Hadoop counter), and the total document count should
be logged once all documents have been added.

I have made a patch for that purpose
(https://issues.apache.org/jira/browse/NUTCH-1631). The patch counts the
documents added to the Solr server and writes the count to the context as a
Hadoop counter, so the number of documents added so far can be followed on
the Hadoop Map/Reduce Administration page while the job runs.

SolrWriter already logs how many documents are added in each batch (at most
the commit size), but it does not log the total count at the end of the
indexing process. The patch also logs the total document count, in addition
to writing it to the Hadoop context as a counter.
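
For readers who have not opened the Jira issue, here is a rough,
self-contained sketch of the idea in plain Java. It is not the code from the
patch. CounterSink, MapCounterSink, CountingBatchWriter, and the counter
group/name "SolrIndexer"/"DOCS_ADDED" are all hypothetical names made up for
this illustration, and the in-memory sink only stands in for a Hadoop
counter so that the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Hadoop counter; not part of Nutch or Hadoop. */
interface CounterSink {
    void increment(String group, String name, long amount);
}

/** In-memory CounterSink so the sketch runs without a Hadoop job. */
class MapCounterSink implements CounterSink {
    private final Map<String, Long> values = new HashMap<>();

    @Override
    public void increment(String group, String name, long amount) {
        values.merge(group + ":" + name, amount, Long::sum);
    }

    public long get(String group, String name) {
        return values.getOrDefault(group + ":" + name, 0L);
    }
}

/** Buffers documents, flushes them in batches, and tracks the running total. */
class CountingBatchWriter {
    private final int commitSize;
    private final CounterSink counters;
    private final List<String> buffer = new ArrayList<>();
    private long totalAdded = 0;

    public CountingBatchWriter(int commitSize, CounterSink counters) {
        this.commitSize = commitSize;
        this.counters = counters;
    }

    /** Queues one document and flushes once the batch reaches commitSize. */
    public void write(String doc) {
        buffer.add(doc);
        if (buffer.size() >= commitSize) {
            flush();
        }
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        int batch = buffer.size();
        // A real writer would send the buffered documents to Solr here.
        buffer.clear();
        totalAdded += batch;
        // Updated per batch, so progress is visible while the job runs.
        counters.increment("SolrIndexer", "DOCS_ADDED", batch);
        System.out.println("Adding " + batch + " documents");
    }

    /** Flushes the last partial batch and logs the overall total once. */
    public void close() {
        flush();
        System.out.println("Total " + totalAdded + " documents added");
    }

    public long getTotalAdded() {
        return totalAdded;
    }
}

class CountingBatchWriterDemo {
    public static void main(String[] args) {
        MapCounterSink counters = new MapCounterSink();
        CountingBatchWriter writer = new CountingBatchWriter(2, counters);
        for (int i = 0; i < 5; i++) {
            writer.write("doc-" + i);
        }
        writer.close();
    }
}
```

In a real job the increment call would go to the Hadoop counter API instead
of the in-memory map, for example context.getCounter(group, name).increment(n)
in the newer mapreduce API or reporter.incrCounter(group, name, n) in the
older mapred API. Which of the two applies depends on the Nutch version.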



--
View this message in context: http://lucene.472066.n3.nabble.com/Display-Document-Count-Added-To-Solr-Server-tp4085918.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Display Document Count Added To Solr Server

Posted by kamaci <fu...@gmail.com>.
Hi Lewis;

Thanks for your comment. Could you also add it as a comment on the Jira
issue (https://issues.apache.org/jira/browse/NUTCH-1631) so that other
committers can see it?




Re: Display Document Count Added To Solr Server

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Nice work.
The patch looks good, and I am +1 on getting it into the codebase.
Thanks
Lewis

On Wednesday, August 21, 2013, kamaci <fu...@gmail.com> wrote:
> Currently you can not see how many documents are added to Solr Server. One
> could see how many documents are added to Solr server simultaneously (as a
> hadoop counter) and after all documents are added total document count
> should be logged too.
>
> I have made a patch for that purpose
> (https://issues.apache.org/jira/browse/NUTCH-1631). That patch counts
> documents added to Solr Server and writes it to context as a Hadoop
> counter.
> So one can see how many documents are added simultaneously at Hadoop
> Map/Reduce Administration page.
>
> On the other hand SolrWriter logs how many documents are added at each
> batch (maximum of commit size) but does not log total count at the end of
> indexing process. That patch also logs total document count as well as
> writing to Hadoop context as a counter.

-- 
*Lewis*