Posted to dev@nutch.apache.org by "Rogério Pereira Araújo (JIRA)" <ji...@apache.org> on 2012/11/02 19:35:11 UTC

[jira] [Created] (NUTCH-1489) elasticindex should report the indexed documents like solrindex does

Rogério Pereira Araújo created NUTCH-1489:
---------------------------------------------

             Summary: elasticindex should report the indexed documents like solrindex does
                 Key: NUTCH-1489
                 URL: https://issues.apache.org/jira/browse/NUTCH-1489
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 2.1
            Reporter: Rogério Pereira Araújo
            Priority: Trivial


When I run:

nutch elasticindex elasticsearch

to index crawled documents in a standard Elasticsearch setup, the process takes some time and finishes, but doesn't report how many documents were indexed. It would be nice to have the same feedback as solrindex.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (NUTCH-1489) elasticindex should report the indexed documents like solrindex does

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490629#comment-13490629 ] 

Lewis John McGibbney commented on NUTCH-1489:
---------------------------------------------

Hi Rogério, having come back to look at this this morning, I am struggling to see what other logging you would like over and above what is currently implemented in ElasticWriter.
Further to your comments, I propose to close this issue, as the existing logging seems explicit enough to me.
                


[jira] [Commented] (NUTCH-1489) elasticindex should report the indexed documents like solrindex does

Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493829#comment-13493829 ] 

Ferdy Galema commented on NUTCH-1489:
-------------------------------------

I agree with Lewis: it seems there is already logging of the number of docs (in the indexer task, "total docs = XXX"). If you deploy distributed and want to see totals, check the MapReduce counter "Map output records=XXX", which aggregates the docs across all tasks.

Rogério, if you are convinced, please let us know; otherwise we can close this off.
                


[jira] [Commented] (NUTCH-1489) elasticindex should report the indexed documents like solrindex does

Posted by "Lewis John McGibbney (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489689#comment-13489689 ] 

Lewis John McGibbney commented on NUTCH-1489:
---------------------------------------------

Hi Rogério, yes, you are right here. This will be simple to implement and a patch would be very much appreciated :) Specifically, you need to hack o.a.n.indexer.elastic.ElasticWriter#ElasticWriter(); however, logging identical to that of the existing Solr implementation in the other classes of the elastic package would also be very much appreciated.
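
For illustration only, here is a minimal sketch of the kind of count reporting discussed in this issue, assuming a writer that buffers documents and logs a total when it is closed. The class CountingIndexWriter, its batchSize parameter, and the log wording are hypothetical; this is not the actual ElasticWriter code, and a real writer would send a bulk request to Elasticsearch where the sketch only clears its buffer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/**
 * Hypothetical stand-in for an index writer that reports how many
 * documents it has handled. Not part of Nutch; names are illustrative.
 */
final class CountingIndexWriter implements AutoCloseable {
    private static final Logger LOG =
            Logger.getLogger(CountingIndexWriter.class.getName());

    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    private long totalDocs = 0;

    CountingIndexWriter(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
    }

    /** Buffers one document id and flushes once the batch is full. */
    void write(String docId) {
        pending.add(docId);
        totalDocs++;
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Empties the buffer; a real writer would issue a bulk request here. */
    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        LOG.info("flushed " + pending.size() + " docs");
        pending.clear();
    }

    long getTotalDocs() {
        return totalDocs;
    }

    /** Flushes any remainder and reports the total for this writer. */
    @Override
    public void close() {
        flush();
        LOG.info("total docs = " + totalDocs);
    }

    public static void main(String[] args) {
        try (CountingIndexWriter writer = new CountingIndexWriter(2)) {
            writer.write("doc-1");
            writer.write("doc-2");
            writer.write("doc-3");
        }
    }
}
```

In a distributed run each task would hold its own writer, so a total logged this way is per task; a job-wide figure would still have to come from an aggregated counter.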
                
