Posted to dev@nutch.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/05/23 08:34:04 UTC

[jira] [Commented] (NUTCH-2388) bin/crawl indexing only webpages containing batchID instead of all in 2.x

    [ https://issues.apache.org/jira/browse/NUTCH-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020807#comment-16020807 ] 

ASF GitHub Bot commented on NUTCH-2388:
---------------------------------------

kaidul opened a new pull request #191: NUTCH-2388 bin/crawl indexing only webpages of current batch instead of all
URL: https://github.com/apache/nutch/pull/191
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> bin/crawl indexing only webpages containing batchID instead of all in 2.x
> -------------------------------------------------------------------------
>
>                 Key: NUTCH-2388
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2388
>             Project: Nutch
>          Issue Type: Bug
>          Components: bin
>    Affects Versions: 2.3
>            Reporter: Kaidul Islam
>            Assignee: Kaidul Islam
>            Priority: Trivial
>             Fix For: 2.4
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> During each iteration, after generating, fetching, parsing, and updating the DB for the current batch, the indexer is supposed to index only the current batch as well. Instead, it currently indexes all pages:
> {code}
> __bin_nutch index $commonOptions -D solr.server.url=$SOLRURL -all -crawlId "$CRAWL_ID"
> {code}
> It should probably look like this instead:
> {code}
> __bin_nutch index $commonOptions -D solr.server.url=$SOLRURL $batchId -crawlId "$CRAWL_ID"
> {code}
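
The difference between the two invocations can be sketched without a Nutch installation. In this minimal sketch, `__bin_nutch` is a hypothetical stand-in that only echoes the command line it receives (the real helper in bin/crawl runs Nutch), and the values of `SOLRURL`, `CRAWL_ID`, and `batchId` are illustrative assumptions rather than values taken from the issue.

```shell
#!/bin/sh
# Hypothetical stand-in for the crawl script's __bin_nutch helper:
# it only echoes the command it was given, so nothing is actually indexed.
__bin_nutch() {
  echo "nutch $*"
}

# Illustrative values (assumptions); the real script derives them from its arguments.
SOLRURL="http://localhost:8983/solr"
CRAWL_ID="demo"
commonOptions=""

# Behavior reported in the issue: every iteration asks the indexer for all pages.
index_all() {
  __bin_nutch index $commonOptions -D solr.server.url=$SOLRURL -all -crawlId "$CRAWL_ID"
}

# Proposed behavior: pass the batch ID of the current iteration instead of -all.
index_batch() {
  __bin_nutch index $commonOptions -D solr.server.url=$SOLRURL "$1" -crawlId "$CRAWL_ID"
}

# Illustrative batch ID; the format is an assumption made for this sketch.
batchId="$(date +%s)-$$"

index_all
index_batch "$batchId"
```

Running the sketch prints both command lines, which makes it easy to see that only the third argument group changes: `-all` versus the batch ID of the current iteration.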



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)