Posted to dev@nutch.apache.org by "Tejas Patil (JIRA)" <ji...@apache.org> on 2013/12/23 06:48:54 UTC

[jira] [Commented] (NUTCH-1689) Improve CrawlDb stats

    [ https://issues.apache.org/jira/browse/NUTCH-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13855416#comment-13855416 ] 

Tejas Patil commented on NUTCH-1689:
------------------------------------

Some concerns:
1. You are removing fields from the output, but people may rely on the existing output (for example, grepping or awking for the fields they need). It is not wise to simply remove all of those fields. Keep the output backward compatible.
2. You could make the command configurable so that users can select which fields they want in the output.
3. When submitting a patch, commenting out the older code is not the best approach. Remove those lines instead.
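
As a rough illustration of points 1 and 2 together, here is a minimal sketch of selecting output fields from a comma-separated option while defaulting to every field, so that existing scripts see unchanged output. The class name, the method name, and the field names are all hypothetical and are not taken from the patch or from the Nutch code base.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Nutch: resolves which fields the stats
// output should contain.
class StatsFieldSelector {
    // Assumed field names for illustration only; the real stats fields may differ.
    static final List<String> ALL_FIELDS =
            Arrays.asList("status", "fetchTime", "score", "retries");

    // Returns the requested fields in the order given. When nothing is
    // requested, returns all fields so the default output stays unchanged.
    static List<String> select(String requested) {
        if (requested == null || requested.trim().isEmpty()) {
            return new ArrayList<>(ALL_FIELDS);
        }
        Set<String> chosen = new LinkedHashSet<>(); // keeps order, drops duplicates
        for (String name : requested.split(",")) {
            String field = name.trim();
            if (field.isEmpty()) {
                continue; // tolerate stray commas
            }
            if (!ALL_FIELDS.contains(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            chosen.add(field);
        }
        // A value such as "," names no fields; fall back to the default.
        return chosen.isEmpty() ? new ArrayList<>(ALL_FIELDS) : new ArrayList<>(chosen);
    }

    public static void main(String[] args) {
        String requested = args.length > 0 ? args[0] : null;
        System.out.println(select(requested));
    }
}
```

With this shape, only the selected fields would need to be loaded from the store, which keeps the speed-up for users who opt in, while a run without the option behaves as before.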

> Improve CrawlDb stats
> ---------------------
>
>                 Key: NUTCH-1689
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1689
>             Project: Nutch
>          Issue Type: Improvement
>            Reporter: Nguyen Manh Tien
>            Priority: Minor
>             Fix For: 2.3
>
>         Attachments: NUTCH-1689.patch
>
>
> CrawlDb stats is currently slow because it loads all fields from the store. I changed it to load only the necessary fields.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)