Posted to dev@nutch.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2009/02/12 05:13:59 UTC
[jira] Commented: (NUTCH-676) MapWritable is written inefficiently and confusingly
[ https://issues.apache.org/jira/browse/NUTCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672870#action_12672870 ]
Hudson commented on NUTCH-676:
------------------------------
Integrated in Nutch-trunk #722 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/722/])
NUTCH-683 - broke CrawlDbMerger
> MapWritable is written inefficiently and confusingly
> ----------------------------------------------------
>
> Key: NUTCH-676
> URL: https://issues.apache.org/jira/browse/NUTCH-676
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Reporter: Todd Lipcon
> Assignee: Doğacan Güney
> Priority: Minor
> Fix For: 1.0.0
>
> Attachments: 0001-NUTCH-676-Replace-MapWritable-implementation-with-t.patch, NUTCH-676_v2.patch, NUTCH-676_v3.patch
>
>
> The MapWritable implementation in o.a.n.crawl is written confusingly: it maintains its own internal linked list, which I think may have a bug somewhere (I'm getting an NPE in certain cases in the code, though it's hard to track down).
> Can anyone comment as to why MapWritable is written the way it is, rather than just using a HashMap or a LinkedHashMap if consistent ordering is important? I imagine that would improve performance.
> What about just using the Hadoop MapWritable? Obviously that would break some backwards compatibility, but it may be a good idea at some point to reduce confusion (I didn't realize that Nutch had its own implementation until a few minutes ago).
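The HashMap/LinkedHashMap suggestion in the issue description can be sketched roughly as follows. This is an illustrative sketch only, not the patch attached to NUTCH-676 and not Hadoop's MapWritable: the class name SimpleMapWritable, its String-only keys and values, and the roundTrip helper are all assumptions made for the example. It mirrors the write(DataOutput)/readFields(DataInput) shape of Hadoop's Writable interface but depends only on java.io and java.util so that it stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical class for illustration; not Nutch's or Hadoop's MapWritable.
class SimpleMapWritable {
    // LinkedHashMap keeps insertion order, so serialized output is stable
    // without a hand-rolled linked list.
    private final Map<String, String> entries = new LinkedHashMap<>();

    public void put(String key, String value) {
        // writeUTF cannot encode null, so reject it up front.
        Objects.requireNonNull(key, "key");
        Objects.requireNonNull(value, "value");
        entries.put(key, value);
    }

    public String get(String key) {
        return entries.get(key);
    }

    public int size() {
        return entries.size();
    }

    public Iterable<String> keys() {
        return entries.keySet();
    }

    // Serialize as: entry count, then each key/value pair in insertion order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(entries.size());
        for (Map.Entry<String, String> e : entries.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    // Replace the current contents with what is read from the stream.
    public void readFields(DataInput in) throws IOException {
        entries.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            entries.put(key, value);
        }
    }

    // Helper (assumption, for demonstration): serialize to bytes and read back.
    static SimpleMapWritable roundTrip(SimpleMapWritable original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            SimpleMapWritable copy = new SimpleMapWritable();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Delegating storage to a standard collection leaves only the wire format to maintain. Whether this is actually faster than the existing implementation would need to be measured.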
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.