Posted to dev@nutch.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2015/03/26 05:33:54 UTC
[jira] [Commented] (NUTCH-1959) Improving CommonCrawlFormat implementations
[ https://issues.apache.org/jira/browse/NUTCH-1959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381350#comment-14381350 ]
Chris A. Mattmann commented on NUTCH-1959:
------------------------------------------
Hi [~gostep] and [~lewismc], a question: did NUTCH-1974 include this patch? I just tried to apply it, and it looks as if I'm trying to reverse the patch. If so, can you please close this issue, since I committed and closed NUTCH-1974? Let me know. Thanks!
> Improving CommonCrawlFormat implementations
> -------------------------------------------
>
> Key: NUTCH-1959
> URL: https://issues.apache.org/jira/browse/NUTCH-1959
> Project: Nutch
> Issue Type: Improvement
> Affects Versions: 1.9
> Reporter: Giuseppe Totaro
> Assignee: Chris A. Mattmann
> Priority: Minor
> Attachments: NUTCH-1959.patch, NUTCH-1959.v02.patch
>
>
> {{CommonCrawlFormat}} is an interface for Java classes that implement methods for writing data into Common Crawl format. {{AbstractCommonCrawlFormat}} is an abstract class that implements {{CommonCrawlFormat}} and provides abstract methods for "CommonCrawl formatter" classes.
> Attached is a patch that includes some improvements for {{CommonCrawlFormat}}-based classes:
> * {{CommonCrawlFormat}} and {{AbstractCommonCrawlFormat}} now provide only the {{getJsonData()}} method, which is responsible for producing the JSON data.
> * {{AbstractCommonCrawlFormat}} also provides the abstract methods that each subclass has to implement in order to handle JSON objects.
> * {{CommonCrawlFormatSimple}} is a {{StringBuilder}}-based formatter that now also escapes JSON string values.
> This patch aims to provide a better interface for implementing and extending {{CommonCrawlFormat}} classes.
> I would really appreciate your feedback.
> Thanks a lot,
> Giuseppe
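The design described in the issue above (one public getJsonData() entry point, abstract hooks implemented by each subclass, and a StringBuilder-based formatter that escapes string values) can be sketched roughly as follows. This is an illustration only, not the code from the attached patches: every class and method name other than getJsonData() is hypothetical, and the flat map of string fields is an assumption made to keep the example small.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the CommonCrawlFormat interface:
// callers only see a single method that returns the JSON text.
interface SketchFormat {
    String getJsonData();
}

// Hypothetical stand-in for AbstractCommonCrawlFormat: it drives the
// traversal and leaves the JSON-building steps to subclasses.
abstract class AbstractSketchFormat implements SketchFormat {
    private final Map<String, String> fields = new LinkedHashMap<>();

    AbstractSketchFormat(Map<String, String> fields) {
        this.fields.putAll(fields);
    }

    @Override
    public String getJsonData() {
        startObject();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            writeKeyValue(e.getKey(), e.getValue());
        }
        closeObject();
        return generateJson();
    }

    // Hooks (assumed names) that each concrete formatter must implement.
    protected abstract void startObject();
    protected abstract void writeKeyValue(String key, String value);
    protected abstract void closeObject();
    protected abstract String generateJson();
}

// Hypothetical StringBuilder-based formatter that escapes string values.
final class SimpleSketchFormat extends AbstractSketchFormat {
    private final StringBuilder sb = new StringBuilder();
    private boolean first = true;

    SimpleSketchFormat(Map<String, String> fields) {
        super(fields);
    }

    @Override
    protected void startObject() {
        sb.setLength(0); // reset so getJsonData() can be called repeatedly
        first = true;
        sb.append('{');
    }

    @Override
    protected void writeKeyValue(String key, String value) {
        if (!first) {
            sb.append(',');
        }
        first = false;
        sb.append('"').append(escape(key)).append("\":\"")
          .append(escape(value)).append('"');
    }

    @Override
    protected void closeObject() {
        sb.append('}');
    }

    @Override
    protected String generateJson() {
        return sb.toString();
    }

    // Escapes quotes, backslashes, and control characters for a JSON string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the 4-digit hex form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}

class SketchDemo {
    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", "http://example.org/");
        fields.put("title", "He said \"hi\"\n");
        System.out.println(new SimpleSketchFormat(fields).getJsonData());
    }
}
```

Keeping getJsonData() as the only public method means callers do not depend on how a particular formatter assembles its output, so a subclass backed by a JSON library could replace the StringBuilder variant without changing calling code.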
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)