Posted to user@nutch.apache.org by yeshwanth kumar <ye...@gmail.com> on 2015/03/18 06:31:07 UTC

Integrating custom parsers to Nutch Crawl

Hi,

How can I integrate a custom parser into a Nutch crawl?

thanks,
-Yeshwanth

Re: Integrating custom parsers to Nutch Crawl

Posted by Jorge Luis Betancourt González <jl...@uci.cu>.
You just need to write a parser plugin, which means implementing org.apache.nutch.parse.HtmlParseFilter (hint: it has just one method). For working examples, check the plugins shipped with the default distribution (all the parse-* plugins).
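
To give a rough idea of the shape of such a filter, here is a minimal, self-contained sketch. It does not use the real Nutch API: SimpleParseFilter, TitleParseFilter and the "custom.title" key are hypothetical stand-ins so the example compiles on its own. In Nutch 1.x the real filter method works on Nutch types (Content, ParseResult, HTMLMetaTags and a DOM DocumentFragment) rather than on strings and maps, so check the Javadoc of your version for the exact signature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified stand-in for a parse-filter extension point.
// Assumption: this is NOT the real org.apache.nutch.parse.HtmlParseFilter,
// which receives Nutch objects instead of plain strings and maps.
interface SimpleParseFilter {
    // One method: take the fetched page plus the metadata collected so far,
    // and return the (possibly enriched) metadata.
    Map<String, String> filter(String url, String html, Map<String, String> metadata);
}

// Example filter: copies the text of the <title> element into the metadata.
class TitleParseFilter implements SimpleParseFilter {
    // Case-insensitive match; DOTALL lets the title span several lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title[^>]*>(.*?)</title>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    @Override
    public Map<String, String> filter(String url, String html, Map<String, String> metadata) {
        // Copy the input so the caller's map is left untouched.
        Map<String, String> out = new LinkedHashMap<>(metadata);
        Matcher m = TITLE.matcher(html);
        if (m.find()) {
            // "custom.title" is a made-up key used only for this illustration.
            out.put("custom.title", m.group(1).trim());
        }
        return out;
    }
}
```

In a real plugin the class would implement the Nutch interface instead, be declared in the plugin's plugin.xml, and be enabled through the plugin.includes property in conf/nutch-site.xml. If the goal is to handle a new content type rather than to post-process parsed HTML, the org.apache.nutch.parse.Parser extension point is probably the one to look at instead.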

Regards,
