Posted to user@nutch.apache.org by Ayushya Devmurari <pa...@gmail.com> on 2015/03/16 13:21:39 UTC

Modifying crawling to capture required data.

Hi all,

How can I modify or create a custom class that gives me the
flexibility to fetch and store the data I need from each
page that is being crawled?

Regards
Ayushya

Re: Modifying crawling to capture required data.

Posted by Mohammed Omer <be...@gmail.com>.
Sorry, totally meant to link to this thread:
http://mail-archives.apache.org/mod_mbox/nutch-user/201503.mbox/%3C261716599.2073136.1426740148455.JavaMail.zimbra%40uci.cu%3E

That message and mine before it describe writing a Parser plugin, which
should help you along the way. Let us know if you run into issues / want
some feedback.

Sorry about that!

Mo

On Thu, Mar 19, 2015 at 12:07 AM, Mohammed Omer <be...@gmail.com>
wrote:

> Check out
> http://mail-archives.apache.org/mod_mbox/nutch-user/201503.mbox/browser
>
> Thank you,
>
> Mo

Re: Modifying crawling to capture required data.

Posted by Mohammed Omer <be...@gmail.com>.
Check out
http://mail-archives.apache.org/mod_mbox/nutch-user/201503.mbox/browser

Thank you,

Mo
