Posted to user@nutch.apache.org by "abobrobo@gmail.com" <ab...@gmail.com> on 2006/09/03 12:02:54 UTC

How to add regular expression to nutch

Hey.
I want to extract some information from a web site. For example, I want
to get the company name and the address from a company's homepage. I
need to write a regular expression for this, but where in the Nutch
source should I add it?

Re: How to add regular expression to nutch

Posted by Zaheed Haque <za...@gmail.com>.
Hi,

You can start here:

http://wiki.apache.org/nutch/

About writing plugins:
http://wiki.apache.org/nutch/PluginCentral

About the development environment:
http://wiki.apache.org/nutch/RunNutchInEclipse

Cheers



Re: How to add regular expression to nutch

Posted by "abobrobo@gmail.com" <ab...@gmail.com>.
To Zaheed Haque:
Thanks. I am a beginner at plugin development, so I would appreciate it
if you could point me to some guides or resources.
For example: what tools should I use to write the code? How do I compile
it? And how do I configure the XML?
Thanks a lot,
coden

Re: How to add regular expression to nutch

Posted by Zaheed Haque <za...@gmail.com>.
Hi.

You need to write a new plugin. Use the creativecommons plugin under
src/plugin/creativecommons as a template for your plugin.
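
For the regular expression part itself, here is a minimal sketch in plain
Java of the kind of extraction logic that could live inside such a plugin.
It is only an illustration: the class name CompanyInfoExtractor, the method
names, and the patterns are hypothetical, and the patterns assume the page
text carries labels such as "Company Name:" and "Address:". Real pages will
need patterns written for their own layout. The Nutch wiring (plugin.xml
and the extension class) is left out here.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompanyInfoExtractor {

    // Hypothetical patterns: they assume labelled lines such as
    // "Company Name: ..." and "Address: ..." in the extracted page text.
    private static final Pattern COMPANY = Pattern.compile(
        "Company(?:\\s+Name)?\\s*:\\s*([^\\r\\n]+)", Pattern.CASE_INSENSITIVE);
    private static final Pattern ADDRESS = Pattern.compile(
        "Address\\s*:\\s*([^\\r\\n]+)", Pattern.CASE_INSENSITIVE);

    // Returns the trimmed first capture group of the first match, if any.
    static Optional<String> firstGroup(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static Optional<String> extractCompanyName(String text) {
        return firstGroup(COMPANY, text);
    }

    public static Optional<String> extractAddress(String text) {
        return firstGroup(ADDRESS, text);
    }

    public static void main(String[] args) {
        // Made-up sample text standing in for the text of a fetched page.
        String text = "Company Name: Example Widgets Ltd.\n"
                    + "Address: 1 Example Street, Springfield\n";
        System.out.println(extractCompanyName(text).orElse("(not found)"));
        System.out.println(extractAddress(text).orElse("(not found)"));
    }
}
```

Keeping the patterns in a small class like this makes it possible to try
them against sample text before hooking anything into Nutch.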

cheers
Zaheed
