Posted to agent@nutch.apache.org by ad...@interfree.it on 2005/09/14 10:37:24 UTC
does Nutch crawl dynamic pages???
Hi,
I have some questions:
1) Does anyone know the limitations of Nutch?
2) I have a site whose frames are served by servlets. Is it possible to crawl these pages?
We also see that if the frame is an HTML page, the Nutch crawler works, but if the frame is a servlet, the crawler does not fetch it.
Could someone please reply?
Adriano
Re: does Nutch crawl dynamic pages???
Posted by Jack Tang <hi...@gmail.com>.
Commenting out this line also works:
#-[?*!@=]
/Jack
On 9/14/05, mu xiaofeng <he...@gmail.com> wrote:
> Yes,
> edit your crawl-urlfilter.txt.
>
> You should be able to get it to work by changing this:
>
> # skip URLs containing certain characters as probable queries, etc.
> -[?*!@=]
>
> To this:
>
> # skip URLs containing certain characters as probable queries, etc.
> -[*!@]
>
--
Keep Discovering ... ...
http://www.jroller.com/page/jmars
Re: does Nutch crawl dynamic pages???
Posted by mu xiaofeng <he...@gmail.com>.
Yes,
edit your crawl-urlfilter.txt.
You should be able to get it to work by changing this:
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
To this:
# skip URLs containing certain characters as probable queries, etc.
-[*!@]
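
To illustrate why this change matters, here is a minimal Python sketch of how a rule file of this shape could be evaluated. It assumes first-match-wins semantics, with "+" accepting and "-" rejecting a URL and unmatched URLs being rejected. The function names, the example.com URLs, and the trailing "+." catch-all rule are assumptions made for the sketch; it is not Nutch's actual implementation.

```python
import re

def parse_rules(text):
    """Parse filter lines of the form '+regex' or '-regex'; skip blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with '+' or '-': " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules

def accepts(rules, url):
    """Return the verdict of the first rule whose regex is found in the URL."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumption: a URL that matches no rule is rejected

# Original rule from the thread, plus a hypothetical catch-all accept rule.
DEFAULT_RULES = """
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
+.
"""

# Relaxed rule from the thread: '?' and '=' are no longer rejected.
RELAXED_RULES = """
# skip URLs containing certain characters as probable queries, etc.
-[*!@]
+.
"""

static_url = "http://example.com/index.html"         # hypothetical static page
servlet_url = "http://example.com/servlet/Page?id=1"  # hypothetical dynamic page

for name, text in (("default", DEFAULT_RULES), ("relaxed", RELAXED_RULES)):
    rules = parse_rules(text)
    print(name, accepts(rules, static_url), accepts(rules, servlet_url))
```

Under these assumptions the default rule rejects the servlet URL because it contains "?" and "=", while the relaxed rule lets it through; the static page is accepted in both cases.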