Posted to user@nutch.apache.org by Sjaiful Bahri <sb...@rocketmail.com> on 2008/10/29 03:51:31 UTC
Crawl News Site
Hi Folks,
FYI, I am running a crawl algorithm on zipclue.com.
Zipclue is a prototype News Search Engine (NSE) that makes extensive use of the structure present in hypertext. Zipclue NSE is designed to crawl news information on the web effectively and efficiently.
Zipclue stores and serves historical electronic news from around the world, subject to limiting parameters that we, the authors, can configure. In this particular instance, we rank articles by news release time (NRT), which in practice means trying to surface the latest news. The prototype, with a full-text and hyperlink database of electronic news, is available here:
http://zipclue.com
Regards,
Sjaiful Bahri
Re: Crawl News Site
Posted by Alexander Aristov <al...@gmail.com>.
Just my personal impression. Although you have implemented a very different
style of displaying results to distinguish your site from other similar sites,
it is not as useful as the traditional approach. The summary is very short; I
cannot understand what the article is about. It is also very hard to match
links with their corresponding entries.
I would improve the page.
But anyway, your attempt is very good.
Alex
--
Best Regards
Alexander Aristov