Posted to user@nutch.apache.org by Bui Quang Hung <bq...@nishilab.sys.es.osaka-u.ac.jp> on 2006/08/29 08:00:13 UTC

A text-based search engine

Hi,

I am looking for a text-based search engine on the Web.

I have to do some experiments with J. Kleinberg's HITS (Hypertext Induced
Topic Selection) algorithm. These experiments require a text-based search
engine to obtain web pages that contain the query terms. I do not think I
can use current link-based search engines such as Google, Yahoo, or MSN.
Furthermore, the results returned by these link-based search engines are
already so good that I am afraid I would not be able to see the effect of
the HITS algorithm when using them.

If you know of a text-based search engine on the Web, could you please
tell me?

Thank you very much in advance.

Best regards, 
Hung.




-- 
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.405 / Virus Database: 268.11.6/430 - Release Date: 8/28/2006
 


RE: A text-based search engine

Posted by Vishal Shah <vi...@rediff.co.in>.
Hello Hung,

  I don't know of any WWW search engine that does pure text-based ranking.
One way to satisfy your requirements using existing engines is to pick
up a few results from each page of search results (1-10, 11-20, ...
191-200). You also need to filter out the pages that don't contain the
terms, because most search engines also match on the anchor text of links
pointing to a page, in addition to the page's actual content.

This way, you might get a good mix of good and bad results. 
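
The sampling-and-filtering idea above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not a tested crawler: the
function names, the `fetch_text` callable (whatever you use to download a
page's text), and the default of 2 results per page of 10 are all hypothetical
choices, and the term check is a plain case-insensitive substring match.

```python
from typing import Callable, Iterable, List, Sequence


def sample_ranked_results(
    ranked_urls: Sequence[str], page_size: int = 10, per_page: int = 2
) -> List[str]:
    """Take the first `per_page` URLs from each block of `page_size` results.

    `ranked_urls` is the engine's result list in rank order (1-10, 11-20, ...).
    """
    sampled: List[str] = []
    for start in range(0, len(ranked_urls), page_size):
        sampled.extend(ranked_urls[start:start + per_page])
    return sampled


def contains_all_terms(text: str, terms: Iterable[str]) -> bool:
    """True if every query term occurs in the page text.

    Assumption: a case-insensitive substring match is good enough here.
    """
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def build_root_set(
    ranked_urls: Sequence[str],
    fetch_text: Callable[[str], str],
    terms: Sequence[str],
    page_size: int = 10,
    per_page: int = 2,
) -> List[str]:
    """Sample across result pages, then drop pages whose own text lacks the
    query terms (for example, pages matched only through anchor text).

    `fetch_text` is a hypothetical caller-supplied function: URL -> page text.
    """
    candidates = sample_ranked_results(ranked_urls, page_size, per_page)
    return [url for url in candidates if contains_all_terms(fetch_text(url), terms)]


if __name__ == "__main__":
    # Toy in-memory data standing in for real search results and page fetches.
    ranked = [f"u{i}" for i in range(1, 31)]
    pages = {
        "u1": "Notes on the HITS algorithm",
        "u11": "hits, hubs and authorities",
        "u12": "unrelated page",
    }
    print(build_root_set(ranked, lambda u: pages.get(u, ""), ["hits"]))
```

The surviving pages could then serve as the starting set for the HITS
experiments; how many results to take per page is a knob to tune.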

Regards,
-vishal.

-----Original Message-----
From: Bui Quang Hung [mailto:bqhung@nishilab.sys.es.osaka-u.ac.jp] 
Sent: Wednesday, August 30, 2006 12:57 AM
To: nutch-user@lucene.apache.org
Subject: RE: A text-based search engine

Hi,
I am afraid that my question was not clear.
My question is: Do you know of any Web page repository on the Web which
satisfies the following two conditions:
- It contains at least 100 million pages.
- It provides a text-based ranking algorithm, so that we can obtain pages
that contain the query terms.
Thank you very much in advance.
Best regards,


RE: A text-based search engine

Posted by Bui Quang Hung <bq...@nishilab.sys.es.osaka-u.ac.jp>.
Hi,
I am afraid that my question was not clear.
My question is: Do you know of any Web page repository on the Web which
satisfies the following two conditions:
- It contains at least 100 million pages.
- It provides a text-based ranking algorithm, so that we can obtain pages
that contain the query terms.
Thank you very much in advance.
Best regards,