Posted to dev@lucenenet.apache.org by Björn Kremer <bk...@patorg.de> on 2011/03/07 11:12:17 UTC

[Lucene.Net] Google Ranking with Lucene

Hello,

I have just read that Google has optimised its ranking. Google now shows
more relevant results on the first page than before. Is there a way to
take advantage of this ranking algorithm with Lucene?

Thank You
Björn

Re: [Lucene.Net] Google Ranking with Lucene

Posted by Troy Howard <th...@gmail.com>.
Another good way to put this:

Google is an application which crawls and indexes the internet and
provides search functionality to end users. It uses domain-specific
logic to perform its ranking and improve relevance.

Lucene and Lucene.Net are libraries that allow end users to build
full-text search into applications. They have no domain-specific logic
built in, but rather provide the building blocks that can be
used to build any kind of domain-specific search logic.

You certainly could implement something like Google's PageRank
algorithm using Lucene's scoring system, but it would be a custom,
application/domain-specific implementation, not something that would
be appropriate for a library like Lucene. This question might be better
posed to the Apache Nutch project, which uses Lucene and Solr as part
of a Google-like web crawler/indexer.

You can read about Apache Nutch here:

http://nutch.apache.org/index.html
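
To illustrate the kind of custom, application-specific logic meant here,
below is a minimal sketch of the textbook power-iteration form of
PageRank over a small link graph, plus one possible way to blend the
result with a text relevance score. It is not Google's actual ranking
and it makes no Lucene or Lucene.Net API calls. The function names
(pagerank, combined_score), the damping value, and the blending formula
are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration.

    links: dict mapping a page to the list of pages it links to.
    Returns a dict mapping every page to a rank; ranks sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


def combined_score(text_score, link_rank, weight=0.5):
    # Hypothetical blend: scale a text relevance score by a link-based boost.
    return text_score * (1.0 + weight * link_rank)


if __name__ == "__main__":
    graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    ranks = pagerank(graph)
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(page, round(ranks[page], 4))
```

In a real application the link ranks would be computed offline from
crawl data and applied as a per-document boost. How that boost is
stored and combined with the text score is an application decision.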

Thanks,
Troy


On Mon, Mar 7, 2011 at 6:35 PM, Prescott Nasser <ge...@hotmail.com> wrote:
>
> Hello Björn.
>
> In short, no.
>
> What Google does in its search algorithm is kept extremely secret. It also applies to only a very limited subset of the types of data Lucene.Net might store. It uses lots of signals from around the web, such as how many people link to a particular page, to gauge the importance of a particular search result. This becomes irrelevant if you are indexing something such as research documents. People don't link between such documents in the traditional sense.
>
> Further, the recent changes they made focused on downranking "content farms", where content is cheaply put together and lots of advertising is added to generate revenue. Those changes don't really apply to Lucene. Relevance in Lucene depends more on how you decide to index your content, how you deal with strengtheners, etc.
>
> Hope that helps,
> ~Prescott Nasser

RE: [Lucene.Net] Google Ranking with Lucene

Posted by Prescott Nasser <ge...@hotmail.com>.
Hello Björn.
 
In short, no.
 
What Google does in its search algorithm is kept extremely secret. It also applies to only a very limited subset of the types of data Lucene.Net might store. It uses lots of signals from around the web, such as how many people link to a particular page, to gauge the importance of a particular search result. This becomes irrelevant if you are indexing something such as research documents. People don't link between such documents in the traditional sense.
 
Further, the recent changes they made focused on downranking "content farms", where content is cheaply put together and lots of advertising is added to generate revenue. Those changes don't really apply to Lucene. Relevance in Lucene depends more on how you decide to index your content, how you deal with strengtheners, etc.
 
Hope that helps,
~Prescott Nasser




