Posted to dev@nutch.apache.org by Stefan Groschupf <sg...@media-style.com> on 2005/12/16 20:27:32 UTC

"Something is Wrong with Google’s Mathematical Model"

Hi,
I found this link on a news site; maybe some of you will find it interesting.
"An Israeli mathematician, Hillel Tal-Ezer, of the Academic College  
of Tel Aviv in Yaffo has written a paper on the faults of google's  
mathematical algorithms for page ranking"
http://www2.mta.ac.il/~hillel/data_mining/faults_of_PageRank.pdf

Cheers,
Stefan 

TrustRank (was Re: "Something is Wrong with Google’s Mathematical Model")

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Dec 16, 2005, at 4:09 PM, Fredrik Andersson wrote:
> While on the topic, during the "Bourbon update" earlier this year, rumors
> were flying around about the "TrustRank" algorithm, which involved some
> human input on validating credible sources of data on the web. There's a
> paper from Stanford on that, http://www.vldb.org/conf/2004/RS15P3.PDF ,
> which is a fun read if you're an LSA geek.

This brings up an interesting and timely topic: feedback.

As we all know, Yahoo has assimilated two high-profile folksonomy  
systems (flickr and del.icio.us), using real people to lend a hand  
in findability.

I'm curious if folks here are tackling user feedback such as ranking  
and tagging with Nutch (or just Lucene).  Anyone?  Besides Otis, of  
course, who has the wonderful Simpy service - it's a shame Yahoo  
didn't gobble him up, but maybe Google will :)

	Erik


Re: "Something is Wrong with Google’s Mathematical Model"

Posted by Fredrik Andersson <fi...@gmail.com>.
I know (or at least suspect) that Google has a distributed way of computing
singular value decompositions for large matrices (i.e., for the
term-document matrix). I think the same technique for dimension reduction
can be applied to approximate some eigenvalues of sparse matrices (the link
matrix, for instance), if I'm not mistaken - I'm getting kind of rusty on
the LSA mathematics these days. This would eliminate the need for infusing
fake links.

Anywho, the PageRank algorithm displayed in the original Google paper was
said to work on their dataset, which wasn't very big at the time. I'm sure
that they have modified the algorithm a lot since it was first published.
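
For anyone who wants the eigenvector framing made concrete, here is a minimal
sketch of the published PageRank iteration as plain power iteration in Python.
It is not Google's production code and not the method from the Tal-Ezer paper.
The function name pagerank, the dict-of-lists graph format, the 0.85 damping
value and the uniform handling of dangling pages are all assumptions chosen
for illustration. The (1 - damping) / n term plays the role of the "fake
links" mentioned above: it acts like a weak link from every page to every
other page, which guarantees the iteration converges.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank by power iteration.

    links: dict mapping a page name to the list of pages it links to
           (hypothetical input format, chosen for this sketch).
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from the uniform vector

    for _ in range(max_iter):
        # Rank held by pages without outlinks is spread over all pages
        # (an assumption; other treatments of dangling pages exist).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes its damped rank evenly along its outlinks.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share

        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:  # stop once the vector has stopped changing
            break
    return rank


# Tiny made-up graph: "c" collects links from "a", "b" and "d".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The ranks always sum to 1, and a page with no inbound links (like "d" here)
ends up with exactly the teleport share (1 - damping) / n.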

While on the topic, during the "Bourbon update" earlier this year, rumors
were flying around about the "TrustRank" algorithm, which involved some
human input on validating credible sources of data on the web. There's a
paper from Stanford on that, http://www.vldb.org/conf/2004/RS15P3.PDF ,
which is a fun read if you're an LSA geek.

On 12/16/05, Stefan Groschupf <sg@media-style.com> wrote:
>
> Hi,
> I found this link on a news site; maybe some of you will find it interesting.
> "An Israeli mathematician, Hillel Tal-Ezer, of the Academic College
> of Tel Aviv in Yaffo has written a paper on the faults of google's
> mathematical algorithms for page ranking"
> http://www2.mta.ac.il/~hillel/data_mining/faults_of_PageRank.pdf
>
> Cheers,
> Stefan
>

RE: "Something is Wrong with Google's Mathematical Model"

Posted by Paul Sutter <ps...@implicitlabs.com>.
The paper claims that he's developed a better algorithm. 

Has he published that yet?

Paul Sutter

-----Original Message-----
From: Stefan Groschupf [mailto:sg@media-style.com] 
Sent: Friday, December 16, 2005 11:28 AM
To: nutch-dev@lucene.apache.org
Subject: "Something is Wrong with Google's Mathematical Model"

Hi,
I found this link on a news site; maybe some of you will find it interesting.
"An Israeli mathematician, Hillel Tal-Ezer, of the Academic College  
of Tel Aviv in Yaffo has written a paper on the faults of google's  
mathematical algorithms for page ranking"
http://www2.mta.ac.il/~hillel/data_mining/faults_of_PageRank.pdf

Cheers,
Stefan