Posted to user@nutch.apache.org by purpureleaf <pu...@gmail.com> on 2007/08/15 03:27:08 UTC

"omitted some entries very similar.." feature like google

Hi, when we search in Google, we sometimes get
"In order to show you the most relevant results, we have omitted some
entries very similar to the 3 already displayed."
That means Google thinks the other results are not relevant. Most of the
time the match is just a word in the site's navigation bar. Can Nutch do
this?

For example, if I search for
site:http://species.wikimedia.org donations
Google gives 3 results with the above note. "donations" appears in the
site's sidebar, so the other matches are not meaningful.
But if I run the same search in Nutch's demo, Nutch gives all results. I
have modified the JSP to show scores, and I found that the first result's
score is 0.1645642 and the second is 4.0708118E-4.
The other results are even lower. So I think Nutch should have a way to
do the same as Google. But I don't think I can get the scores from
NutchBean.
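
A minimal sketch of the kind of score-based cut-off described above, assuming
the scores are already available as an array sorted in descending order. The
names ScoreCutoff and cutoffIndex, the minRatio threshold, and the third score
in the example are hypothetical; none of this is part of Nutch's API.

```java
public class ScoreCutoff {
    /**
     * Returns how many leading results to keep. Scanning stops before the
     * first hit whose score is less than minRatio times the score of the
     * hit just before it. Assumes scores are sorted in descending order.
     */
    public static int cutoffIndex(float[] scores, float minRatio) {
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] < scores[i - 1] * minRatio) {
                return i; // sharp drop found: keep only hits 0..i-1
            }
        }
        return scores.length; // no sharp drop: keep everything
    }

    public static void main(String[] args) {
        // First two scores are from the search above; the third is made up.
        float[] scores = {0.1645642f, 4.0708118E-4f, 3.9e-4f};
        System.out.println(cutoffIndex(scores, 0.1f)); // prints 1
    }
}
```

A rule like this only helps when there is a sharp drop between neighbouring
scores. When the scores are close together it keeps every result.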

This is a simple case. In other cases the scores are similar (Nutch's
scores; I don't know Google's), but Google still seems to know where to cut
off. Is there a solution for this?

Regards
Pan
