Posted to user@nutch.apache.org by Marek Bachmann <m....@uni-kassel.de> on 2011/10/28 15:55:59 UTC

Differences between LinkDB and Webgraph's inlink database?

Hello people,

Can someone explain the differences between the LinkDB and the
WebGraph's inlink database?

As I understand it, the LinkDB holds, for each URL, the pages that link
to it together with their anchor texts.

But that is nearly the same as what this page says:
http://wiki.apache.org/nutch/NewScoring
"The inlink database is a listing of url and all of its inlinks."

Why do we have both databases?

Thanks


Re: Differences between LinkDB and Webgraph's inlink database?

Posted by Markus Jelsma <ma...@openindex.io>.
You understand correctly. We have both because there is no patch for
NUTCH-1181 yet; you're more than welcome to provide one!

https://issues.apache.org/jira/browse/NUTCH-1181
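
To make the overlap concrete, here is a minimal sketch in Python. It is not Nutch code and does not reflect Nutch's on-disk formats. The sample links, the function names, and the simplification that an "inlink" is just a source URL are all assumptions made for illustration. It inverts a list of outlinks once with anchor texts (LinkDB-like) and once without (inlink-listing-like), then checks that the second view can be derived from the first.

```python
from collections import defaultdict

# Hypothetical outlink records: (source_url, target_url, anchor_text).
OUTLINKS = [
    ("http://a.example/", "http://c.example/", "see C"),
    ("http://b.example/", "http://c.example/", "C home"),
    ("http://a.example/", "http://b.example/", "about B"),
]


def build_linkdb_like(outlinks):
    """Map each target URL to a list of (source_url, anchor_text) pairs."""
    inverted = defaultdict(list)
    for source, target, anchor in outlinks:
        inverted[target].append((source, anchor))
    return dict(inverted)


def build_inlink_listing(outlinks):
    """Map each target URL to the sorted list of URLs that link to it."""
    inverted = defaultdict(set)
    for source, target, _anchor in outlinks:
        inverted[target].add(source)
    return {target: sorted(sources) for target, sources in inverted.items()}


linkdb_like = build_linkdb_like(OUTLINKS)
inlink_listing = build_inlink_listing(OUTLINKS)

# Dropping the anchor texts from the first view yields the second view.
derived = {
    target: sorted({source for source, _anchor in pairs})
    for target, pairs in linkdb_like.items()
}
assert derived == inlink_listing
```

Under these assumptions the inlink listing carries no information that the LinkDB-like structure lacks, which is the redundancy the question points at.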


-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350