Posted to dev@nutch.apache.org by caezar <ca...@gmail.com> on 2009/04/29 14:13:27 UTC

What is Inlinks

Hi,

I'm curious: what is the inlinks parameter received by the IndexingFilter.filter
method? I understand that this is a basic question, but a few hours of reading
the wiki and googling didn't give me the answer.

Thanks
-- 
View this message in context: http://www.nabble.com/What-is-Inlinks-tp23295828p23295828.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.


Re: What is Inlinks

Posted by Dennis Kubes <ku...@apache.org>.
Inlinks are the inbound links to a given page.  The anchor text is the 
text used to create the inbound link.  For example say we have two pages 
A and B:

A -> <a href="http://inbound/link">Anchor Text</a> -> B

Here we have a link from A to B using "Anchor Text" as the inbound link 
(anchor) text and "http://inbound/link" as the inbound link.  Inlinks is 
an aggregation of all inbound links to a given page.  So if pages C, D, 
and E also point to B, Inlinks would have all the links from A, C, D, 
and E to B.

Inlinks are parsed out of the HTML during the fetching/parsing process. 
They are then pulled into other jobs such as the WebGraph tools and 
the indexing process.
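
As a minimal sketch of the aggregation described above: the links found on
each source page are regrouped by their target, so every page ends up with
the list of links pointing at it. This is plain Java for illustration only,
not Nutch's own code; the class names InlinkSketch, Link and Inlink and the
example URLs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not Nutch classes.
class InlinkSketch {

    // A link as seen from the source page: source URL, target URL, anchor text.
    static final class Link {
        final String fromUrl;
        final String toUrl;
        final String anchor;

        Link(String fromUrl, String toUrl, String anchor) {
            this.fromUrl = fromUrl;
            this.toUrl = toUrl;
            this.anchor = anchor;
        }
    }

    // The same link as seen from the target page: who links here, and with what text.
    static final class Inlink {
        final String fromUrl;
        final String anchor;

        Inlink(String fromUrl, String anchor) {
            this.fromUrl = fromUrl;
            this.anchor = anchor;
        }
    }

    // Group links by target URL, so each page maps to the list of its inbound links.
    static Map<String, List<Inlink>> invert(List<Link> links) {
        Map<String, List<Inlink>> inlinks = new LinkedHashMap<>();
        for (Link link : links) {
            inlinks.computeIfAbsent(link.toUrl, k -> new ArrayList<>())
                   .add(new Inlink(link.fromUrl, link.anchor));
        }
        return inlinks;
    }

    public static void main(String[] args) {
        List<Link> links = Arrays.asList(
            new Link("http://a.example/", "http://b.example/", "Anchor Text"),
            new Link("http://c.example/", "http://b.example/", "more about B"),
            new Link("http://a.example/", "http://c.example/", "see C"));

        // Print each target page with the sources and anchor texts of its inlinks.
        for (Map.Entry<String, List<Inlink>> entry : invert(links).entrySet()) {
            for (Inlink in : entry.getValue()) {
                System.out.println(entry.getKey() + " <- " + in.fromUrl + " [" + in.anchor + "]");
            }
        }
    }
}
```

A page that nothing links to simply has no entry in the resulting map.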

Dennis

caezar wrote:
> That much I understand. But what are these anchors? How is this (inlinks) object
> filled by the system? I suppose it holds some kind of inbound links to
> the page being indexed, found in the current database, am I right?
> 
> Marko Bauhardt-3 wrote:
>> The inlinks parameter has a method to get the anchors, and the
>> AnchorIndexingFilter indexes these anchor texts.
>>
> 

Re: What is Inlinks

Posted by caezar <ca...@gmail.com>.
That much I understand. But what are these anchors? How is this (inlinks) object
filled by the system? I suppose it holds some kind of inbound links to
the page being indexed, found in the current database, am I right?

Marko Bauhardt-3 wrote:
> 
> The inlinks parameter has a method to get the anchors, and the
> AnchorIndexingFilter indexes these anchor texts.
> 



Re: What is Inlinks

Posted by Marko Bauhardt <mb...@101tec.com>.
Hi
The inlinks parameter has a method to get the anchors, and the
AnchorIndexingFilter indexes these anchor texts.
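
To make that concrete, here is a rough sketch of what "getting the anchors"
from a set of inlinks could look like. It is plain Java and does not use the
Nutch classes; AnchorSketch, Inlink and anchors() are hypothetical names, and
dropping empty and repeated anchor texts is an assumption made for this
example, not a statement about what AnchorIndexingFilter does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the Nutch Inlinks or AnchorIndexingFilter classes.
class AnchorSketch {

    // One inbound link: the page it comes from and the anchor text it uses.
    static final class Inlink {
        final String fromUrl;
        final String anchor;

        Inlink(String fromUrl, String anchor) {
            this.fromUrl = fromUrl;
            this.anchor = anchor;
        }
    }

    // Collect the anchor texts of all inlinks, skipping empty ones and
    // keeping each distinct text once, in order of first appearance.
    static List<String> anchors(List<Inlink> inlinks) {
        Set<String> seen = new LinkedHashSet<>();
        for (Inlink in : inlinks) {
            String text = in.anchor == null ? "" : in.anchor.trim();
            if (!text.isEmpty()) {
                seen.add(text);
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Inlink> inlinks = Arrays.asList(
            new Inlink("http://a.example/", "Anchor Text"),
            new Inlink("http://c.example/", "Anchor Text"),
            new Inlink("http://d.example/", ""),
            new Inlink("http://e.example/", "more about B"));

        // An indexing step could store each of these values in a field of the
        // document being indexed.
        for (String text : anchors(inlinks)) {
            System.out.println("anchor: " + text);
        }
    }
}
```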

marko

