Posted to user@nutch.apache.org by Zhen Zhen <zh...@cs.dal.ca> on 2006/09/18 19:13:38 UTC

Empty "incoming anchor text"

Hi all, sorry for having another question so soon :P.

After deploying Nutch, when I clicked on the "anchors" link under
each URL, the page came up with an empty "incoming anchor text". Is this
normal?

thanks

Zhen

Re: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Posted by pd...@yahoo.com.
The Nutch 0.8 tutorial has a section on this and it is pretty straightforward.  Make sure to remember to change the nutch-site.xml file and fill in your username.

I have had mixed results with Cygwin and Nutch (so make backups etc.).

Cheers



-----Original Message-----
From: Jp Mutch <jp...@yahoo.com>
Date: Mon, 18 Sep 2006 10:48:47 
To: nutch-dev@lucene.apache.org
Subject: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Hello, 
   
  I'm new to Nutch.
   
  I selected Nutch 9.12 dev because I need to use a 
Java 1.5 local development environment. 
I am able to build Nutch 9.12 successfully on Java 1.5, with very
little effort. Great packaging of distribution.
   
  [Aside: Only one problem: Kept giving typical 
compiler warnings due to template class mismatches,
in core as well as many plugins.].
   
  My questions are regarding crawling and testing/searching:
Due to my local requirements, initially I just need to run all of nutch
on a single machine in its local filesystem, without really needing
Hadoop or DFS [I don't mind if they are running "under the hood"].
  Later on if the initial study is successful, I will of course
  switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
   
  (Q1) What tutorial do I need to follow to get Nutch 9.12 
to crawl and index on a single machine?
(a) The Nutch 0.8 tutorial
http://lucene.apache.org/nutch/tutorial8.html ?
OR
(b) The new Hadoop tutorial
http://wiki.apache.org/nutch/NutchHadoopTutorial ?
   
  (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
  machine with Cygwin+Tomcat 5.5?
   
  Appreciate any help.
  Thanks a lot!
   
  -jp


 		

Re: Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Posted by Richard Braman <rb...@taxcodesoftware.org>.
Jp Mutch wrote:
>    
>   My questions are regarding crawling and testing/searching:
> Due to my local requirements, initially I just need to run all of nutch
> on a single machine in its local filesystem, without really needing
> Hadoop or DFS [I don't mind if they are running "under the hood"].
>   Later on if the initial study is successful, I will of course
>   switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
>    
>   
Hadoop runs no matter what. It's no big deal unless you hit a Hadoop
bug; several have come along, but they have been fixed.
Hadoop needs a tmp directory to execute jobs in the distributed
fashion. I usually point mine to C:\tmp. Hadoop will also create some
directories related to its filesystem. The main directories you will
work with will be your crawl directory and its subfolders: crawldb, linkdb,
indexes, and segments.
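
[Illustrative sketch, not part of the original reply: one way to check
that a finished crawl directory has the subfolders listed above. The
function name and the default directory name "crawl" are hypothetical;
only the four subfolder names come from this message, and a real crawl
may contain other directories as well.]

```python
from pathlib import Path

# Subfolder names as listed in the reply above; assumed, not exhaustive.
EXPECTED_SUBDIRS = ("crawldb", "linkdb", "indexes", "segments")


def missing_crawl_subdirs(crawl_dir="crawl"):
    """Return the expected subfolders that are absent from crawl_dir."""
    root = Path(crawl_dir)
    # A name counts as present only if it exists and is a directory.
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

Calling missing_crawl_subdirs("crawl") after a crawl returns an empty
list when all four subfolders exist, and otherwise lists whichever ones
were not created.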
>   (Q1) What tutorial do I need to follow to get Nutch 9.12 
> to crawl and index on a single machine?
> (a) The Nutch 0.8 tutorial
> http://lucene.apache.org/nutch/tutorial8.html ?
> OR
> (b) The new Hadoop tutorial
> http://wiki.apache.org/nutch/NutchHadoopTutorial ?
>    
>   
The 0.8 tutorial would work; there are some additional notes on Windows on the wiki.
>   (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
>   machine with Cygwin+Tomcat 5.5?
>   
Yes
>    
>   Appreciate any help.
>   Thanks a lot!
>    
>   -jp
>
>
>  		
>   



Which tutorial to use for getting Nutch 9.12 up and running on a single machine?

Posted by Jp Mutch <jp...@yahoo.com>.
Hello, 
   
  I'm new to Nutch.
   
  I selected Nutch 9.12 dev because I need to use a 
Java 1.5 local development environment. 
I am able to build Nutch 9.12 successfully on Java 1.5, with very
little effort. Great packaging of distribution.
   
  [Aside: Only one problem: Kept giving typical 
compiler warnings due to template class mismatches,
in core as well as many plugins.].
   
  My questions are regarding crawling and testing/searching:
Due to my local requirements, initially I just need to run all of nutch
on a single machine in its local filesystem, without really needing
Hadoop or DFS [I don't mind if they are running "under the hood"].
  Later on if the initial study is successful, I will of course
  switch to the full blown Nutch with Hadoop+DFS+Distributed Search.
   
  (Q1) What tutorial do I need to follow to get Nutch 9.12 
to crawl and index on a single machine?
(a) The Nutch 0.8 tutorial
http://lucene.apache.org/nutch/tutorial8.html ?
OR
(b) The new Hadoop tutorial
http://wiki.apache.org/nutch/NutchHadoopTutorial ?
   
  (Q2) Can I run [crawl+search] Nutch 9.12 or later on a single Windows XP
  machine with Cygwin+Tomcat 5.5?
   
  Appreciate any help.
  Thanks a lot!
   
  -jp


 		