Posted to user@nutch.apache.org by John Mendenhall <jo...@surfutopia.net> on 2008/01/27 01:32:26 UTC

nutch 0.9, fetch2, fetcher.parse conf value not used

I tried to run fetch without parsing by setting the
fetcher.parse property to false.  When I then ran parse,
it reported that the segment had already been parsed
by the fetch process.
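
For reference, a sketch of the override described above, assuming
it is placed in conf/nutch-site.xml (the message does not say which
file was edited, and the description text here is only illustrative):

```xml
<property>
 <name>fetcher.parse</name>
 <value>false</value>
 <description>Illustrative: when false, the fetcher is expected to
 store fetched content without parsing it, so that parsing can be
 run later as a separate step.</description>
</property>
```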

It appears NUTCH-337 only fixed the unused
fetcher.parse configuration value in the Fetcher.java
class.  I have tried using fetch2 (Fetcher2.java) and
it appears the fetcher.parse configuration value is
not being used.

I will try the fetch class (Fetcher.java) with the same
setup and see whether it works as it should and does
not parse after fetching.

Is the Fetcher2 class not recommended?

Or, is it possible I have some other problem?

Thanks in advance for any assistance you can provide.

JohnM

-- 
john mendenhall
john@surfutopia.net
surf utopia
internet services

RE: JDK 1.5 & Tomcat 5.5

Posted by Christopher Bader <cb...@kratylos.com>.
I'm using Nutch 0.9 (the latest stable release) with JDK 1.5 and Tomcat
6.0.  I had a problem with JDK 1.6.

CB


-----Original Message-----
From: Duan, Nick [mailto:NDuan@mcdonaldbradley.com] 
Sent: Wednesday, January 30, 2008 4:50 PM
To: nutch-user@lucene.apache.org
Subject: JDK 1.5 & Tomcat 5.5

Does the latest Nutch work with JDK 1.5 or 1.6, and Tomcat 5.5 or 6.0?

Thanks!

Nick


JDK 1.5 & Tomcat 5.5

Posted by "Duan, Nick" <ND...@mcdonaldbradley.com>.
Does the latest Nutch work with JDK 1.5 or 1.6, and Tomcat 5.5 or 6.0?

Thanks!

Nick

Re: running out of space in /tmp

Posted by Susam Pal <su...@gmail.com>.
In the file 'conf/hadoop-site.xml', please add this:

<property>
 <name>hadoop.tmp.dir</name>
 <value>/path/to/your/new/tmp/directory</value>
 <description>Base for Nutch Temporary Directories</description>
</property>

Regards,
Susam Pal

On Jan 31, 2008 9:12 PM, Christopher Bader <cb...@kratylos.com> wrote:
> All,
>
> Nutch is crashing on a crawl when it runs out of disk space in /tmp.
>
> There is lots of space in other partitions.  Aside from re-installing Linux,
> is there a way of getting Nutch and/or the Java VM to use a different
> directory for temporary storage?
>
> CB
>

running out of space in /tmp

Posted by Christopher Bader <cb...@kratylos.com>.
All,

Nutch is crashing on a crawl when it runs out of disk space in /tmp.

There is lots of space in other partitions.  Aside from re-installing Linux,
is there a way of getting Nutch and/or the Java VM to use a different
directory for temporary storage?

CB




Re: nutch 0.9, fetch2, fetcher.parse conf value not used

Posted by John Mendenhall <jo...@surfutopia.net>.
> I tried to run fetch without parsing by setting the
> fetcher.parse property to false.  When I ran parse,
> it said the segment had already been parsed, by the
> fetch process.
> 
> It appears NUTCH-337 only fixed the unused
> fetcher.parse configuration value in the Fetcher.java
> class.  I have tried using fetch2 (Fetcher2.java) and
> it appears the fetcher.parse configuration value is
> not being used.
> 
> I will try with my same setup to use the fetch class
> and see if this works as it should and does not parse
> after fetching.
> 
> Is the Fetcher2 class not recommended?
> Or, is it possible I have some other problem?
> 
> Thanks in advance for any assistance you can provide.

I changed my script to call `nutch fetch` instead
of `nutch fetch2`, using Fetcher.java rather than
Fetcher2.java.  Now the fetcher.parse configuration
value is being used.

I recommend we modify Fetcher2.java to use this
value instead of requiring it to be on the command
line.

JohnM

-- 
john mendenhall
john@surfutopia.net
surf utopia
internet services