Posted to user@nutch.apache.org by Ba...@gmx.de on 2011/06/21 17:18:07 UTC

hardware config / problems

Hi everyone,

I'm new to Nutch and am having trouble getting a well-working Nutch cluster set up (Nutch 1.2).

My setup:
1 master (namenode, jobtracker, secondarynamenode)
2 nodes (datanode, tasktracker)

All machines are virtual machines with 500 MB of RAM each.

Nutch config:
- mapred.map.tasks 2
- mapred.reduce.tasks 2
- mapred.child.java.opts -Xmx256mb 
- fetcher.threads.fetch 20
- fetcher.server.delay 1.0
- fetcher.threads.per.host 3 
- replication 2
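
For a rough sense of the memory these settings can ask for on one worker VM, here is a minimal back-of-the-envelope sketch in Python. It is only an estimate under stated assumptions: the 256 MB per task comes from the mapred.child.java.opts value above and the 500 MB from the VM size, while the slot counts (2 map and 2 reduce per tasktracker) and the 1000 MB maximum heap per daemon are assumed stock Hadoop defaults that I have not verified on these machines. The function name worker_memory_mb is made up for this illustration. The result is a worst-case ceiling on heap sizes, not measured usage.

```python
# Worst-case heap estimate for one worker VM (all figures in MB).
# Assumptions (not taken from the cluster itself): 2 map slots and
# 2 reduce slots per tasktracker, and a 1000 MB maximum heap for each
# of the two daemons (datanode and tasktracker).

def worker_memory_mb(child_heap_mb, map_slots, reduce_slots,
                     daemon_heap_mb, daemons=2):
    """Sum the maximum heaps of the daemons and of all task JVMs
    that could run at the same time on one worker."""
    task_heap = child_heap_mb * (map_slots + reduce_slots)
    return daemon_heap_mb * daemons + task_heap


VM_RAM_MB = 500        # RAM per virtual machine, from the setup above
CHILD_HEAP_MB = 256    # per-task heap, from mapred.child.java.opts

estimate = worker_memory_mb(CHILD_HEAP_MB, map_slots=2, reduce_slots=2,
                            daemon_heap_mb=1000)
print("worst-case heap (MB):", estimate)
print("exceeds VM RAM:", estimate > VM_RAM_MB)
```

Even the task JVMs alone (4 x 256 MB under these assumptions) would add up to more than the 500 MB available per VM, so the numbers are worth rechecking against the actual slot and heap settings.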

If I increase the number of map and reduce tasks, the crawl fails because the tasktracker kills itself (sometimes the datanode does too), so I changed it back to 2.

But my questions are more general: does my setup/config look OK? And are there any best-practice hardware requirements for running a namenode or datanodes?

Thanks for your help. I appreciate all your answers.

bart

