Posted to user@nutch.apache.org by Luca Rondanini <lu...@translated.net> on 2007/07/20 15:33:14 UTC
Fetching problems: Nutch 0.9 Hung Threads
Hi all,
First of all... I've read all the posts about this problem in the mailing
list!! :)
I'm trying to index more than 200k documents, reading them through an
NFS-mounted partition. Everything seems fine until we reach 40k-50k
documents... then the fetcher fails with the error "Hung Threads"!!
These are the configurations that I've tried:

1) topN=20,000
fetcher.threads=10
ulimit -n=1024
MergeFactor=20
file.limit=1M
----> Hung Threads

2) topN=5,000
fetcher.threads=10
ulimit -n=1024
MergeFactor=20
file.limit=1M
----> Hung Threads

3) topN=5,000
fetcher.threads=5
ulimit -n=1024
MergeFactor=20
file.limit=1M
----> Too many open files

4) topN=5,000
fetcher.threads=5
ulimit -n=4096
MergeFactor=10
file.limit=1M
----> Hung Threads
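For anyone trying to reproduce these runs, here is a rough sketch of how the settings above map onto a Nutch 0.9 fetch cycle. The crawldb/segment paths are placeholders, and file.limit/MergeFactor would be set in conf/nutch-site.xml rather than on the command line:

```shell
# Raise the per-process open-file limit in the shell that launches
# Hadoop/Nutch (must not exceed the hard limit set by the admin).
ulimit -n 4096

# Generate a fetch list capped at topN URLs; "crawl/crawldb" and
# "crawl/segments" are placeholder paths for this example.
bin/nutch generate crawl/crawldb crawl/segments -topN 5000

# Fetch the newest segment with a reduced thread count.
SEGMENT=crawl/segments/$(ls crawl/segments | tail -1)
bin/nutch fetch "$SEGMENT" -threads 5
```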
Can anyone please give me a clue as to what is going on?
Thanks,
Luca
Re: Fetching problems: Nutch 0.9 Hung Threads
Posted by Luca Rondanini <lu...@translated.net>.
Hi,
Forgot to say that the problem doesn't occur if I crawl the same files on
the local file system.
Thanks!
Luca
Re: Fetching problems: Nutch 0.9 Hung Threads
Posted by Luca Rondanini <lu...@translated.net>.
After many tries... the problem seems solved!
I've changed the hadoop-site.xml file by adding these lines:
<property>
  <name>mapred.speculative.execution</name>
  <value>false</value>
</property>
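For completeness, that property belongs inside the top-level <configuration> element of conf/hadoop-site.xml. A minimal sketch of the resulting file (the description text is mine, not from the original post; one plausible reason the setting helps is that speculative duplicate task attempts open the same NFS files concurrently):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.speculative.execution</name>
    <value>false</value>
    <description>Turn off speculative execution, so each map/reduce
    task runs as a single attempt only.</description>
  </property>
</configuration>
```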
I hope this will help someone else!!
Thanks
Luca Rondanini
Research and Development
luca@translated.net
Tel: +39 06 91 62 00 55
Fax: +39 06 233 200 102
http://www.translated.net