Posted to dev@nutch.apache.org by "Jeremy Huylebroeck (JIRA)" <ji...@apache.org> on 2006/08/02 02:06:13 UTC
[jira] Created: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
Fetcher ignores the fetcher.parse value configured in config file
-----------------------------------------------------------------
Key: NUTCH-337
URL: http://issues.apache.org/jira/browse/NUTCH-337
Project: Nutch
Issue Type: Bug
Components: fetcher
Affects Versions: 0.8, 0.9
Reporter: Jeremy Huylebroeck
Priority: Trivial
When Fetcher is invoked from the command line with the noParsing parameter, everything works fine.
When noParsing is not given, the value from nutch-site.xml (or nutch-default.xml) should be used, but "true" is always passed to the call to fetch.
The value from the configuration should be used instead.
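The expected behaviour can be illustrated with a minimal sketch. This is not the attached patch and not the actual Fetcher code: the class ParseFlagSketch and the method resolveParsing are hypothetical, a java.util.Properties object stands in for the Nutch configuration, the flag is assumed to be spelled -noParsing, and the fallback of "true" when fetcher.parse is absent is likewise an assumption.

```java
import java.util.Properties;

// Hypothetical stand-in for the Fetcher's command-line handling (not the real patch).
class ParseFlagSketch {

    /**
     * Decides whether fetched content should be parsed.
     * conf plays the role of nutch-site.xml / nutch-default.xml.
     */
    static boolean resolveParsing(String[] args, Properties conf) {
        // Start from the configured value instead of a hard-coded "true".
        // Assumption: fall back to true only when the property is missing.
        boolean parsing = Boolean.parseBoolean(conf.getProperty("fetcher.parse", "true"));
        for (String arg : args) {
            if ("-noParsing".equals(arg)) {
                parsing = false; // an explicit command-line flag overrides the config
            }
        }
        return parsing;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fetcher.parse", "false");
        // No flag given: the configured value is used.
        System.out.println(resolveParsing(new String[] {"segment"}, conf));
        // Flag given: parsing is switched off regardless of the config.
        System.out.println(resolveParsing(new String[] {"segment", "-noParsing"}, new Properties()));
    }
}
```

The point of the sketch is only the order of precedence: the configuration supplies the default, and the command-line flag overrides it.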
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
Posted by "Stefan Groschupf (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]
Stefan Groschupf updated NUTCH-337:
-----------------------------------
Attachment: respectFetcherParsePropertyV1.patch
Hi Jeremy, thanks for catching this. Attached a fix. Should be easy for a contributor to commit this to trunk....
[jira] Updated: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
Posted by "Stefan Groschupf (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]
Stefan Groschupf updated NUTCH-337:
-----------------------------------
Priority: Major (was: Trivial)
[jira] Closed: (NUTCH-337) Fetcher ignores the fetcher.parse value configured in config file
Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
[ http://issues.apache.org/jira/browse/NUTCH-337?page=all ]
Andrzej Bialecki closed NUTCH-337.
-----------------------------------
Fix Version/s: 0.8.1, 0.9.0
Resolution: Fixed
Patch applied to branch-0.8 and trunk. Thanks!
RE: nutch
Posted by an...@orbita1.ru.
My settings:
....
<property>
<name>mapred.local.dir</name>
<value>/hadoop/mapred/local</value>
<description>The local directory where MapReduce stores intermediate
data files. May be a comma-separated list of
directories on different devices in order to spread disk i/o.
</description>
</property>
<property>
<name>mapred.system.dir</name>
<value>/hadoop/mapred/system</value>
<description>The shared directory where MapReduce stores control files.
</description>
</property>
....
The device mounted on "/" has about 115G of free space:
[root@xxxxx /]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 133G 13G 113G 11% /
Does anybody have other ideas?
Re: nutch
Posted by Sami Siren <ss...@gmail.com>.
Most probably you have run out of space in the tmp (local) filesystem.
Use properties like
<property>
<name>mapred.system.dir</name>
<value><!-- path to a filesystem with a lot of free space --></value>
</property>
<property>
<name>mapred.local.dir</name>
<value><!-- path to a filesystem with a lot of free space --></value>
</property>
in hadoop-site.xml to get around this problem.
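One way to check this suggestion on each tasktracker node is to look at the usable space of every directory named in mapred.local.dir. The sketch below is only an illustration under stated assumptions: the class LocalDirSpaceSketch and the method usableBytes are hypothetical names, the comma-separated value is passed in by hand rather than read from hadoop-site.xml, and only the standard java.io.File.getUsableSpace() call is used.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports usable space for each directory in a
// comma-separated list such as the value of mapred.local.dir.
class LocalDirSpaceSketch {

    static Map<String, Long> usableBytes(String commaSeparatedDirs) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (String dir : commaSeparatedDirs.split(",")) {
            String path = dir.trim();
            if (path.isEmpty()) {
                continue; // skip empty entries such as "a,,b"
            }
            // getUsableSpace() reports 0 when the path does not name a partition.
            result.put(path, new File(path).getUsableSpace());
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass the mapred.local.dir value as the first argument;
        // the JVM temp directory is used here only as a placeholder default.
        String dirs = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        for (Map.Entry<String, Long> e : usableBytes(dirs).entrySet()) {
            System.out.printf("%s: %d MB usable%n", e.getKey(), e.getValue() / (1024 * 1024));
        }
    }
}
```

Running it with the configured value, for example /hadoop/mapred/local, on each of the servers shows how much space is left on the filesystems the tasks actually write to, which can differ from node to node.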
RE: nutch
Posted by an...@orbita1.ru.
I forgot... ;-) One more question:
Is this a problem with Nutch or with Hadoop?
nutch
Posted by an...@orbita1.ru.
I use Nutch 0.8 (mapred), running on 3 servers.
When Nutch tries to index a segment, I get this error on the tasktracker:
060727 215111 task_0025_r_000000_1 SEVERE FSError from child
060727 215111 task_0025_r_000000_1 org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:152)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.fs.FSDataOutputStream$Summer.write(FSDataOutputStream.java:69)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:98)
060727 215111 task_0025_r_000000_1 at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
060727 215111 task_0025_r_000000_1 at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
060727 215111 task_0025_r_000000_1 at java.io.DataOutputStream.write(DataOutputStream.java:90)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:192)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:873)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.io.SequenceFile$Sorter$MergePass.run(SequenceFile.java:760)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:696)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:522)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:316)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:755)
060727 215111 task_0025_r_000000_1 Caused by: java.io.IOException: No space left on device
060727 215111 task_0025_r_000000_1 at java.io.FileOutputStream.writeBytes(Native Method)
060727 215111 task_0025_r_000000_1 at java.io.FileOutputStream.write(FileOutputStream.java:260)
060727 215111 task_0025_r_000000_1 at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSystem.java:150)
060727 215111 task_0025_r_000000_1 ... 12 more
But the server running the tasktracker has 115G of free space on its disk. I am trying to get the segment from DFS, and the segment occupies 2.4G on disk. Why do I get these errors?
Can anybody help me solve this problem?