Posted to dev@nutch.apache.org by Rod Taylor <rb...@sitesell.com> on 2005/11/03 21:32:08 UTC
mapred bug -- bad part calculation?
Sources are from October 31st. The JVM is Sun Standard Edition 1.5.0_02-b09
for amd64.
Every segment that I fetch seems to be missing a part when stored on the
filesystem. The stranger thing is it is always the same part (very
reproducible).
If I have mapred.reduce.tasks set to 20, the hole is at part 13. That
is, the part-00013 directory is empty while the remainder (0 through 12,
14 through 19) all have data.
If I have mapred.reduce.tasks set to 19, the hole is at part 11.
content/part-00011 is empty.
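One way to read the moving hole, assuming reduce output parts are assigned by hashing a key modulo the number of reduce tasks (the thread does not confirm which partitioner is in use): the same key lands in a different part when mapred.reduce.tasks changes. A minimal Java sketch of that idea follows; the class name, `partFor`, and the hash value 353 are hypothetical and chosen only for illustration.

```java
// Illustrative only: a generic hash partitioner, not code taken from Nutch.
class PartSketch {
    /** Maps a key's hash code to a part index in [0, numReduceTasks). */
    static int partFor(int hashCode, int numReduceTasks) {
        // Mask off the sign bit so negative hash codes still give a valid index.
        return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int hash = 353; // hypothetical hash code, picked for illustration
        System.out.println(String.format("part-%05d", partFor(hash, 20)));
        System.out.println(String.format("part-%05d", partFor(hash, 19)));
    }
}
```

With the hypothetical value 353 this prints part-00013 and then part-00011, the two part numbers reported above. That only shows the pattern is compatible with modulo partitioning; it does not show that any particular key or host is responsible.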
Attached are my site configuration (reduce.tasks is 19), task log for a
failing task and the output from the job tracker.
Below is a snippet from the datanode log (the only errors that exist are
related to this task or others which process the above part #) and below
that the output from localhost:7845 on the jobtracker machine for the
job.
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
    at java.io.DataInputStream.read(DataInputStream.java:134)
    at org.apache.nutch.ndfs.DataNode$DataXceiver.run(DataNode.java:369)
    at java.lang.Thread.run(Thread.java:595)
java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
    at java.io.DataInputStream.read(DataInputStream.java:134)
    at org.apache.nutch.ndfs.DataNode$DataXceiver.run(DataNode.java:369)
    at java.lang.Thread.run(Thread.java:595)
Job 'job_k1p80p'
Job File: /home/sitesell/system/submit_2pgex8/job.xml
Start time: Thu Nov 03 12:04:43 EST 2005
The job failed at: Thu Nov 03 16:00:42 EST 2005
__________________________________________________________________________________________________
Map Tasks
Map Task Id Pct Complete State Diagnostic Text
task_m_2m1twe 1.0 103189 pages, 5045 errors, 13.1 pages/s, 1000 kb/s,
task_m_4nzguk 1.0 103141 pages, 5193 errors, 12.9 pages/s, 988 kb/s,
task_m_5aprs2 1.0 103427 pages, 4756 errors, 13.4 pages/s, 1027 kb/s,
task_m_6pd5q7 1.0 102650 pages, 5081 errors, 12.6 pages/s, 962 kb/s,
task_m_8qzj8p 1.0 103610 pages, 4539 errors, 13.6 pages/s, 1039 kb/s,
task_m_aev1di 1.0 102666 pages, 4997 errors, 13.2 pages/s, 1007 kb/s,
task_m_f2zfyw 1.0 103235 pages, 4662 errors, 13.6 pages/s, 1045 kb/s,
task_m_f84hfi 1.0 103746 pages, 4657 errors, 13.0 pages/s, 991 kb/s,
task_m_hhv9b9 1.0 102909 pages, 4972 errors, 13.5 pages/s, 1026 kb/s,
task_m_kijqqx 1.0 103439 pages, 4858 errors, 13.4 pages/s, 1024 kb/s,
task_m_n5mxax 1.0 102894 pages, 4953 errors, 13.3 pages/s, 1017 kb/s,
task_m_p45m8c 1.0 103705 pages, 4969 errors, 13.1 pages/s, 1007 kb/s,
task_m_qfevss 1.0 102640 pages, 5006 errors, 13.2 pages/s, 1011 kb/s,
task_m_qg3816 1.0 103658 pages, 5039 errors, 13.3 pages/s, 1014 kb/s,
task_m_rlxmuw 1.0 103609 pages, 4491 errors, 13.6 pages/s, 1038 kb/s,
task_m_t9ksdc 1.0 103053 pages, 5287 errors, 12.9 pages/s, 994 kb/s,
task_m_wt3oyf 1.0 103006 pages, 5168 errors, 13.3 pages/s, 1014 kb/s,
task_m_xk3gxz 1.0 103294 pages, 5216 errors, 13.0 pages/s, 996 kb/s,
task_m_yjrejy 1.0 103158 pages, 4787 errors, 13.5 pages/s, 1038 kb/s,
__________________________________________________________________________________________________
Reduce Task Id Pct Complete State Diagnostic Text
task_r_2ktith 1.0 reduce > reduce
task_r_6hwvi0 1.0 reduce > reduce
task_r_8bi6h5 1.0 reduce > reduce
task_r_bpisbi 1.0 reduce > reduce
task_r_cfoo7z 1.0 reduce > reduce
task_r_cmy1r3 1.0 reduce > reduce
task_r_efnd4k 1.0 reduce > reduce
task_r_ervlp5 1.0 reduce > reduce
task_r_kvmno7 1.0 reduce > reduce
task_r_n4q36e 1.0 reduce > reduce
task_r_o4st5w 1.0 reduce > reduce
task_r_ow0sul 1.0 reduce > reduce
task_r_r7u152 1.0 reduce > reduce
task_r_ra99xx 1.0 reduce > reduce
task_r_ush85v 1.0 reduce > reduce
task_r_vbmkfw 1.0 reduce > reduce
task_r_wbirax 1.0 reduce > reduce
task_r_z17yss 1.0 reduce > reduce
task_r_o9mv91 0.9153447 reduce > reduce
    Timed out.
    java.io.IOException: Task process exit with nonzero status.
        at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
    Timed out.
    java.io.IOException: Task process exit with nonzero status.
        at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
    Timed out.
    java.io.IOException: Task process exit with nonzero status.
        at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
    Timed out.
    java.io.IOException: Task process exit with nonzero status.
        at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
        at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 13:43 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > Every segment that I fetch seems to be missing a part when stored on the
> > filesystem. The stranger thing is it is always the same part (very
> > reproducible).
>
> This sounds strange. Are the datanode errors always on the same host?
> How many hosts are you running this on?
It also seems to be limited to large segments. Using -topN 1000000
executes without any problems. 3 and 7 million both had difficulties.
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 13:43 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > Every segment that I fetch seems to be missing a part when stored on the
> > filesystem. The stranger thing is it is always the same part (very
> > reproducible).
>
> This sounds strange. Are the datanode errors always on the same host?
> How many hosts are you running this on?
I lied earlier. It still happens with smaller segments, just not as
frequently.
Found this in the namenode log file:
051104 200412 Server connection on port 5466 from 192.168.100.11: exiting
051104 200438 Server connection on port 5466 from 192.168.100.11: starting
051104 200438 Cannot start file because pendingCreates is non-null
051104 200438 Server handler on 5466 call error: java.io.IOException: Cannot create file /opt/sitesell/sbider_data/nutch/segments/20051104185259/20051104185300/crawl_fetch/part-00011/data
java.io.IOException: Cannot create file /opt/sitesell/sbider_data/nutch/segments/20051104185259/20051104185300/crawl_fetch/part-00011/data
    at org.apache.nutch.ndfs.NameNode.create(NameNode.java:98)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.nutch.ipc.RPC$1.call(RPC.java:186)
    at org.apache.nutch.ipc.Server$Handler.run(Server.java:198)
051104 200440 Server connection on port 5466 from 192.168.100.11: exiting
051104 200504 Server connection on port 5466 from 192.168.100.11: starting
051104 200504 Cannot start file because pendingCreates is non-null
051104 200504 Server handler on 5466 call error: java.io.IOException: Cannot create file /opt/sitesell/sbider_data/nutch/segments/20051104185259/20051104185300/crawl_fetch/part-00011/data
java.io.IOException: Cannot create file /opt/sitesell/sbider_data/nutch/segments/20051104185259/20051104185300/crawl_fetch/part-00011/data
    at org.apache.nutch.ndfs.NameNode.create(NameNode.java:98)
    at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.nutch.ipc.RPC$1.call(RPC.java:186)
    at org.apache.nutch.ipc.Server$Handler.run(Server.java:198)
051104 200505 Server connection on port 5466 from 192.168.100.11: exiting
051104 200506 Removing lease [Lease. Holder: NDFSClient_1755346663, heldlocks: 0, pendingcreates: 0], leases remaining: 1
051104 200529 Server connection on port 5466 from 192.168.100.11: starting
051104 201807 Server connection on port 5466 from 192.168.100.11: exiting
051104 201812 Server connection on port 5466 from 192.168.100.15: exiting
051104 201823 Server connection on port 5466 from 192.168.100.15: starting
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Stefan Groschupf <sg...@media-style.com>.
> I tried running one datanode per machine connecting back to the same SAN
> but it seemed pretty clunky.
A SAN is generally a bad idea here: it is too slow for a serious setup,
and it is a single point of failure.
Better to use many local hard disks.
Stefan
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Mon, 2005-11-07 at 18:12 -0800, Paul Baclace wrote:
> Rod Taylor wrote:
> > NDFS accomplishes the above path finding by auto-prefixing any path not
> > beginning with / with a /user/$USER. I didn't think it was appropriate
> > for LocalFileSystem.java to be mucking around trying to automatically
> > adjust paths to what the user may have intended.
> >
>
> Grep-ing for /user, NDFSFileSystem has:
> "/user/" + System.getProperty("user.name") + "/"
>
> I would think this is not consistent with the idea that properties
> and filenames work identically on all machines. Perhaps this
> NDFSFileSystem line should use mapred.system.dir
Quite possibly. Either way I was just trying to demonstrate that
multiple tasktrackers, regardless of whether they use NDFS or the local
filesystem, require the path expansion.
I don't think it should be a filesystem level item at all and should be
up to the code requesting the job to be done.
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Paul Baclace <pe...@baclace.net>.
Rod Taylor wrote:
> NDFS accomplishes the above path finding by auto-prefixing any path not
> beginning with / with a /user/$USER. I didn't think it was appropriate
> for LocalFileSystem.java to be mucking around trying to automatically
> adjust paths to what the user may have intended.
>
Grep-ing for /user, NDFSFileSystem has:
"/user/" + System.getProperty("user.name") + "/"
I would think this is not consistent with the idea that properties
and filenames work identically on all machines. Perhaps this
NDFSFileSystem line should use mapred.system.dir
Paul
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Mon, 2005-11-07 at 17:26 -0800, Paul Baclace wrote:
> Rod Taylor wrote:
> > The attached patches for Generator.java and Injector.java allow a
> > specific temporary directory to be specified. This gives Nutch the full
> > path to these temporary directories and seems to fix the "No input
> > directories" issue when using a local filesystem with multiple task
> > trackers.
>
> Is your patch with the new property mapred.temp.dir meant to help
> find files that should not be separate between different
> processes on the same host? Is the user id different?
Generate and Inject both issue 2 jobs. In order for the second job to
find the files, the first job needs to write them in a predictable and
common location. The current path doesn't seem to be enough even if all
daemons are started within it. I believe it needs to be a common path
for all hosts like mapred.system.dir which I considered using instead.
NDFS accomplishes the above path finding by auto-prefixing any path not
beginning with / with a /user/$USER. I didn't think it was appropriate
for LocalFileSystem.java to be mucking around trying to automatically
adjust paths to what the user may have intended.
--
Rod Taylor <rb...@sitesell.com>
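The auto-prefixing described above can be sketched in a few lines of Java. This is an illustration built from the string quoted elsewhere in this thread ("/user/" + user name + "/"), not the actual NDFSFileSystem code; the class and method names are hypothetical.

```java
// Illustrative sketch; not the real NDFSFileSystem implementation.
class NdfsPathSketch {
    /** Leaves absolute paths alone and places relative ones under /user/<user>/. */
    static String expand(String path, String user) {
        if (path.startsWith("/")) {
            return path; // already absolute, nothing to do
        }
        return "/user/" + user + "/" + path;
    }

    public static void main(String[] args) {
        String user = System.getProperty("user.name");
        System.out.println(expand("generate-temp-908680235", user));
        System.out.println(expand("/opt/sitesell/sbider_data/test2/segments", user));
    }
}
```

Because the prefix depends only on the user name, every host running as the same user expands a relative path to the same location, which is the property the local filesystem case lacks.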
Re: mapred bug -- bad part calculation?
Posted by Paul Baclace <pe...@baclace.net>.
Rod Taylor wrote:
> The attached patches for Generator.java and Injector.java allow a
> specific temporary directory to be specified. This gives Nutch the full
> path to these temporary directories and seems to fix the "No input
> directories" issue when using a local filesystem with multiple task
> trackers.
Is your patch with the new property mapred.temp.dir meant to help
find files that should not be separate between different
processes on the same host? Is the user id different?
Paul
Re: mapred bug -- bad part calculation?
Posted by Doug Cutting <cu...@nutch.org>.
Rod Taylor wrote:
> The attached patches for Generator.java and Injector.java allow a
> specific temporary directory to be specified. This gives Nutch the full
> path to these temporary directories and seems to fix the "No input
> directories" issue when using a local filesystem with multiple task
> trackers.
This looks like a good patch. I've committed it.
This is a recent bug. The nutch-daemon.sh script connects all daemons
to the Nutch root, so that relative paths are consistent. And,
previously, child processes were always connected to the same place as
the parent process. But I changed that recently so that child processes
are now connected to the directory where their job's jar (if any) is
unpacked. This was so that if the jar contains scripts (e.g., a
parse-ext plugin script) then these scripts are easy to run.
In NDFS the current working directory is always /user/$USER. On the
local filesystem with a local jobtracker, paths are relative to the
current working directory of the process (since there's only one
process). The problematic case is when the local filesystem is used
with multiple processes. The prior convention of making paths relative
to the nutch root was fragile. Better to supply absolute paths, as your
patch does.
Doug
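The fragility of relative paths described above can be illustrated with a small Java sketch: two processes with different working directories resolve the same relative name to different files, while an absolute path resolves identically for both. The directory names other than the Nutch root are hypothetical, and the class is not part of Nutch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only; /tmp/job-jar-unpacked and /shared/tmp are hypothetical.
class WorkingDirSketch {
    /** Resolves a path the way a process with the given working directory would. */
    static Path resolve(String workingDir, String path) {
        // A relative path is appended to workingDir; an absolute one replaces it.
        return Paths.get(workingDir).resolve(path).normalize();
    }

    public static void main(String[] args) {
        String daemonCwd = "/opt/nutch-0.8_7";      // a daemon started in the Nutch root
        String childCwd = "/tmp/job-jar-unpacked";  // a child started where the job jar is unpacked
        String relative = "generate-temp-908680235";
        String absolute = "/shared/tmp/generate-temp-908680235";

        // Same relative name, two different files:
        System.out.println(resolve(daemonCwd, relative));
        System.out.println(resolve(childCwd, relative));
        // Same absolute name, one file regardless of working directory:
        System.out.println(resolve(daemonCwd, absolute));
        System.out.println(resolve(childCwd, absolute));
    }
}
```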
Re: mapred bug -- bad part calculation?
Posted by Paul Baclace <pe...@archive.org>.
Rod Taylor wrote:
> The attached patches for Generator.java and Injector.java allow a
> specific temporary directory to be specified. This gives Nutch the full
> path to these temporary directories and seems to fix the "No input
> directories" issue when using a local filesystem with multiple task
> trackers.
Is your patch with the new property mapred.temp.dir meant to help
find files that should not be separate between different
processes on the same host? Is the user id different?
Paul
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
The attached patches for Generator.java and Injector.java allow a
specific temporary directory to be specified. This gives Nutch the full
path to these temporary directories and seems to fix the "No input
directories" issue when using a local filesystem with multiple task
trackers.
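The patches themselves are not reproduced in this archive. As a rough sketch of the idea only, a configurable base directory can be combined with the temporary name so that both jobs see the same full path. The property name mapred.temp.dir is the one mentioned later in this thread; the use of java.util.Properties as a stand-in for the Nutch configuration, the "." default, and the class and method names are assumptions.

```java
import java.io.File;
import java.util.Properties;

// Illustrative sketch; java.util.Properties stands in for the Nutch configuration.
class TempDirSketch {
    /** Builds the temporary directory under mapred.temp.dir, or "." if it is unset. */
    static File tempDir(Properties conf, String prefix, int suffix) {
        String base = conf.getProperty("mapred.temp.dir", ".");
        return new File(base, prefix + "-" + suffix);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Unset: the path stays relative to each process's working directory.
        System.out.println(tempDir(conf, "generate-temp", 908680235));

        // Set to a directory visible on all hosts (hypothetical value).
        conf.setProperty("mapred.temp.dir", "/shared/tmp");
        System.out.println(tempDir(conf, "generate-temp", 908680235));
    }
}
```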
On Mon, 2005-11-07 at 09:57 -0500, Rod Taylor wrote:
> On Fri, 2005-11-04 at 20:41 -0800, Doug Cutting wrote:
> > Rod Taylor wrote:
> > > Here you go. local filesystem and a single job tracker on another
> > > machine. When the tasktracker and jobtracker are on the same box there
> > > isn't a problem. When they are on different machines it runs into
> > > issues.
> > >
> > > This is using mapred.local.dir on the local machine (not shared between
> > > sbider4 and sbider5):
> >
> > > parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
> > > [Fatal Error] :-1:-1: Premature end of file.
> >
> > What is mapred.system.dir? That must be shared. Also, filenames you
> > pass to commands must be pathnames that work on all hosts.
>
> I managed to get past all of the initial injection problems by running a
> local crawl (no jobtracker) which created the crawldb/current/part-00000
> files. So I was able to do a real inject, with jobtracker, for all of
> the urls system wide without any complaints about files or directories
> not existing.
>
> Now, when trying to run a generate with a jobtracker it seems to have a
> hard time finding the temporary working areas from one job to the next.
> I cannot figure out where it is creating generate-temp-908680235. With
> NDFS it would be /user/$USER/
>
> <-- nutch generate -->
> 051107 091256 topN: 10000
> 051107 091256 Generator: starting
> 051107 091256 Generator: segment: /opt/sitesell/sbider_data/test2/segments/20051107091256
> 051107 091256 Generator: Selecting most-linked urls due for fetch.
> 051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
> 051107 091256 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
> 051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
> 051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
> 051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
> 051107 091256 Client connection to 192.168.100.14:5464: starting
> 051107 091256 Running job: job_xhvq9b
> 051107 091258 map 0%
> 051107 091300 map 5%
> 051107 091303 map 16%
> 051107 091305 map 21%
> 051107 091306 map 26%
> 051107 091308 map 32%
> 051107 091309 map 37%
> 051107 091312 map 47%
> 051107 091315 map 58%
> 051107 091318 map 68%
> 051107 091320 map 74%
> 051107 091321 map 79%
> 051107 091324 map 89%
> 051107 091327 map 100%
> 051107 091330 reduce 5%
> 051107 091332 reduce 11%
> 051107 091333 reduce 16%
> 051107 091335 reduce 21%
> 051107 091337 reduce 26%
> 051107 091339 reduce 37%
> 051107 091342 reduce 47%
> 051107 091344 reduce 53%
> 051107 091345 reduce 58%
> 051107 091347 reduce 63%
> 051107 091348 reduce 68%
> 051107 091351 reduce 79%
> 051107 091354 reduce 89%
> 051107 091357 reduce 100%
> 051107 091359 Job complete: job_xhvq9b
> 051107 091359 Generator: Partitioning selected urls by host, for politeness.
> 051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
> 051107 091359 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
> 051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
> Exception in thread "main" java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/sitesell/local/jobTracker/job_h22fvi.xml , nutch-site.xml
>     at org.apache.nutch.ipc.Client.call(Client.java:294)
>     at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>     at $Proxy0.submitJob(Unknown Source)
>     at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>     at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>     at org.apache.nutch.crawl.Generator.generate(Generator.java:213)
>     at org.apache.nutch.crawl.Generator.main(Generator.java:258)
>
> [sitesell@sbider5 sbider_data]$ cat /home/sitesell/local/jobTracker/job_h22fvi.xml | grep input
> <property><name>mapred.input.format.class</name><value>org.apache.nutch.mapred.SequenceFileInputFormat</value></property>
> <property><name>mapred.input.dir</name><value>generate-temp-908680235</value></property>
> <property><name>mapred.input.value.class</name><value>org.apache.nutch.io.UTF8</value></property>
> <property><name>mapred.input.key.class</name><value>org.apache.nutch.crawl.CrawlDatum</value></property>
>
> --
> Rod Taylor <rb...@sitesell.com>
>
>
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Massimo Miccoli <mm...@iltrovatore.it>.
Hello Nutch devs,
I have the same problem. I have 10 hosts and one master. Each host runs
a datanode and a tasktracker.
My mapred configuration is 100 maps and 25 reducers. Below are the logs with the errors.
Thanks
051107 144101 task_r_pd3ybk 0.224% reduce > copy >
051107 144102 Moving bad file /tmp/nutch/mapred/local/task_m_mmdwzs/part-18.out to /tmp/bad_files/part-18.out.-1505193967
051107 144102 Server handler on 48724 caught: java.io.IOException: Checksum error: /tmp/nutch/mapred/local/task_m_mmdwzs/part-18.out
java.io.IOException: Checksum error: /tmp/nutch/mapred/local/task_m_mmdwzs/part-18.out
    at org.apache.nutch.fs.NFSDataInputStream$Checker.verifySum(NFSDataInputStream.java:115)
    at org.apache.nutch.fs.NFSDataInputStream$Checker.read(NFSDataInputStream.java:95)
    at org.apache.nutch.fs.NFSDataInputStream$PositionCache.read(NFSDataInputStream.java:152)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
    at java.io.DataInputStream.read(DataInputStream.java:80)
    at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:95)
    at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:117)
    at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:64)
    at org.apache.nutch.ipc.Server$Handler.run(Server.java:213)
051107 144103 task_r_pd3ybk 0.24400002% reduce > copy >
051107 144103 parsing file:/d1/mapred/conf/nutch-default.xml
051107 144103 parsing file:/d1/mapred/conf/mapred-default.xml
051107 144103 parsing /tmp/nutch/mapred/local/taskTracker/task_r_pd3ybk/job.xml
051107 144103 parsing file:/d1/mapred/conf/nutch-site.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-default.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-site.xml
051107 144104 task_r_pd3ybk Child starting
051107 144104 task_r_pd3ybk Client connection to 0.0.0.0:33273: starting
051107 144104 Server connection on port 33273 from 127.0.0.1: starting
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-default.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/mapred-default.xml
051107 144104 task_r_pd3ybk parsing /tmp/nutch/mapred/local/taskTracker/task_r_pd3ybk/job.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-site.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-default.xml
051107 144104 task_r_pd3ybk parsing /tmp/nutch/mapred/local/taskTracker/task_r_pd3ybk/job.xml
051107 144104 task_r_pd3ybk parsing file:/d1/mapred/conf/nutch-site.xml
051107 144105 task_r_pd3ybk 0.25640127% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_9b2agp.out
051107 144106 task_r_pd3ybk 0.26105025% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_iwbx48.out
051107 144107 task_r_pd3ybk 0.30607307% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144108 task_r_pd3ybk 0.30645084% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144109 task_r_pd3ybk 0.30679235% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144110 task_r_pd3ybk 0.30714962% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144111 task_r_pd3ybk 0.30751395% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144112 task_r_pd3ybk 0.3078882% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_cphmud.out
051107 144113 task_r_pd3ybk 0.3246999% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_ahej3w.out
051107 144114 task_r_pd3ybk 0.33490744% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_rebwrf.out
051107 144115 task_r_pd3ybk 0.3441058% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_atf6cb.out
051107 144116 task_r_pd3ybk 0.3537717% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_objo5q.out
051107 144117 task_r_pd3ybk 0.35881257% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_ybv2xw.out
051107 144118 task_r_pd3ybk 0.36855537% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_pv6b9d.out
051107 144119 task_r_pd3ybk 0.37860525% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_lj8ljn.out
051107 144120 task_r_pd3ybk 0.3887727% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_5jjyb8.out
051107 144121 task_r_pd3ybk 0.39831316% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_q24lb2.out
051107 144122 task_r_pd3ybk 0.44835892% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_9yx6r2.out
051107 144123 task_r_pd3ybk 0.4488136% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_9yx6r2.out
051107 144124 task_r_pd3ybk 0.4492674% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_9yx6r2.out
051107 144125 task_r_pd3ybk 0.44971693% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_9yx6r2.out
051107 144126 task_r_pd3ybk 0.48041725% reduce > append > /tmp/nutch/mapred/local/task_r_pd3ybk/task_m_xnbtvi.out
051107 144128 task_r_pd3ybk 0.5% reduce > sort
051107 144129 task_r_pd3ybk 0.5% reduce > sort
051107 144130 task_r_pd3ybk 0.5% reduce > sort
051107 144131 task_r_pd3ybk 0.5% reduce > sort
051107 144132 task_r_pd3ybk 0.5% reduce > sort
051107 144133 task_r_pd3ybk 0.5% reduce > sort
051107 144134 task_r_pd3ybk 0.5% reduce > sort
051107 144135 task_r_pd3ybk 0.5% reduce > sort
051107 144136 task_r_pd3ybk 0.5% reduce > sort
051107 144137 task_r_pd3ybk 0.5% reduce > sort
051107 144138 task_r_pd3ybk 0.5% reduce > sort
051107 144139 task_r_pd3ybk 0.5% reduce > sort
051107 144140 task_r_pd3ybk 0.5% reduce > sort
051107 144141 task_r_pd3ybk 0.5% reduce > sort
051107 144142 task_r_pd3ybk 0.5% reduce > sort
051107 144144 task_r_pd3ybk 0.5% reduce > sort
051107 144145 task_r_pd3ybk 0.5% reduce > sort
051107 144146 task_r_pd3ybk 0.5% reduce > sort
051107 144147 task_r_pd3ybk 0.5% reduce > sort
051107 144148 task_r_pd3ybk 0.5% reduce > sort
051107 144149 task_r_pd3ybk 0.5% reduce > sort
051107 144150 task_r_pd3ybk 0.5% reduce > sort
051107 144151 task_r_pd3ybk 0.5% reduce > sort
051107 144151 task_r_pd3ybk Client connection to 10.2.0.11:7000: starting
051107 144152 task_r_pd3ybk 0.75141895% reduce > reduce
051107 144153 task_r_pd3ybk 0.75535446% reduce > reduce
051107 144154 task_r_pd3ybk 0.7593212% reduce > reduce
051107 144155 task_r_pd3ybk 0.7630673% reduce > reduce
051107 144156 task_r_pd3ybk 0.7669503% reduce > reduce
051107 144157 task_r_pd3ybk 0.770851% reduce > reduce
051107 144158 task_r_pd3ybk 0.774693% reduce > reduce
051107 144159 task_r_pd3ybk 0.77830505% reduce > reduce
051107 144200 task_r_pd3ybk 0.78223264% reduce > reduce
051107 144201 task_r_pd3ybk 0.7861667% reduce > reduce
051107 144202 task_r_pd3ybk 0.7900911% reduce > reduce
051107 144203 Server connection on port 48724 from 10.2.0.9: exiting
051107 144203 task_r_pd3ybk 0.79412013% reduce > reduce
051107 144203 Server connection on port 48724 from 10.2.0.9: starting
051107 144203 Server handler on 48724 caught: java.io.FileNotFoundException: /tmp/nutch/mapred/local/task_m_mmdwzs/part-18.out
java.io.FileNotFoundException: /tmp/nutch/mapred/local/task_m_mmdwzs/part-18.out
    at org.apache.nutch.fs.LocalFileSystem.openRaw(LocalFileSystem.java:106)
    at org.apache.nutch.fs.NFSDataInputStream$Checker.<init>(NFSDataInputStream.java:45)
    at org.apache.nutch.fs.NFSDataInputStream.<init>(NFSDataInputStream.java:217)
    at org.apache.nutch.fs.NutchFileSystem.open(NutchFileSystem.java:143)
    at org.apache.nutch.fs.NutchFileSystem.open(NutchFileSystem.java:132)
    at org.apache.nutch.mapred.MapOutputFile.write(MapOutputFile.java:91)
    at org.apache.nutch.io.ObjectWritable.writeObject(ObjectWritable.java:117)
    at org.apache.nutch.io.ObjectWritable.write(ObjectWritable.java:64)
    at org.apache.nutch.ipc.Server$Handler.run(Server.java:213)
051107 144204 task_r_pd3ybk 0.79818034% reduce > reduce
051107 144205 task_r_pd3ybk 0.80157274% reduce > reduce
051107 144206 task_r_pd3ybk 0.8053863% reduce > reduce
051107 144207 task_r_pd3ybk 0.8092159% reduce > reduce
....
Rod Taylor wrote:
>On Fri, 2005-11-04 at 20:41 -0800, Doug Cutting wrote:
>
>
>>Rod Taylor wrote:
>>
>>
>>>Here you go. local filesystem and a single job tracker on another
>>>machine. When the tasktracker and jobtracker are on the same box there
>>>isn't a problem. When they are on different machines it runs into
>>>issues.
>>>
>>>This is using mapred.local.dir on the local machine (not shared between
>>>sbider4 and sbider5):
>>>
>>>
>>> parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
>>> [Fatal Error] :-1:-1: Premature end of file.
>>>
>>>
>>What is mapred.system.dir? That must be shared. Also, filenames you
>>pass to commands must be pathnames that work on all hosts.
>>
>>
>
>I managed to get past all of the initial injection problems by running a
>local crawl (no jobtracker) which created the crawldb/current/part-00000
>files. So I was able to do a real inject, with jobtracker, for all of
>the urls system wide without any complaints about files or directories
>not existing.
>
>Now, when trying to run a generate with a jobtracker it seems to have a
>hard time finding the temporary working areas from one job to the next.
>I cannot figure out where it is creating generate-temp-908680235. With
>NDFS it would be /user/$USER/
>
><-- nutch generate -->
>051107 091256 topN: 10000
>051107 091256 Generator: starting
>051107 091256 Generator: segment: /opt/sitesell/sbider_data/test2/segments/20051107091256
>051107 091256 Generator: Selecting most-linked urls due for fetch.
>051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
>051107 091256 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
>051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
>051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
>051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
>051107 091256 Client connection to 192.168.100.14:5464: starting
>051107 091256 Running job: job_xhvq9b
>051107 091258 map 0%
>051107 091300 map 5%
>051107 091303 map 16%
>051107 091305 map 21%
>051107 091306 map 26%
>051107 091308 map 32%
>051107 091309 map 37%
>051107 091312 map 47%
>051107 091315 map 58%
>051107 091318 map 68%
>051107 091320 map 74%
>051107 091321 map 79%
>051107 091324 map 89%
>051107 091327 map 100%
>051107 091330 reduce 5%
>051107 091332 reduce 11%
>051107 091333 reduce 16%
>051107 091335 reduce 21%
>051107 091337 reduce 26%
>051107 091339 reduce 37%
>051107 091342 reduce 47%
>051107 091344 reduce 53%
>051107 091345 reduce 58%
>051107 091347 reduce 63%
>051107 091348 reduce 68%
>051107 091351 reduce 79%
>051107 091354 reduce 89%
>051107 091357 reduce 100%
>051107 091359 Job complete: job_xhvq9b
>051107 091359 Generator: Partitioning selected urls by host, for politeness.
>051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
>051107 091359 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
>051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
>Exception in thread "main" java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/sitesell/local/jobTracker/job_h22fvi.xml , nutch-site.xml
>    at org.apache.nutch.ipc.Client.call(Client.java:294)
>    at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
>    at $Proxy0.submitJob(Unknown Source)
>    at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
>    at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
>    at org.apache.nutch.crawl.Generator.generate(Generator.java:213)
>    at org.apache.nutch.crawl.Generator.main(Generator.java:258)
>
>[sitesell@sbider5 sbider_data]$ cat /home/sitesell/local/jobTracker/job_h22fvi.xml | grep input
><property><name>mapred.input.format.class</name><value>org.apache.nutch.mapred.SequenceFileInputFormat</value></property>
><property><name>mapred.input.dir</name><value>generate-temp-908680235</value></property>
><property><name>mapred.input.value.class</name><value>org.apache.nutch.io.UTF8</value></property>
><property><name>mapred.input.key.class</name><value>org.apache.nutch.crawl.CrawlDatum</value></property>
>
>
>
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 20:41 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > Here you go. local filesystem and a single job tracker on another
> > machine. When the tasktracker and jobtracker are on the same box there
> > isn't a problem. When they are on different machines it runs into
> > issues.
> >
> > This is using mapred.local.dir on the local machine (not shared between
> > sbider4 and sbider5):
>
> > parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
> > [Fatal Error] :-1:-1: Premature end of file.
>
> What is mapred.system.dir? That must be shared. Also, filenames you
> pass to commands must be pathnames that work on all hosts.
I managed to get past all of the initial injection problems by running a
local crawl (no jobtracker) which created the crawldb/current/part-00000
files. So I was able to do a real inject, with jobtracker, for all of
the urls system wide without any complaints about files or directories
not existing.
Now, when trying to run a generate with a jobtracker it seems to have a
hard time finding the temporary working areas from one job to the next.
I cannot figure out where it is creating generate-temp-908680235. With
NDFS it would be under /user/$USER/.
<-- nutch generate -->
051107 091256 topN: 10000
051107 091256 Generator: starting
051107 091256 Generator: segment: /opt/sitesell/sbider_data/test2/segments/20051107091256
051107 091256 Generator: Selecting most-linked urls due for fetch.
051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051107 091256 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051107 091256 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051107 091256 Client connection to 192.168.100.14:5464: starting
051107 091256 Running job: job_xhvq9b
051107 091258 map 0%
051107 091300 map 5%
051107 091303 map 16%
051107 091305 map 21%
051107 091306 map 26%
051107 091308 map 32%
051107 091309 map 37%
051107 091312 map 47%
051107 091315 map 58%
051107 091318 map 68%
051107 091320 map 74%
051107 091321 map 79%
051107 091324 map 89%
051107 091327 map 100%
051107 091330 reduce 5%
051107 091332 reduce 11%
051107 091333 reduce 16%
051107 091335 reduce 21%
051107 091337 reduce 26%
051107 091339 reduce 37%
051107 091342 reduce 47%
051107 091344 reduce 53%
051107 091345 reduce 58%
051107 091347 reduce 63%
051107 091348 reduce 68%
051107 091351 reduce 79%
051107 091354 reduce 89%
051107 091357 reduce 100%
051107 091359 Job complete: job_xhvq9b
051107 091359 Generator: Partitioning selected urls by host, for politeness.
051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051107 091359 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
051107 091359 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
Exception in thread "main" java.io.IOException: No input directories specified in: NutchConf: nutch-default.xml , mapred-default.xml , /home/sitesell/local/jobTracker/job_h22fvi.xml , nutch-site.xml
    at org.apache.nutch.ipc.Client.call(Client.java:294)
    at org.apache.nutch.ipc.RPC$Invoker.invoke(RPC.java:127)
    at $Proxy0.submitJob(Unknown Source)
    at org.apache.nutch.mapred.JobClient.submitJob(JobClient.java:259)
    at org.apache.nutch.mapred.JobClient.runJob(JobClient.java:288)
    at org.apache.nutch.crawl.Generator.generate(Generator.java:213)
    at org.apache.nutch.crawl.Generator.main(Generator.java:258)
[sitesell@sbider5 sbider_data]$ cat /home/sitesell/local/jobTracker/job_h22fvi.xml | grep input
<property><name>mapred.input.format.class</name><value>org.apache.nutch.mapred.SequenceFileInputFormat</value></property>
<property><name>mapred.input.dir</name><value>generate-temp-908680235</value></property>
<property><name>mapred.input.value.class</name><value>org.apache.nutch.io.UTF8</value></property>
<property><name>mapred.input.key.class</name><value>org.apache.nutch.crawl.CrawlDatum</value></property>
--
Rod Taylor <rb...@sitesell.com>
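One possible reading of the job.xml output above is that mapred.input.dir holds a relative name, so each process that touches it would resolve it against its own working directory. The following is a minimal Java sketch of that effect only. It is not Nutch code: the class name, the method name and both working directories are assumptions made for illustration (the first is guessed from the shell prompt, the second is purely hypothetical).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativeInputDir {
    // Resolve a directory name against a given working directory.
    // An absolute name is returned unchanged; a relative one depends on workingDir.
    static Path resolveAgainst(String workingDir, String dir) {
        return Paths.get(workingDir).resolve(dir).normalize();
    }

    public static void main(String[] args) {
        // Value of mapred.input.dir as it appears in job_h22fvi.xml.
        String inputDir = "generate-temp-908680235";

        // Assumed working directories, for illustration only.
        String clientCwd = "/opt/sitesell/sbider_data"; // guessed from the shell prompt
        String trackerCwd = "/home/sitesell";           // hypothetical

        // The same relative name lands in two different places.
        System.out.println(resolveAgainst(clientCwd, inputDir));
        System.out.println(resolveAgainst(trackerCwd, inputDir));

        // An absolute name on shared storage resolves identically everywhere.
        System.out.println(resolveAgainst(trackerCwd, "/opt/sitesell/sbider_data/test2/" + inputDir));
    }
}
```

If this is what is happening, passing absolute pathnames that exist on every host would sidestep the ambiguity, which matches the advice quoted above about pathnames that work on all hosts.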
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 20:41 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > Here you go. local filesystem and a single job tracker on another
> > machine. When the tasktracker and jobtracker are on the same box there
> > isn't a problem. When they are on different machines it runs into
> > issues.
> >
> > This is using mapred.local.dir on the local machine (not shared between
> > sbider4 and sbider5):
>
> > parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
> > [Fatal Error] :-1:-1: Premature end of file.
>
> What is mapred.system.dir? That must be shared. Also, filenames you
> pass to commands must be pathnames that work on all hosts.
Had the rest, but failed to override system.dir (description is "local
directory" which isn't really true if it is shared).
That worked through the map but failed at the reduce. Both the remote
task tracker and the task tracker on the same physical machine as the
job tracker failed.
Both had similar errors logged:
051104 235758 task_m_r2dcvc 0.6336343% /opt/sitesell/sbider_data/test/urls/list-oct31:167034415+1758257
051104 235758 Server connection on port 45644 from 192.168.100.13: exiting
051104 235759 task_m_r2dcvc 0.7225661% /opt/sitesell/sbider_data/test/urls/list-oct31:167034415+1758257
051104 235800 task_m_r2dcvc 0.8255505% /opt/sitesell/sbider_data/test/urls/list-oct31:167034415+1758257
051104 235801 task_m_r2dcvc 0.9183419% /opt/sitesell/sbider_data/test/urls/list-oct31:167034415+1758257
051104 235802 task_m_r2dcvc 1.0% /opt/sitesell/sbider_data/test/urls/list-oct31:167034415+1758257
051104 235802 Task task_m_r2dcvc is done.
051104 235802 Server connection on port 45644 from 192.168.100.13: exiting
java.io.FileNotFoundException: /opt/sitesell/sbider_data/test/system/submit_fubqfe/job.xml (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:106)
    at org.apache.nutch.fs.LocalFileSystem$LocalNFSFileInputStream.<init>(LocalFileSystem.java:64)
    at org.apache.nutch.fs.LocalFileSystem.openRaw(LocalFileSystem.java:108)
    at org.apache.nutch.fs.FileUtil.copyContents(FileUtil.java:57)
    at org.apache.nutch.fs.LocalFileSystem.copyToLocalFile(LocalFileSystem.java:297)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:328)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.<init>(TaskTracker.java:314)
    at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:214)
    at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:268)
    at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:633)
051104 235806 Lost connection to JobTracker [sbider5.sitebuildit.com/192.168.100.14:5464]. Retrying...
051104 235811 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051104 235811 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
051104 235811 parsing /home/sitesell/local/taskTracker/task_r_mdnul7/job.xml
[Fatal Error] :-1:-1: Premature end of file.
051104 235811 SEVERE error parsing conf file: org.xml.sax.SAXParseException: Premature end of file.
java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:358)
    at org.apache.nutch.util.NutchConf.getProps(NutchConf.java:293)
    at org.apache.nutch.util.NutchConf.get(NutchConf.java:94)
    at org.apache.nutch.mapred.JobConf.getJar(JobConf.java:81)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:332)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.<init>(TaskTracker.java:314)
    at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:214)
    at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:268)
    at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:633)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:318)
    ... 8 more
051104 235811 Lost connection to JobTracker [sbider5.sitebuildit.com/192.168.100.14:5464]. Retrying...
--
Rod Taylor <rb...@sitesell.com>
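The log above shows a FileNotFoundException while job.xml is being copied out of the system directory, followed by "Premature end of file" when the task's local job.xml is parsed. That sequence is consistent with the local copy being empty, though the log does not prove it. The sketch below uses the JDK's built-in XML parser rather than Nutch's NutchConf, and only demonstrates that a zero-length document is rejected while a minimal one is accepted. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class EmptyJobXml {
    // Returns true if the bytes parse as well-formed XML, false if the parser rejects them.
    static boolean parses(byte[] xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml));
            return true;
        } catch (SAXException e) {
            // A zero-length or truncated document ends up here.
            System.out.println("parse failed: " + e.getMessage());
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory stream with the default parser.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a job.xml whose copy from the system directory never happened.
        System.out.println("empty file parses: " + parses(new byte[0]));
        // Stand-in for a minimal, well-formed configuration file.
        byte[] minimal = "<configuration/>".getBytes(StandardCharsets.UTF_8);
        System.out.println("minimal file parses: " + parses(minimal));
    }
}
```

Under that reading, the parse error is a symptom and the missing file under the system directory is the thing to chase first.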
Re: mapred bug -- bad part calculation?
Posted by Doug Cutting <cu...@nutch.org>.
Rod Taylor wrote:
> Here you go. local filesystem and a single job tracker on another
> machine. When the tasktracker and jobtracker are on the same box there
> isn't a problem. When they are on different machines it runs into
> issues.
>
> This is using mapred.local.dir on the local machine (not shared between
> sbider4 and sbider5):
> parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
> [Fatal Error] :-1:-1: Premature end of file.
What is mapred.system.dir? That must be shared. Also, filenames you
pass to commands must be pathnames that work on all hosts.
Doug
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 22:57 -0500, Rod Taylor wrote:
> On Fri, 2005-11-04 at 19:43 -0800, Doug Cutting wrote:
> > Rod Taylor wrote:
> > > I tried running one datanode per machine connecting back to the same SAN
> > > but it seemed pretty clunky. A crash of any datanode would take down
> > > the entire system (no data replication since it's a common data-store in
> > > the end). Reducing it to a single datanode did not have this impact.
> >
> > Why use NDFS at all? Why not just mount the SAN on all hosts? You're
> > not using NDFS as a distributed file system, but rather as a centralized
> > file system.
>
> I was unable to make the mapred branch work by using 'local' as the
> filesystem and having more than one tasktracker. Tasktrackers were
> unable to complete any work, although it was quite a while ago when I
> last tried (September).
Here you go. local filesystem and a single job tracker on another
machine. When the tasktracker and jobtracker are on the same box there
isn't a problem. When they are on different machines it runs into
issues.
This is using mapred.local.dir on the local machine (not shared between
sbider4 and sbider5):
051104 230802 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051104 230802 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
051104 230802 parsing /home/sitesell/localt/taskTracker/task_m_o59djj/job.xml
[Fatal Error] :-1:-1: Premature end of file.
051104 230802 SEVERE error parsing conf file: org.xml.sax.SAXParseException: Premature end of file.
java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:358)
    at org.apache.nutch.util.NutchConf.getProps(NutchConf.java:293)
    at org.apache.nutch.util.NutchConf.get(NutchConf.java:94)
    at org.apache.nutch.mapred.JobConf.getJar(JobConf.java:81)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:332)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.<init>(TaskTracker.java:314)
    at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:214)
    at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:268)
    at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:633)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:318)
    ... 8 more
051104 230802 Lost connection to JobTracker [sbider5.sitebuildit.com/192.168.100.14:5464]. Retrying...
This is using a shared mapred.local.dir on the SAN:
051104 232115 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051104 232115 parsing file:/opt/nutch-0.8_7/conf/mapred-default.xml
051104 232115 parsing /opt/sitesell/sbider_data/test/local/taskTracker/task_m_l86ntl/job.xml
[Fatal Error] :-1:-1: Premature end of file.
051104 232116 SEVERE error parsing conf file: org.xml.sax.SAXParseException: Premature end of file.
java.lang.RuntimeException: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:358)
    at org.apache.nutch.util.NutchConf.getProps(NutchConf.java:293)
    at org.apache.nutch.util.NutchConf.get(NutchConf.java:94)
    at org.apache.nutch.mapred.JobConf.getJar(JobConf.java:81)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.localizeTask(TaskTracker.java:332)
    at org.apache.nutch.mapred.TaskTracker$TaskInProgress.<init>(TaskTracker.java:314)
    at org.apache.nutch.mapred.TaskTracker.offerService(TaskTracker.java:214)
    at org.apache.nutch.mapred.TaskTracker.run(TaskTracker.java:268)
    at org.apache.nutch.mapred.TaskTracker.main(TaskTracker.java:633)
Caused by: org.xml.sax.SAXParseException: Premature end of file.
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
    at org.apache.nutch.util.NutchConf.loadResource(NutchConf.java:318)
    ... 8 more
051104 232116 Lost connection to JobTracker [sbider5.sitebuildit.com/192.168.100.14:5464]. Retrying...
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 19:43 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > I tried running one datanode per machine connecting back to the same SAN
> > but it seemed pretty clunky. A crash of any datanode would take down
> > the entire system (no data replication since it's a common data-store in
> > the end). Reducing it to a single datanode did not have this impact.
>
> Why use NDFS at all? Why not just mount the SAN on all hosts? You're
> not using NDFS as a distributed file system, but rather as a centralized
> file system.
I was unable to make the mapred branch work by using 'local' as the
filesystem and having more than one tasktracker. Tasktrackers were
unable to complete any work, although it was quite a while ago when I
last tried (September).
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Doug Cutting <cu...@nutch.org>.
Rod Taylor wrote:
> I tried running one datanode per machine connecting back to the same SAN
> but it seemed pretty clunky. A crash of any datanode would take down
> the entire system (no data replication since it's a common data-store in
> the end). Reducing it to a single datanode did not have this impact.
Why use NDFS at all? Why not just mount the SAN on all hosts? You're
not using NDFS as a distributed file system, but rather as a centralized
file system.
Doug
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 19:15 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > There is only a single datanode and there are 20 hosts.
>
> That's a lot of load on one datanode. I typically run a datanode on
> every host, accessing the local drives on that host.
I tried running one datanode per machine connecting back to the same SAN
but it seemed pretty clunky. A crash of any datanode would take down
the entire system (no data replication since it's a common data-store in
the end). Reducing it to a single datanode did not have this impact.
The boxes themselves don't have much for local drives aside from a bit
of temp space.
Recently we moved the datanode, namenode and jobtracker to their own
machine per your earlier suggestion and upgraded Nutch sources to Nov
1st from about October 20th. This is when the difficulties started.
Earlier with the single datanode, namenode and jobtracker on an
overloaded worker machine (load average was around 20 normally) things
worked without errors, but slowly.
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Doug Cutting <cu...@nutch.org>.
Rod Taylor wrote:
> There is only a single datanode and there are 20 hosts.
That's a lot of load on one datanode. I typically run a datanode on
every host, accessing the local drives on that host.
Doug
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
On Fri, 2005-11-04 at 13:43 -0800, Doug Cutting wrote:
> Rod Taylor wrote:
> > Every segment that I fetch seems to be missing a part when stored on the
> > filesystem. The stranger thing is it is always the same part (very
> > reproducible).
>
> This sounds strange. Are the datanode errors always on the same host?
> How many hosts are you running this on?
There is only a single datanode and there are 20 hosts.
--
Rod Taylor <rb...@sitesell.com>
Re: mapred bug -- bad part calculation?
Posted by Doug Cutting <cu...@nutch.org>.
Rod Taylor wrote:
> Every segment that I fetch seems to be missing a part when stored on the
> filesystem. The stranger thing is it is always the same part (very
> reproducible).
This sounds strange. Are the datanode errors always on the same host?
How many hosts are you running this on?
Doug
Re: mapred bug -- bad part calculation?
Posted by Rod Taylor <rb...@sitesell.com>.
I forgot to provide this earlier. Here is nutch ndfs -ls output for the
directory structure of a segment with a failed part-00013.
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133
051103 162002 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162003 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162003 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162003 Client connection to 192.168.100.15:5466: starting
Found 6 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text <dir>
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content
051103 162010 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162011 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162011 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162011 Client connection to 192.168.100.15:5466: starting
Found 20 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00000 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00001 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00002 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00003 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00004 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00005 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00006 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00007 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00008 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00009 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00010 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00011 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00012 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00013 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00014 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00015 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00016 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00017 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00018 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00019 <dir>
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00012
051103 162017 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162017 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162017 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162017 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00012/data 439524693
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00012/index 56208
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00013
051103 162019 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162019 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162019 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162020 Client connection to 192.168.100.15:5466: starting
Found 0 items
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00014
051103 162021 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162022 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162022 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162022 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00014/data 440339945
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/content/part-00014/index 56183
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch
051103 162033 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162034 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162034 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162034 Client connection to 192.168.100.15:5466: starting
Found 20 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00000 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00001 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00002 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00003 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00004 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00005 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00006 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00007 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00008 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00009 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00010 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00011 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00012 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00013 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00014 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00015 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00016 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00017 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00018 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00019 <dir>
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00013
051103 162039 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162039 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162039 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162039 Client connection to 192.168.100.15:5466: starting
Found 0 items
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00012
051103 162041 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162041 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162042 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162042 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00012/data 8784520
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00012/index 56208
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00014
051103 162043 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162043 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162044 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162044 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00014/data 8788470
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_fetch/part-00014/index 56183
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate
051103 162055 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162055 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162055 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162055 Client connection to 192.168.100.15:5466: starting
Found 20 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00000 9531698
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00001 9684746
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00002 9762019
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00003 9715727
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00004 9518134
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00005 9676499
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00006 9722801
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00007 9715404
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00008 9514007
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00009 9668149
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00010 9649085
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00011 9726466
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00012 9534012
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00013 9744911
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00014 9694646
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00015 9652845
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00016 9505674
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00017 9700052
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00018 9714650
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_generate/part-00019 9714743
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse
051103 162108 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162109 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162109 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162109 Client connection to 192.168.100.15:5466: starting
Found 19 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00000 155306656
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00001 163093258
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00002 155290671
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00003 163551019
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00004 156198582
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00005 163963632
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00006 155873286
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00007 162752185
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00008 155215446
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00009 163084991
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00010 154982905
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00011 164212118
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00012 154450623
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00014 155279291
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00015 163724449
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00016 154542758
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00017 162865027
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00018 154375952
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/crawl_parse/part-00019 162991584
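The crawl_parse listing above reports 19 items where the other directories report 20. A small helper along the following lines could pick out which part number is absent from such a listing. It is a hypothetical utility, not part of Nutch; the class and method names, and the assumption that every entry name ends in part-NNNNN, are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PartGapFinder {
    private static final String PREFIX = "part-";

    // Given listing entries and the expected number of reduce outputs,
    // return the part numbers in [0, expected) that do not appear.
    static List<Integer> missingParts(Collection<String> names, int expected) {
        Set<Integer> seen = new HashSet<>();
        for (String name : names) {
            int at = name.lastIndexOf(PREFIX);
            if (at < 0) {
                continue; // not a part entry
            }
            try {
                seen.add(Integer.parseInt(name.substring(at + PREFIX.length())));
            } catch (NumberFormatException ignored) {
                // Entry has something other than digits after "part-"; skip it.
            }
        }
        List<Integer> missing = new ArrayList<>();
        for (int part = 0; part < expected; part++) {
            if (!seen.contains(part)) {
                missing.add(part);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Rebuild the shape of the crawl_parse listing: parts 0..19 with 13 absent.
        List<String> listing = new ArrayList<>();
        for (int part = 0; part < 20; part++) {
            if (part != 13) {
                listing.add(String.format("crawl_parse/part-%05d", part));
            }
        }
        System.out.println("missing parts: " + missingParts(listing, 20));
    }
}
```

This only catches entries that are absent altogether, as in crawl_parse. For content and crawl_fetch, where part-00013 exists as an empty directory, each part directory would have to be listed and checked for its data and index files.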
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data
051103 162121 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162122 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162122 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162122 Client connection to 192.168.100.15:5466: starting
Found 20 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00000 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00001 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00002 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00003 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00004 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00005 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00006 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00007 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00008 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00009 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00010 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00011 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00012 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00013 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00014 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00015 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00016 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00017 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00018 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00019 <dir>
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00012
051103 162127 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162127 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162127 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162127 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00012/data 128385655
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00012/index 56509
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00013
051103 162129 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162129 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162129 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162129 Client connection to 192.168.100.15:5466: starting
Found 0 items
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00014
051103 162131 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162131 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162131 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162131 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00014/data 128731018
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_data/part-00014/index 55566
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text
051103 162139 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162140 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162140 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162140 Client connection to 192.168.100.15:5466: starting
Found 20 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00000 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00001 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00002 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00003 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00004 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00005 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00006 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00007 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00008 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00009 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00010 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00011 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00012 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00013 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00014 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00015 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00016 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00017 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00018 <dir>
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00019 <dir>
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00012
051103 162145 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162145 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162145 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162145 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00012/data 111853821
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00012/index 56509
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00013
051103 162147 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162147 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162147 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162147 Client connection to 192.168.100.15:5466: starting
Found 0 items
[rbt@sbider5 ~]$ /opt/nutch/bin/nutch ndfs -ls /opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00014
051103 162149 parsing file:/opt/nutch-0.8_7/conf/nutch-default.xml
051103 162149 parsing file:/opt/nutch-0.8_7/conf/nutch-site.xml
051103 162149 No FS indicated, using default:master1.sitebuildit.com:5466
051103 162149 Client connection to 192.168.100.15:5466: starting
Found 2 items
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00014/data 111121278
/opt/sitesell/sbider_data/nutch/segments/20051102031132/20051102031133/parse_text/part-00014/index 55566
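For anyone who wants to check their own segments without reading the listings by eye, here is a minimal Python sketch. It is not part of Nutch: the function name missing_parts and the regular expression are assumptions based only on the `ndfs -ls` output format shown in this message.

```python
import re

# One listing entry looks like "<path>/part-NNNNN <size or <dir>>".
# This pattern is inferred from the output pasted in this thread, not
# from any NDFS specification. Lines such as ".../part-00012/data ..."
# are deliberately not matched, since they are files inside a part.
PART_LINE = re.compile(r"/part-(\d{5})\s+(\S+)\s*$")


def missing_parts(listing, expected):
    """Return the part numbers in range(expected) absent from the listing."""
    seen = set()
    for line in listing.splitlines():
        match = PART_LINE.search(line)
        if match:
            seen.add(int(match.group(1)))
    return [n for n in range(expected) if n not in seen]
```

Feeding it the crawl_parse listing with expected=20 (the mapred.reduce.tasks value) should report the skipped part. It only detects parts that are absent from a listing; a part directory that exists but is empty, as in parse_data and parse_text, still needs its own `-ls` to show "Found 0 items".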
On Thu, 2005-11-03 at 15:32 -0500, Rod Taylor wrote:
> Sources are from October 31st. Sun Standard Edition 1.5.0_02-b09 for
> amd64
>
> Every segment that I fetch seems to be missing a part when stored on the
> filesystem. The stranger thing is it is always the same part (very
> reproducible).
>
> If I have mapred.reduce.tasks set to 20, the hole is at part 13. That
> is, the part-00013 directory is empty while the remainder (0 through 12,
> 14 through 19) all have data.
>
> If I have mapred.reduce.tasks set to 19, the hole is at part 11.
> content/part-00011 is empty.
>
> Attached are my site configuration (reduce.tasks is 19), task log for a
> failing task and the output from the job tracker.
>
> Below is a snippet from the datanode log (the only errors that exist are
> related to this task or others which process the above part #) and below
> that the output from localhost:7845 on the jobtracker machine for the
> job.
>
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
> at java.io.DataInputStream.read(DataInputStream.java:134)
> at org.apache.nutch.ndfs.DataNode$DataXceiver.run(DataNode.java:369)
> at java.lang.Thread.run(Thread.java:595)
> java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
> at java.io.DataInputStream.read(DataInputStream.java:134)
> at org.apache.nutch.ndfs.DataNode$DataXceiver.run(DataNode.java:369)
> at java.lang.Thread.run(Thread.java:595)
>
>
> Job 'job_k1p80p'
>
> Job File: /home/sitesell/system/submit_2pgex8/job.xml
> Start time: Thu Nov 03 12:04:43 EST 2005
> The job failed at: Thu Nov 03 16:00:42 EST 2005
>
> __________________________________________________________________________________________________
>
> Map Tasks
>
> Map Task Id Pct Complete State Diagnostic Text
> task_m_2m1twe 1.0 103189 pages, 5045 errors, 13.1 pages/s, 1000 kb/s,
> task_m_4nzguk 1.0 103141 pages, 5193 errors, 12.9 pages/s, 988 kb/s,
> task_m_5aprs2 1.0 103427 pages, 4756 errors, 13.4 pages/s, 1027 kb/s,
> task_m_6pd5q7 1.0 102650 pages, 5081 errors, 12.6 pages/s, 962 kb/s,
> task_m_8qzj8p 1.0 103610 pages, 4539 errors, 13.6 pages/s, 1039 kb/s,
> task_m_aev1di 1.0 102666 pages, 4997 errors, 13.2 pages/s, 1007 kb/s,
> task_m_f2zfyw 1.0 103235 pages, 4662 errors, 13.6 pages/s, 1045 kb/s,
> task_m_f84hfi 1.0 103746 pages, 4657 errors, 13.0 pages/s, 991 kb/s,
> task_m_hhv9b9 1.0 102909 pages, 4972 errors, 13.5 pages/s, 1026 kb/s,
> task_m_kijqqx 1.0 103439 pages, 4858 errors, 13.4 pages/s, 1024 kb/s,
> task_m_n5mxax 1.0 102894 pages, 4953 errors, 13.3 pages/s, 1017 kb/s,
> task_m_p45m8c 1.0 103705 pages, 4969 errors, 13.1 pages/s, 1007 kb/s,
> task_m_qfevss 1.0 102640 pages, 5006 errors, 13.2 pages/s, 1011 kb/s,
> task_m_qg3816 1.0 103658 pages, 5039 errors, 13.3 pages/s, 1014 kb/s,
> task_m_rlxmuw 1.0 103609 pages, 4491 errors, 13.6 pages/s, 1038 kb/s,
> task_m_t9ksdc 1.0 103053 pages, 5287 errors, 12.9 pages/s, 994 kb/s,
> task_m_wt3oyf 1.0 103006 pages, 5168 errors, 13.3 pages/s, 1014 kb/s,
> task_m_xk3gxz 1.0 103294 pages, 5216 errors, 13.0 pages/s, 996 kb/s,
> task_m_yjrejy 1.0 103158 pages, 4787 errors, 13.5 pages/s, 1038 kb/s,
>
> __________________________________________________________________________________________________
>
> Reduce Task Id Pct Complete State Diagnostic Text
> task_r_2ktith 1.0 reduce > reduce
> task_r_6hwvi0 1.0 reduce > reduce
> task_r_8bi6h5 1.0 reduce > reduce
> task_r_bpisbi 1.0 reduce > reduce
> task_r_cfoo7z 1.0 reduce > reduce
> task_r_cmy1r3 1.0 reduce > reduce
> task_r_efnd4k 1.0 reduce > reduce
> task_r_ervlp5 1.0 reduce > reduce
> task_r_kvmno7 1.0 reduce > reduce
> task_r_n4q36e 1.0 reduce > reduce
> task_r_o4st5w 1.0 reduce > reduce
> task_r_ow0sul 1.0 reduce > reduce
> task_r_r7u152 1.0 reduce > reduce
> task_r_ra99xx 1.0 reduce > reduce
> task_r_ush85v 1.0 reduce > reduce
> task_r_vbmkfw 1.0 reduce > reduce
> task_r_wbirax 1.0 reduce > reduce
> task_r_z17yss 1.0 reduce > reduce
> task_r_o9mv91 0.9153447 reduce > reduce
> Timed out.java.io.IOException: Task process exit with nonzero status.
> at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
> at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
> Timed out.java.io.IOException: Task process exit with nonzero status.
> at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
> at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
> Timed out.java.io.IOException: Task process exit with nonzero status.
> at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
> at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
> Timed out.java.io.IOException: Task process exit with nonzero status.
> at org.apache.nutch.mapred.TaskRunner.runChild(TaskRunner.java:139)
> at org.apache.nutch.mapred.TaskRunner.run(TaskRunner.java:92)
>
>
--
Rod Taylor <rb...@sitesell.com>