Posted to dev@nutch.apache.org by Mr Shore <sh...@gmail.com> on 2009/06/05 06:43:28 UTC
org.apache.nutch.protocol.file.FileError: File Error: 404
During crawling, I see many reports of
org.apache.nutch.protocol.file.FileError: File Error: 404, all of them
for locations with a space in the path.
I'm using Nutch 0.9.
Is this really a bug? Is there a patch for it?
Here is part of the error log:
/usr/local/apache2/resumes_txt/50/Summit Point/Marissafolli/Receptionist/Administrative Assistant /Marissa
org.apache.nutch.protocol.file.FileError: File Error: 404
    at org.apache.nutch.protocol.file.File.getProtocolOutput(File.java:100)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:145)
org.apache.nutch.protocol.file.FileError: File Error: 404
    at org.apache.nutch.protocol.file.File.getProtocolOutput(File.java:100)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:145)
The file actually exists:
[root@file ~]# ls /usr/local/apache2/resumes_txt/50/Summit\ Point/Marissafolli/Receptionist/Administrative\ Assistant\ /Marissa\'s\ Resume.txt.txt
/usr/local/apache2/resumes_txt/50/Summit Point/Marissafolli/Receptionist/Administrative Assistant /Marissa's Resume.txt.txt
It seems Nutch fails to parse the URL?
I'm using the file protocol.
Sample URL:
fetching file:////usr/local/apache2/resumes_txt/50/Ronceverte/tonyobrien/Owner/Operator/Anthony
O
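
One possible explanation, not confirmed anywhere in this thread, is that the space in the path is not percent-encoded (or not decoded again) somewhere between the file: URL and the local path lookup. The sketch below is not Nutch code; it only uses the standard java.io.File and java.net.URI classes to show what a round trip looks like for a path containing spaces. The class name FileUrlSpaces, the helper names, and the /tmp path are hypothetical.

```java
import java.io.File;
import java.net.URI;

public class FileUrlSpaces {

    // Build a file: URI string from a local path. File.toURI()
    // percent-encodes characters that are illegal in a URI,
    // so each space becomes %20.
    static String encode(String path) {
        return new File(path).toURI().toASCIIString();
    }

    // Turn a file: URI string back into a local path. The File(URI)
    // constructor decodes %20 back into a space.
    static String decode(String uri) {
        return new File(URI.create(uri)).getPath();
    }

    public static void main(String[] args) {
        // Hypothetical path; the file does not need to exist.
        String path = "/tmp/Summit Point/Marissa's Resume.txt";

        String uri = encode(path);
        System.out.println("encoded: " + uri);
        System.out.println("decoded: " + decode(uri));
    }
}
```

If a URL in the seed list or in a directory listing contains a raw space, or if the encoded form is used as a path without being decoded, the lookup could miss the file and produce a 404 like the ones above. Whether that is what happens in Nutch 0.9 would need to be checked against File.java around line 100.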