Posted to dev@nutch.apache.org by Mr Shore <sh...@gmail.com> on 2009/06/05 06:43:28 UTC

org.apache.nutch.protocol.file.FileError: File Error: 404

During the crawling process, I see many reports of
org.apache.nutch.protocol.file.FileError: File Error: 404, all of them
for locations with a space in the path.
I'm using Nutch 0.9.
Is this really a bug? Is there a patch for it?

Here is part of the error logs:
/usr/local/apache2/resumes_txt/50/Summit
Point/Marissafolli/Receptionist/Administrative Assistant /Marissa
org.apache.nutch.protocol.file.FileError: File Error: 404
        at org.apache.nutch.protocol.file.File.getProtocolOutput(File.java:100)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:145)
org.apache.nutch.protocol.file.FileError: File Error: 404
        at org.apache.nutch.protocol.file.File.getProtocolOutput(File.java:100)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:145)

The exact file is actually:
[root@file ~]# ls /usr/local/apache2/resumes_txt/50/Summit\
Point/Marissafolli/Receptionist/Administrative\ Assistant\
/Marissa\'s\ Resume.txt.txt
/usr/local/apache2/resumes_txt/50/Summit
Point/Marissafolli/Receptionist/Administrative Assistant /Marissa's
Resume.txt.txt

It seems Nutch has failed to parse the URL?
I'm using the file protocol.
Sample URL:
fetching file:////usr/local/apache2/resumes_txt/50/Ronceverte/tonyobrien/Owner/Operator/Anthony
O
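
For reference, a space is not a legal character in a URL path, so a file: URL
built from a directory name like "Summit Point" normally carries the space as
%20 and has to be decoded before the local file is opened. The sketch below
only illustrates that round trip with the standard java.net.URI class. It is
not Nutch's code: the class name FileUrlSpaces and the helpers toFileUrl and
toLocalPath are made up for this example, and whether the 0.9 file protocol
plugin skips such a step is an assumption I have not verified.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper class for illustration only; not part of Nutch.
class FileUrlSpaces {

    // Build a file: URL from an absolute local path. The multi-argument
    // URI constructor percent-encodes characters that are illegal in a
    // URI path, such as the space.
    static String toFileUrl(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a usable path: " + absolutePath, e);
        }
    }

    // Recover the local path from a file: URL. getPath() returns the
    // decoded path, so %20 becomes a space again.
    static String toLocalPath(String fileUrl) {
        return URI.create(fileUrl).getPath();
    }

    public static void main(String[] args) {
        // Example path modelled on the directory names in the log above.
        String path = "/usr/local/apache2/resumes_txt/50/Summit Point/Marissa's Resume.txt.txt";
        String url = toFileUrl(path);
        System.out.println(url);
        System.out.println(toLocalPath(url));
    }
}
```

If the URLs in the crawl list contain raw spaces, or the %20 is never decoded
before the path reaches the file system, a lookup failure like the 404 above
would be one plausible result.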



-- 
http://maishudi.com/OMegle.php

Anonymous private chatting, have fun!