Posted to user@nutch.apache.org by bikram <bi...@yahoo.com> on 2007/08/22 09:27:06 UTC
Re: WIN XP PRO -Djava.protocol* file:///c:/folder/ Crawling Parents
Hi Vadim B,
I am getting the same error:
org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=smb
Were you able to fix this error?
If yes, can you please tell me what you did to clear it?
I have already posted all the details here:
http://www.nabble.com/Windows-Share-Crawling---searching-tf4277499.html#a12175266
I am using Linux, not Cygwin on Windows.
Thanks,
Bikram
Hi,
I am working on the same issue as you. So far I can crawl file:///C:/*, but
I am stuck on the smb part. It looks to me like this plugin isn't working
properly, so it needs to be fixed for the newer version of Nutch.
The error I get differs a bit from yours:
2007-05-25 18:06:29,573 INFO fetcher.Fetcher - fetching smb://mobidick/test/
2007-05-25 18:06:29,573 INFO fetcher.Fetcher - fetch of smb://mobidick/test/ failed with:
org.apache.nutch.protocol.ProtocolNotFound: protocol not found for url=smb
I will dive into plugin-smb and try to narrow down the problem. Maybe we
can work together to get a quick solution.
---SNIP---
# accept hosts in MY.DOMAIN.NAME
# Standard +^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
+^file:///C:/Policies/ <<-- Why did you put this here? It doesn't make sense,
because the +^(file|smb) line above already matches, so this line will be
skipped.
---SNIP ---
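The remark about the skipped line can be illustrated with a small first-match-wins filter. This is only a sketch of the idea, not Nutch's actual filter code: the class and method names are made up for illustration, and the exact text of the earlier +^(file|smb) rule is assumed from the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; it only mimics the "first matching rule decides" idea.
public class FirstMatchFilterDemo {

    /** One filter line: '+' (accept) or '-' (reject) followed by a regex. */
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(String line) {
            // Anything that does not start with '+' is treated as a reject rule here.
            this.accept = line.charAt(0) == '+';
            this.pattern = Pattern.compile(line.substring(1));
        }
    }

    /** Builds rules from filter lines, ignoring blank lines and '#' comments. */
    static List<Rule> parse(String... lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            rules.add(new Rule(trimmed));
        }
        return rules;
    }

    /** Index of the first rule whose regex is found in the URL, or -1 if none. */
    static int firstMatch(List<Rule> rules, String url) {
        for (int i = 0; i < rules.size(); i++) {
            if (rules.get(i).pattern.matcher(url).find()) {
                return i;
            }
        }
        return -1;
    }

    /** The first matching rule decides; a URL that matches no rule is rejected. */
    static boolean accepts(List<Rule> rules, String url) {
        int i = firstMatch(rules, url);
        return i >= 0 && rules.get(i).accept;
    }

    public static void main(String[] args) {
        // Assumed rule order: the broad rule first, the specific one after it.
        List<Rule> rules = parse("+^(file|smb)", "+^file:///C:/Policies/");
        String url = "file:///C:/Policies/handbook.doc"; // hypothetical file name
        System.out.println("first matching rule index: " + firstMatch(rules, url));
        System.out.println("accepted: " + accepts(rules, url));
    }
}
```

With this ordering the broad rule already decides for every file: URL, so the more specific line after it is never consulted. It would only matter if it came before a rule that would otherwise reject the URL.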
---SNIP ---
2007-05-24 14:04:22,000 WARN crawl.PartitionUrlByHost - Malformed URL: 'smb://sql1/Sales/DATA/'
// Did you quote the URL, or is it displayed like this in the logs? I don't
get this error.
---SNIP ---
Try this in the class org.apache.nutch.crawl.Crawl:
public static void main(String args[]) throws Exception {
  // Tell the JVM to look in the jcifs package for URL protocol handlers (smb://)
  System.setProperty("java.protocol.handler.pkgs", "jcifs"); // new
  // Log the property and its permission to confirm the setting took effect
  LOG.info("SMB Info: " + System.getProperty("java.protocol.handler.pkgs")); // new
  LOG.info("SMB Info: " + new java.util.PropertyPermission("java.protocol.handler.pkgs", "read,write").toString()); // new
  if (args.length < 1) {
    System.out.println("Usage: Crawl <urlDir> [-dir d] [-threads n] [-depth i] [-topN N]");
    return;
  }
---SNIP---
Check this out:
http://java.sun.com/developer/onlineTraining/protocolhandlers/
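To show why the JVM reports smb:// URLs as malformed until a protocol handler is known, here is a minimal sketch. It does not use jcifs: the class name, method names and the placeholder handler are hypothetical, and it registers the handler through URL.setURLStreamHandlerFactory rather than the java.protocol.handler.pkgs property so that it runs from a single file. With the property, the JVM looks for a class named <package>.<protocol>.Handler, which for "jcifs" and "smb" would be jcifs.smb.Handler.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical demo class; the handler below is a placeholder, not a real SMB client.
public class SmbUrlDemo {

    private static boolean installed = false;

    /** Returns true if the JVM can build a java.net.URL from the given string. */
    static boolean canParse(String spec) {
        try {
            URI.create(spec).toURL();
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no handler is known for the URL's protocol.
            return false;
        }
    }

    /** Registers a do-nothing handler for "smb"; the factory can be set only once per JVM. */
    static synchronized void installPlaceholderSmbHandler() {
        if (installed) {
            return;
        }
        URL.setURLStreamHandlerFactory(protocol -> {
            if (!"smb".equals(protocol)) {
                return null; // let the JVM fall back to its built-in handlers
            }
            return new URLStreamHandler() {
                @Override
                protected URLConnection openConnection(URL u) throws IOException {
                    throw new IOException("placeholder handler: no SMB client here");
                }
            };
        });
        installed = true;
    }

    public static void main(String[] args) {
        System.out.println("smb URL parses before registration: " + canParse("smb://mobidick/test/"));
        installPlaceholderSmbHandler();
        System.out.println("smb URL parses after registration: " + canParse("smb://mobidick/test/"));
    }
}
```

This only covers the java.net.URL side, which is where a "Malformed URL" warning would come from. As far as I can tell, the ProtocolNotFound error is raised by Nutch's own protocol plugin lookup, so a JVM-level handler may not be enough by itself to clear it.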