Posted to dev@nutch.apache.org by Marc DELERUE <MD...@polepositioning.com> on 2005/06/02 10:05:57 UTC

inactive result links

Hi,

Here is my problem: I'm indexing a Windows LAN with Nutch.

The directories I need to index are mounted via Samba.

Everything's good when I crawl, but:

Each time I run a query from a computer other than the one where
Nutch is installed, I can't open the files. The result links are
inactive, and I can only see the directories' contents through the
cached page.

In fact, the URLs I see are file://xxxxxxxxxxxxxxxx
<file:///\\xxxxxxxxxxxxxxxx>, so I think it would be better if they
were http://192.168.xxxxx/xxxx, but I don't know how to set that up.

 

If someone could help me...

 

Best regards,

 

Marc Delerue

mdelerue@linux62.org

mdelerue@polepositioning.com

 

\_@< plop !

 


Re: inactive result links

Posted by Jérôme Charron <je...@gmail.com>.
Marc, first of all, it seems that your previous problem ("problems
with file protocol") is now solved. What was the real problem? How did
you solve it?

> In fact, the URLs I see are file://xxxxxxxxxxxxxxxx
> <file:///\\xxxxxxxxxxxxxxxx> , so I think it would be better if they
> were http://192.168.xxxxx/xxxx but i don't know how to set it.
> If someone could help me....
Marc, that assumes an HTTP server is running on every machine you
access via Samba, and that you can provide Nutch with a mapping from
the local file system structure of each mounted Samba drive to the
document structure of the HTTP server (i.e. that you are able to
map file://mySambaMountPoint1/aPath/aFile to
http://aHost/aMappedPath/myFile for all the mounted machines)...
This is not an easy problem to solve, since it depends heavily on your
intranet structure (mainly deployment considerations).
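
To make the idea of such a mapping concrete, here is a minimal sketch
of a prefix-based URL rewrite. It is only an illustration, not
something Nutch provides: the PREFIX_MAP table and the
file_url_to_http function are hypothetical names, and the mount point
and host are the placeholder values from the example above. It also
assumes the paths below each mount point are identical on the HTTP
side.

```python
# Hypothetical table: file URL prefix of each Samba mount point mapped to
# the HTTP URL prefix that serves the same documents (placeholder values).
PREFIX_MAP = {
    "file://mySambaMountPoint1/aPath/": "http://aHost/aMappedPath/",
}


def file_url_to_http(url, prefix_map=PREFIX_MAP):
    """Rewrite a file URL to its HTTP equivalent.

    The URL is returned unchanged when no known prefix matches.
    """
    # Try the longest prefixes first so that a nested mount point
    # takes precedence over its parent directory.
    for prefix in sorted(prefix_map, key=len, reverse=True):
        if url.startswith(prefix):
            return prefix_map[prefix] + url[len(prefix):]
    return url


if __name__ == "__main__":
    print(file_url_to_http("file://mySambaMountPoint1/aPath/aFile"))
```

Each additional mounted machine would need its own entry in the
table, which is where the dependency on the intranet structure shows
up.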

But take a look at the Nutch mailing lists: a previous discussion on
this point was started some weeks (or months) ago.

Jerome

PS: Marc, it would probably be more convenient for us to talk on the
Frutch (French) mailing list, and then post a summary in English on
the Nutch mailing list. No?


-- 
http://motrech.free.fr/
http://frutch.free.fr/