Posted to user@nutch.apache.org by nasm <ri...@gmail.com> on 2006/07/18 16:48:32 UTC

Could we configure nutch-site.xml with two directories?

Hi,
How can I configure nutch-site.xml with two search directories, as in the example
shown below?

<property>
  <name>searcher.dir</name>
  <value>/home/tyrell/nutch-0.7/crawl.virtusa/eng</value>
</property>

<property>
  <name>searcher.dir</name>
  <value>/home/tyrell/nutch-0.7/crawl.virtusa/de</value>
</property>


The real problem is that I have a website in more than one language, so when
I crawl it, the whole site is crawled. From there, I want to direct the
search.jsp page to different directories. Is it possible to solve this problem,
and if so, how?

Thanks in advance.

-- 
View this message in context: http://www.nabble.com/Could-we-configure-nutch-site.xml-with-two-directories--tf1960930.html#a5379553
Sent from the Nutch - User forum at Nabble.com.


Re: Could we configure nutch-site.xml with two directories?

Posted by nasm <ri...@gmail.com>.
Can you give an example of deploying two WAR files?

Thanks


Re: Could we configure nutch-site.xml with two directories?

Posted by Sudhi Seshachala <su...@yahoo.com>.
There are a couple of ways this could be done, according to the mailing lists.
  One is to give the user the choice of selecting the directory. Another is to deploy two WAR files, each with a different searcher.dir configured in its corresponding conf folder.
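
A rough sketch of the two-WAR approach, reusing the paths from the original question. The file location (WEB-INF/classes/nutch-site.xml inside each deployed webapp) is an assumption about a typical Nutch web application layout, and only the property fragment is shown, so the enclosing root element of each file is omitted:

```xml
<!-- Assumed location: WEB-INF/classes/nutch-site.xml in the first webapp (English site) -->
<property>
  <name>searcher.dir</name>
  <value>/home/tyrell/nutch-0.7/crawl.virtusa/eng</value>
</property>

<!-- Assumed location: WEB-INF/classes/nutch-site.xml in the second webapp (German site) -->
<property>
  <name>searcher.dir</name>
  <value>/home/tyrell/nutch-0.7/crawl.virtusa/de</value>
</property>
```

Each webapp gets exactly one searcher.dir. A property name maps to a single value, so repeating the property inside one file would not give the searcher two directories.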
   
  Thanks
  Sudhi


Re: Could we configure nutch-site.xml with two directories?

Posted by Sudhi Seshachala <su...@yahoo.com>.
Oops, ignore my previous mail. Just check search.jsp: there is a parameter "lang". By default it is set to en. You could change it based on the locale settings, and manage the search directories accordingly.
  Please refer to search.jsp and OpenSearchServlet. It is pretty straightforward.
   
  Thanks


Re: Could we configure nutch-site.xml with two directories?

Posted by Bipin Parmar <bi...@yahoo.com>.
Hi,

I think that you should enable the "Language
Identification Parser/Filter" or write your own to
assign the language to each document in the index.
With this you will have just one index, with each
document having language="en" or language="de".

Depending on whether the user is accessing your
English language site or German site, you can add 
+language:en or +language:de to your search query.

I have not tried this but it may work.
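
A possible way to enable the bundled plugin, as an untested sketch. The plugin id language-identifier is the one Nutch ships; the rest of the plugin list below is only an assumed example, so take the actual value from your own nutch-default.xml and append the plugin to it:

```xml
<!-- nutch-site.xml fragment: the plugin list is an assumed example, not a known default.
     Copy the plugin.includes value from your nutch-default.xml and append
     language-identifier to it. -->
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html)|index-basic|query-(basic|site|url)|language-identifier</value>
</property>
```

Pages would most likely need to be re-crawled or re-indexed before the language is stored with each document. It is also worth checking which field name the plugin writes: the stock plugin may use lang rather than language, in which case the query clause would be lang:en or lang:de.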

Thanks,

Bipin
