Posted to user@nutch.apache.org by Tsengtan A Shuy <tt...@sbcglobal.net> on 2007/07/07 19:45:11 UTC

mozdex as backend search engine.

I successfully implemented the "web search" menu in my www.epacificweb.com
website.
This menu uses mozdex.com as the backend search engine.

Adam Shuy, President
ePacific Web Design & Hosting
Professional Web/Software developer
TEL: 408-272-6946
www.epacificweb.com
-----Original Message-----
From: Insurance Squared Inc. [mailto:gcooke@insurancesquared.com] 
Sent: Thursday, July 05, 2007 8:55 AM
To: nutch-user@lucene.apache.org
Subject: Re: multiple sites run

Your conclusion is incorrect.  You can just call the URL with the search 
term from within a PHP script for example.  Mozdex will return XML 
formatted results.  Just take the results and format them within your 
page.  No link or other mention of mozdex is required.  It does
exactly what you're looking for: it acts as an anonymous back end
for your search engine.

If you Google (yeah, I appreciate the irony) for the term 'opensearch' 
you should find all the info you need.
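
A rough sketch of that approach, in Python rather than PHP. It assumes the
backend returns RSS 2.0 style XML, as OpenSearch responses commonly do. The
endpoint URL, the "query" parameter name, and the sample response are
placeholders for illustration only; they are not Mozdex's actual interface,
which this thread does not spell out.

```python
import html
import urllib.parse
import xml.etree.ElementTree as ET

# Hypothetical endpoint: the real backend URL is not given in this thread.
SEARCH_ENDPOINT = "http://search.example.com/opensearch"

# Made-up response used so the sketch can run without network access.
SAMPLE_RESPONSE = """<rss version="2.0">
  <channel>
    <title>Search results</title>
    <item>
      <title>Nutch &amp; Lucene</title>
      <link>http://example.org/nutch</link>
      <description>An example hit.</description>
    </item>
  </channel>
</rss>"""


def build_search_url(query, endpoint=SEARCH_ENDPOINT, param="query"):
    """Return the endpoint URL with the URL-encoded search term appended.

    The parameter name is an assumption; check the backend's documentation.
    """
    return endpoint + "?" + urllib.parse.urlencode({param: query})


def parse_results(xml_text):
    """Collect title, link and description from every <item> element."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        results.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return results


def render_html(results):
    """Format the parsed results as an HTML list for your own page."""
    rows = []
    for r in results:
        rows.append('<li><a href="{}">{}</a><br>{}</li>'.format(
            html.escape(r["link"], quote=True),
            html.escape(r["title"]),
            html.escape(r["description"]),
        ))
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    # In real use, the XML would be fetched from build_search_url(term),
    # for example with urllib.request.urlopen(...).read().
    print(build_search_url("nutch"))
    print(render_html(parse_results(SAMPLE_RESPONSE)))
```

Because the formatting happens on your side, the result page carries only
your own markup.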



Tsengtan A Shuy wrote:
> Someone suggested that I use Mozdex as a backend search engine and bring the
> search results to your own web page.
>
> I studied this issue, and my conclusion is that Mozdex only allows
> you to link to their website and get the search results on their website. If
> there is a way to treat Mozdex as a backend search engine, I would like to see
> an example. Thank you in advance.
>
> -----Original Message-----
> From: Tsengtan A Shuy [mailto:ttashuy@sbcglobal.net] 
> Sent: Wednesday, July 04, 2007 8:20 AM
> To: nutch-user@lucene.apache.org
> Subject: RE: multiple sites run
>
> The whole-web crawl section of the "nutch version 0.8.* tutorial" is written
> for Linux. Is there a similar article for Eclipse on Windows?
>
> -----Original Message-----
> From: Tsengtan A Shuy [mailto:ttashuy@sbcglobal.net] 
> Sent: Tuesday, July 03, 2007 2:40 PM
> To: nutch-user@lucene.apache.org
> Subject: RE: multiple sites run
>
> Ignore my last email; I fixed the problem.
>
> -----Original Message-----
> From: Tsengtan A Shuy [mailto:ttashuy@sbcglobal.net] 
> Sent: Tuesday, July 03, 2007 2:33 PM
> To: nutch-user@lucene.apache.org
> Subject: RE: multiple sites run
>
> I was able to follow the nutch version 0.8.* tutorial to run the whole-web
> crawl.
> I ran the inject and generate commands successfully in my Windows Eclipse
> environment.
> But when I ran the fetch command, I got the following error message:
>
> 2007-07-03 14:28:18,890 ERROR mapred.JobClient (JobClient.java:submitJob(273)) - Input directory C:/JavaSearchEngine/nutch-0.8.1/crawl-epwl/segments/20070703140147/crawl_generate in local is invalid.
> Exception in thread "main" java.io.IOException: Input directory C:/JavaSearchEngine/nutch-0.8.1/crawl-epwl/segments/20070703140147/crawl_generate in local is invalid.
> 	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:274)
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:327)
> 	at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:443)
> 	at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:477)
>
> Can anyone help me to solve this problem?
>
> -----Original Message-----
> From: Tsengtan A Shuy [mailto:ttashuy@sbcglobal.net] 
> Sent: Tuesday, July 03, 2007 12:27 PM
> To: nutch-user@lucene.apache.org
> Subject: RE: multiple sites run
>
> I ran these 1002 websites in my Cygwin environment.
> I got the following error in the hadoop.log file:
> java.lang.ClassNotFoundException: org.apache.nutch.urlfilter.regex.RegexURLFilter
>
> How can I include this class in my Cygwin environment?
>
> -----Original Message-----
> From: Tsengtan A Shuy [mailto:ttashuy@sbcglobal.net] 
> Sent: Tuesday, July 03, 2007 11:31 AM
> To: nutch-user@lucene.apache.org
> Subject: RE: multiple sites run
>
> I followed your advice and changed the JDK Compliance setting to include 1.4
> compatibility while running Java 5.0.
> But the resulting crawl folder is still smaller than the one produced by
> crawling only my own website.
> What is wrong with my 1002-website run?
>
> -----Original Message-----
> From: Kai_testing Middleton [mailto:kai_testing@yahoo.com] 
> Sent: Tuesday, July 03, 2007 10:38 AM
> To: nutch-user@lucene.apache.org
> Subject: Re: multiple sites run
>
> Re eclipse:
>
> Navigate to Project, then Properties, then Java Compiler.  There's a place
> to specify "JDK Compliance" in the right hand pane.
>
> ----- Original Message ----
> From: Tsengtan A Shuy <tt...@sbcglobal.net>
> To: nutch-user@lucene.apache.org
> Sent: Tuesday, July 3, 2007 9:59:39 AM
> Subject: multiple sites run
>
> I followed the RunNutchInEclipse wiki article to run 1002 websites.
> I got all five folders, but these folders are smaller
> than the ones produced by crawling only my own website.
>
> What went wrong with this 1002-website run?
>
> How do you run Java 1.4 and 1.5 at the same time in the Eclipse environment?
>