Posted to dev@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2009/12/25 10:38:13 UTC

[Nutch Wiki] Update of "PublicServers" by RBalmes

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "PublicServers" page has been changed by RBalmes.
http://wiki.apache.org/nutch/PublicServers?action=diff&rev1=71&rev2=72

--------------------------------------------------

  = Public search engines using Nutch =
+ 
  Please sort by name alphabetically
  
-  * [[http://askaboutoil.com|AskAboutOil]] is a vertical search portal for the petroleum industry.
+   * [[http://askaboutoil.com|AskAboutOil]] is a vertical search portal for the petroleum industry.
  
-  * [[http://www.asbestosinfo.info|Asbestos]] is a vertical search portal and discussion forum for asbestos and related information.
+   * [[http://www.asbestosinfo.info|Asbestos]] is a vertical search portal and discussion forum for asbestos and related information.
  
-  * [[http://www.baynote.com/go|Baynote]] provides free hosted Nutch search for businesses.
+   * [[http://www.baynote.com/go|Baynote]] provides free hosted Nutch search for businesses.
  
-  * [[http://betherebesquare.com|BeThere BeSquare]] is an Event Search Engine for the San Francisco Bay Area that allows users to specify keywords, date, city, address, and category and get details about events in 4 different views.
+   * [[http://betherebesquare.com|BeThere BeSquare]] is an Event Search Engine for the San Francisco Bay Area that allows users to specify keywords, date, city, address, and category and get details about events in 4 different views.
  
-  * [[http://www.bible-ref.om/|Biible]] is the first biblical search engine that allows people to search the web for comments of biblical verse or range of verse. 6 major languages are fully recognized and 150 partially for now. Based on Nutch.
+   * [[http://www.bigsearch.ca/|Bigsearch.ca]] uses nutch open source software to deliver its search results.
  
-  * [[http://www.bigsearch.ca/|Bigsearch.ca]] uses nutch open source software to deliver its search results.
+   * [[http://busytonight.com/|BusyTonight]]: Search for any event in the United States, by keyword, location, and date. Event listings are automatically crawled and updated from original source Web sites.
  
-  * [[http://busytonight.com/|BusyTonight]]: Search for any event in the United States, by keyword, location, and date. Event listings are automatically crawled and updated from original source Web sites.
+   * [[http://www.centralbudapest.com/search|Central Budapest Search]] is a search engine for English language sites focussing on Budapest news, restaurants, accommodation, life and events.
  
-  * [[http://www.centralbudapest.com/search|Central Budapest Search]] is a search engine for English language sites focussing on Budapest news, restaurants, accommodation, life and events.
+   * [[http://circuitscout.com|Circuit Scout]] is a search engine for electrical circuits.
  
-  * [[http://circuitscout.com|Circuit Scout]] is a search engine for electrical circuits.
+   * [[http://www.comtecsearch.com|Comtec Search]] is a search engine for UK Tour Operator Package Holiday Brochures.
  
-  * [[http://www.comtecsearch.com|Comtec Search]] is a search engine for UK Tour Operator Package Holiday Brochures.
+   * [[http://www.coder-suche.de|Coder-Suche.de]] searches for coding stuff like APIs, documentation, tutorials, openBooks and more. Its origin is German; its contents are mainly English.
  
-  * [[http://www.coder-suche.de|Coder-Suche.de]] searches for coding stuff like APIs, documentation, tutorials, openBooks and more. Its origin is German; its contents are mainly English.
+   * [[http://campusgw.library.cornell.edu/|Cornell University Library]] is collaborating with the research group of Thorsten Joachims to develop a learning search engine for library web pages based on Nutch. The nutch-based search engine is near the bottom of the page.
  
-  * [[http://campusgw.library.cornell.edu/|Cornell University Library]] is collaborating with the research group of Thorsten Joachims to develop a learning search engine for library web pages based on Nutch. The nutch-based search engine is near the bottom of the page.
+   * [[http://search.creativecommons.org/|Creative Commons]] is a search engine for creative commons licensed material.
  
-  * [[http://search.creativecommons.org/|Creative Commons]] is a search engine for creative commons licensed material.
+   * [[http://www.dadi360.com/|Dadi360]] uses the Nutch search engine to provide search of Chinese-language websites in North America.
  
-  * [[http://www.dadi360.com/|Dadi360]] uses the Nutch search engine to provide search of Chinese-language websites in North America.
+   * [[http://www.ecolicommunity.org/Websearch|Ecolhub Web Search]] is an E. coli-specific search engine based on Nutch. EcoliHub WebSearch includes only those sites relevant to E. coli, thereby reducing the number of spurious hits. Searches can optionally be limited to your choice of resources. More than 110,000 pages to search. More resources are being added.
  
-  * [[http://www.ecolicommunity.org/Websearch|Ecolhub Web Search]] is an E. coli-specific search engine based on Nutch. EcoliHub WebSearch includes only those sites relevant to E. coli, thereby reducing the number of spurious hits. Searches can optionally be limited to your choice of resources. More than 110,000 pages to search. More resources are being added.
+   * [[http://www.epivista.de/|Epivista]] is a search engine of epilepsy related web sites.
  
-  * [[http://www.epivista.de/|Epivista]] is a search engine of epilepsy related web sites.
+   * [[http://www.eroscanner.com/|eroscanner]] is a search engine for German adult stuff. Watching the quality of ranking in this hard-fought area might be very interesting. (Warning: '''NSFW''')
  
-  * [[http://www.eroscanner.com/|eroscanner]] is a search engine for German adult stuff. Watching the quality of ranking in this hard-fought area might be very interesting. (Warning: '''NSFW''')
+   * [[http://www.ertech.ch/|ertech]] uses nutch as its search engine. It is integrated with the CMS system aarcat from aarboard.
  
-  * [[http://www.ertech.ch/|ertech]] uses nutch as its search engine. It is integrated with the CMS system aarcat from aarboard.
+   * [[http://www.erzsuche.de|Erzsuche.de]] is a local search engine for the Erzgebirge (the home of the nutcracker), with a spell-check feature.
  
-  * [[http://www.erzsuche.de|Erzsuche.de]] is a local search engine for the Erzgebirge (the home of the nutcracker), with a spell-check feature.
+   * [[http://search.fileratings.com|FileRatings Search]] is a search engine for software products.
  
-  * [[http://search.fileratings.com|FileRatings Search]] is a search engine for software products.
+   * [[http://www.gensphere.org/|GenSphere]] - Genealogy Search Engine based on Nutch.
  
-  * [[http://www.gensphere.org/|GenSphere]] - Genealogy Search Engine based on Nutch.
+   * [[http://www.gina-erotic-search.net/|Gina Wild Erotic Search Engine]] is based on Nutch and uses the language identifier module to present results according to the chosen language. (Warning: '''NSFW''')
  
-  * [[http://www.gina-erotic-search.net/|Gina Wild Erotic Search Engine]] is based on Nutch and uses the language identifier module to present results according to the chosen language. (Warning: '''NSFW''')
+   * [[http://www.jboss.com/search.jsp?query=http&x=0&y=0|jboss homepage]] The jboss (tm) homepage runs Nutch as its homepage search engine.
  
-  * [[http://www.jboss.com/search.jsp?query=http&x=0&y=0|jboss homepage]] The jboss (tm) homepage runs Nutch as its homepage search engine.
+   * [[http://www.jcintersonic.com/|J&C Intersonic]] uses nutch as its search engine.
  
-  * [[http://www.jcintersonic.com/|J&C Intersonic]] uses nutch as its search engine.
+   * [[http://www.jumblefox.com.au/|Jumble Fox]] - The Australian Search Engine
  
-  * [[http://www.jumblefox.com.au/|Jumble Fox]] - The Australian Search Engine
+   * [[http://www.knowmydestination.com/|KnowMyDestination]] - Search Engine for Travel related stuff. We have created this search engine by using Google WebAPIs to fetch relevant start URLs and then using Nutch to crawl and index those URLs.
  
-  * [[http://www.knowmydestination.com/|KnowMyDestination]] - Search Engine for Travel related stuff. We have created this search engine by using Google WebAPIs to fetch relevant start URLs and then using Nutch to crawl and index those URLs.
+   * [[http://krugle.com|Krugle]] uses Nutch to crawl web pages for code, archives and technically-interesting content. We also use a modified version of Nutch to crawl CVS/Subversion repositories, and the NutchBean/distributed searcher support to search and generate hits for code and tech info queries.
  
-  * [[http://krugle.com|Krugle]] uses Nutch to crawl web pages for code, archives and technically-interesting content. We also use a modified version of Nutch to crawl CVS/Subversion repositories, and the NutchBean/distributed searcher support to search and generate hits for code and tech info queries.
+   * [[http://www.labforculture.org|LabforCulture]] - The essential tool for everyone in arts and culture who creates, collaborates, shares and produces across borders in Europe.
  
-  * [[http://www.labforculture.org|LabforCulture]] - The essential tool for everyone in arts and culture who creates, collaborates, shares and produces across borders in Europe.
+   * [[http://LOOQ.EU/|LOOQ.EU]] - European search engine which indexes sites in Europe.
  
-  * [[http://LOOQ.EU/|LOOQ.EU]] - European search engine which indexes sites in Europe.
+   * [[http://LDSsearch.com/|LDSsearch.com]] - Search engine which indexes sites with a positive bias toward the Mormon church.
  
-  * [[http://LDSsearch.com/|LDSsearch.com]] - Search engine which indexes sites with a positive bias toward the Mormon church.
+   * [[http://www.millionpixelsearchpage.com|The Million Pixel Search Page]] - Search engine for Alex Tew's [[http://www.milliondollarhomepage.com|Million Dollar Homepage]].
  
-  * [[http://www.millionpixelsearchpage.com|The Million Pixel Search Page]] - Search engine for Alex Tew's [[http://www.milliondollarhomepage.com|Million Dollar Homepage]].
+   * [[http://www.misterbot.fr|Misterbot.fr]] is a search engine for French-language web sites.
  
-  * [[http://www.misterbot.fr|Misterbot.fr]] is a search engine for French-language web sites.
+   * [[http://search.mountbatten.net|Mountbatten Search]] a search engine that crawls only the part of the Internet located in Uganda.
  
-  * [[http://search.mountbatten.net|Mountbatten Search]] a search engine that crawls only the part of the Internet located in Uganda.
+   * [[http://www.mozdex.com|mozDex]].com Running Nutch SVN release with Clustering & Ontology support enabled.
  
-  * [[http://www.mozdex.com|mozDex]].com Running Nutch SVN release with Clustering & Ontology support enabled.
+   * [[http://www.myopensourcejobs.com|MyOpensourcejobs]] An open source skills jobs site using Nutch and the LAMP-based Drupal CMS.
  
-  * [[http://www.myopensourcejobs.com|MyOpensourcejobs]] An open source skills jobs site using Nutch and the LAMP-based Drupal CMS.
+   * [[http://www.nsyght.com|Nsyght.com]] is a social search engine that customizes a user's search based on their social graph.
  
-  * [[http://www.nsyght.com|Nsyght.com]] is a social search engine that customizes a user's search based on their social graph.
+   * [[http://www.nursewebsearch.com|Nurse Web Search]] - Health Internet Search Engine.
  
-  * [[http://www.nursewebsearch.com|Nurse Web Search]] - Health Internet Search Engine.
+   * [[http://www.netluchs.de/|Netluchs.de]] Search engine for German-language websites.
  
-  * [[http://www.netluchs.de/|Netluchs.de]] Search engine for German-language websites.
+   * [[http://nowaccepting.com|NowAccepting.com]] is a job search engine.
  
-  * [[http://nowaccepting.com|NowAccepting.com]] is a job search engine.
+   * [[http://www.playfuls.com/|Playfuls.com]] is a search engine that indexes the most important English gaming-related websites.
  
-  * [[http://www.playfuls.com/|Playfuls.com]] is a search engine that indexes the most important English gaming-related websites.
+   * [[http://www.gouv.qc.ca/|Government of Quebec websites]] Over 400 websites of the government of Quebec (Canada) are indexed by Nutch. The Web application has been developed by [[http://www.doculibre.com/index_en.html/|Doculibre inc.]]
  
-  * [[http://www.gouv.qc.ca/|Government of Quebec websites]] Over 400 websites of the government of Quebec (Canada) are indexed by Nutch. The Web application has been developed by [[http://www.doculibre.com/index_en.html/|Doculibre inc.]]
+   * [[http://search2.net/|search2.net]] is a general search engine with an international index.
+   * [[http://www.searchmitchell.com/|SearchMitchell.com]] is a community search engine for businesses and organizations in Mitchell, SD.
  
+   * [[http://www.umkreisfinder.de/|UmkreisFinder.de]] is running the [[GeoPosition]] plugin for local searches in Germany and in German. Please insert a search term in the first field, a German city name in the second field and choose a perimeter at the last field.
-  * [[http://search2.net/|search2.net]] is a search engine based on Nutch.
-  * [[http://www.searchmitchell.com/|SearchMitchell.com]] is a community search engine for businesses and organizations in Mitchell, SD.
  
-  * [[http://www.umkreisfinder.de/|UmkreisFinder.de]] is running the GeoPosition plugin for local searches in Germany and in German. Please insert a search term in the first field, a German city name in the second field and choose a perimeter at the last field.
+   * [[http://webharvest.gov|Webharvest.gov]] offers full-text search of nearly 100 million resources collected from US Federal Government websites as part of the National Archives and Records Administration's 2004 Presidential Term Web Harvest
  
-  * [[http://webharvest.gov|Webharvest.gov]] offers full-text search of nearly 100 million resources collected from US Federal Government websites as part of the National Archives and Records Administration's 2004 Presidential Term Web Harvest
+   * [[http://www.werelate.org|WeRelate.org]] offers a vertical genealogy search and a MediaWiki site featuring 1.3 million sources plus information for names and places.
  
-  * [[http://www.werelate.org|WeRelate.org]] offers a vertical genealogy search and a MediaWiki site featuring 1.3 million sources plus information for names and places.
+   * [[http://www.synoo.com:8080|Synoo.com]] is a small web search engine
  
-  * [[http://www.synoo.com:8080|Synoo.com]] is a small web search engine
+   * [[http://www.tokenizer.org|Tokenizer]] is an online shopping search engine partially powered by Nutch
  
-  * [[http://www.tokenizer.org|Tokenizer]] is an online shopping search engine partially powered by Nutch
+   * [[http://www.utilitysearch.info/|UtilitySearch]] is a search engine for the regulated utility industries (Electricity, Water, Gas, and Telecommunications) in the United States and Canada.
+   * [[http://search.tamilsweb.com/|TamilSWeb Search]] is a search engine geared toward South Asian web content.
  
-  * [[http://www.utilitysearch.info/|UtilitySearch]] is a search engine for the regulated utility industries (Electricity, Water, Gas, and Telecommunications) in the United States and Canada.
-  * [[http://search.tamilsweb.com/|TamilSWeb Search]] is a search engine geared toward South Asian web content.
-