Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2011/07/05 08:42:20 UTC

[Solr Wiki] Update of "SolrEcosystem" by DavidSmiley

The "SolrEcosystem" page has been changed by DavidSmiley:
http://wiki.apache.org/solr/SolrEcosystem

New page:
This page presents categories of software related to Solr. Some projects integrate directly with Solr; others are only conceptually related but notable.

<<TableOfContents(2)>>

= Solr distributions / forks =

 * Constellio Enterprise
 * LucidWorks Enterprise (not free)

= Data acquisition =

There are numerous ways to bring data into Solr. Many people roll their own solution or use the [[DataImportHandler]] (DIH).
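
As a rough illustration of the roll-your-own approach, the sketch below builds a Solr XML update message (`<add><doc><field name="...">`) from plain Python dictionaries. The function name `build_add_message` and the sample field names (`id`, `title`, `tag`) are hypothetical; the fields you can actually send depend on your own schema.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Serialize a list of dicts into a Solr XML <add> update message.

    Hypothetical helper for illustration; field names must match
    whatever your own Solr schema defines.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A list or tuple becomes one <field> element per value,
            # which is how multi-valued fields are expressed.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    sample = [{"id": "doc1", "title": "Hello Solr", "tag": ["a", "b"]}]
    print(build_add_message(sample))
```

The resulting string would then be sent in an HTTP POST to the update handler of your Solr instance (the exact URL depends on the installation), followed by a commit so the documents become searchable.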

== Crawlers ==

Web, email, and file crawlers.

 * [[http://lucene.apache.org/nutch/|Nutch]] (web, ...?)
 * [[http://en.wikipedia.org/wiki/Heritrix|Heritrix]] (web, ...?)
 * [[http://incubator.apache.org/droids/|Droids]] ( ? )
 * [[http://www.crawl-anywhere.com/|Crawl-Anywhere]] (web, ...?)
 * [[DataImportHandler]] (email, files)

== Pipelines / Document Processing ==
 
Frameworks for flexible document processing. See [[DocumentProcessing]] for more background.

 * ETL (Extract Transform Load)
  * [[http://sourceforge.net/projects/cloveretl/|CloverETL]] (LGPL)
  * [[http://kettle.pentaho.com/|Pentaho Kettle]]
 * [[http://www.openpipeline.org|OpenPipeline]]
 * [[https://github.com/kolstae/openpipe|OpenPipe]]
 * [[DataImportHandler]]
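
To make the idea of a document-processing pipeline concrete, here is a minimal sketch in which each stage is a function that takes a document and returns a transformed copy. It is not how any of the tools listed above are implemented; the stage names (`strip_whitespace`, `add_title_length`) and the `run_pipeline` helper are hypothetical.

```python
def strip_whitespace(doc):
    """Stage: trim leading and trailing whitespace from string fields."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in doc.items()}


def add_title_length(doc):
    """Stage: add a derived field holding the length of the title."""
    out = dict(doc)
    out["title_length"] = len(doc.get("title", ""))
    return out


def run_pipeline(doc, stages):
    """Pass the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    raw = {"id": "1", "title": "  Hello  "}
    print(run_pipeline(raw, [strip_whitespace, add_title_length]))
```

Keeping each stage a plain function means stages can be reordered or swapped without touching the rest of the chain, which is the flexibility these frameworks aim to provide.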