Posted to commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2015/07/23 21:20:29 UTC

[Solr Wiki] Update of "SolrEcosystem" by PascalEssiembre

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "SolrEcosystem" page has been changed by PascalEssiembre:
https://wiki.apache.org/solr/SolrEcosystem?action=diff&rev1=19&rev2=20

Comment:
Added Norconex Collectors to "Crawlers And Connectors" section.  Sorted the entries.

  <<TableOfContents(2)>>
  
  = Solr distributions / forks =
- 
   * [[http://sourceforge.net/projects/rivues/|Rivulet Enterprise Search]]
   * [[http://www.constellio.com/|Constellio Enterprise]]
   * [[http://lucidworks.com/product/fusion/|Lucidworks Fusion]] - commercially supported platform extending and enhancing Solr with various enterprise demanded capabilities, including security, connectors, indexing and querying pipelines, signal handling, analytics, etc.
   * [[https://github.com/tjake/Solandra|Solandra]] - A tight integration of Solr and Cassandra. The result is Solr with the awesome scalability properties of Cassandra.
  
- 
  = Data acquisition =
- 
- There are numerous ways to bring data into Solr. Many people roll their own solution or use the [[DataImportHandler]] 
+ There are numerous ways to bring data into Solr. Many people roll their own solution or use the DataImportHandler
  
  == Crawlers And Connectors ==
+ Web, email, and file crawlers (alphabetically).
  
- Web, email, and file crawlers.
- 
-  * [[http://lucene.apache.org/nutch/|Nutch]] (web) [[http://wiki.apache.org/nutch/NutchTutorial|Solr Info (included as part of the Nutch Tutorial)]]
+  * [[http://lucene.apache.org/nutch/|Apache Nutch]] (web) [[http://wiki.apache.org/nutch/NutchTutorial|Solr Info (included as part of the Nutch Tutorial)]]
+  * [[http://aperture.sourceforge.net/|Aperture]] (web, email, file)
+  * [[http://www.crawl-anywhere.com/|Crawl-Anywhere]] (web)  [[http://www.crawl-anywhere.com/solr-indexer/|Solr Info]]
+  * DataImportHandler (email, file)
+  * [[http://incubator.apache.org/droids/|Droids]] ( none ) [[https://cwiki.apache.org/confluence/display/DROIDS/droids-solr|Solr Info]]
+   * Presently, more of a framework for a crawler.
   * [[http://en.wikipedia.org/wiki/Heritrix|Heritrix]] (web)
-  * [[http://www.crawl-anywhere.com/|Crawl-Anywhere]] (web)  [[http://www.crawl-anywhere.com/solr-indexer/|Solr Info]]
-  * [[DataImportHandler]] (email, file)
   * [[http://incubator.apache.org/connectors/|ManifoldCF]] (web, file) [[http://incubator.apache.org/connectors/end-user-documentation.html#solroutputconnector|Solr Info]]
+  * [[http://www.norconex.com/collectors/|Norconex Collectors]] (web, file) [[http://www.norconex.com/collectors/committer-solr/|Solr Info]]
-  * [[http://aperture.sourceforge.net/|Aperture]] (web, email, file) 
-  * [[http://incubator.apache.org/droids/|Droids]] ( none ) [[https://cwiki.apache.org/confluence/display/DROIDS/droids-solr|Solr Info]]
-    * Presently, more of a framework for a crawler.
  
  == Pipelines / Document Processing ==
-  
- Frameworks for flexible document processing. See [[DocumentProcessing]] for more background and criteria for a proposal. Some crawlers/connectors have their own pipeline capability and they are not repeated here.
+ Frameworks for flexible document processing. See DocumentProcessing for more background and criteria for a proposal. Some crawlers/connectors have their own pipeline capability and they are not repeated here.
  
   * [[http://aspire.searchtechnologies.com|Aspire (by Search Technologies)]] - integrates with Solr.  Not open-source but free.
   * [[http://findwise.github.com/Hydra/|Hydra (by Findwise)]] - integrates with Solr.
@@ -40, +36 @@

   * [[http://www.openpipeline.org|OpenPipeline]]
   * ETL (Extract Transform Load) -- many are applicable; these are a couple notable ones:
    * [[http://www.talend.com/products-data-integration/talend-open-studio.php|Talend Open Studio (TOS)]]
- 
-     * Custom Talend components for SOLR can be found on [[http://www.talendforge.org/exchange/|Talend Forge Exchange]] with associated [[http://inrage-blog.blogspot.fr/2012/03/solrtalend-components-tutorial-this.html|doc and tutorial]] 
+    * Custom Talend components for SOLR can be found on [[http://www.talendforge.org/exchange/|Talend Forge Exchange]] with associated [[http://inrage-blog.blogspot.fr/2012/03/solrtalend-components-tutorial-this.html|doc and tutorial]]
    * [[http://kettle.pentaho.com/|Kettle (Pentaho)]]
    * [[http://sourceforge.net/projects/cloveretl/|CloverETL]]
  
@@ -51, +46 @@

   * One of the [[http://xproc.org/implementations/|XProc implementations]] (an XML pipeline spec) such as [[http://xmlcalabash.com/|Calabash]]
  
  == Indexing ==
- 
  Generating the Lucene/Solr Index
  
  Hadoop:
+ 
   * [[http://www.cascading.org/|Cascading]] - [[https://github.com/bixolabs/cascading.solr|Solr "Tap"]]
-  * [[http://katta.sourceforge.net/|Katta]] - [[KattaIntegration]]
+  * [[http://katta.sourceforge.net/|Katta]] - KattaIntegration
  
-  
  = Monitoring =
- 
  Tools or services for monitoring Solr-specific performance metrics.
  
   * [[http://sematext.com/spm/solr-performance-monitoring/index.html|SPM for Solr]]