Posted to solr-commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2007/09/04 22:41:24 UTC

[Solr Wiki] Trivial Update of "UpdateRichDocuments" by EricPugh

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by EricPugh:
http://wiki.apache.org/solr/UpdateRichDocuments

The comment on the change is:
tweaking the example...

------------------------------------------------------------------------------
  = Updating a Solr Index with Rich Documents such as PDF and MS Office =
  
- Solr has an extensible 
+ Solr has an extensible DocumentHandler architecture that allows you to feed it XML and CSV documents.  There is now a patch file available as part of [https://issues.apache.org/jira/browse/SOLR-284 SOLR-284] that adds support for parsing rich binary formats.
  
- Solr accepts index updates in [http://en.wikipedia.org/wiki/Comma-separated_values CSV] (Comma Separated Values) format.  Different separators are configurable, and multi-valued fields are supported.
+ This page explains how to get started with the patch.  If you like it, please [https://issues.apache.org/jira/secure/ViewVoters!default.jspa?id=12372848 vote] for it on the JIRA issue tracker so we can get it added to the Solr codebase!
+ 
  
  [[TableOfContents]]
  
@@ -28, +29 @@

  
  4) Unzip the test-files.zip into SOLR_HOME/test/test-files/.  These are various test files for running the included unit tests.
  
- 5) Apply the rich.patch to your source.  Rich.patch has tweaks that add the solr.RichDocumentRequestHandler to your solrconfig.xml.
+ 5) Apply the rich.patch to your source.  Rich.patch has tweaks that add the solr.RichDocumentRequestHandler to your solrconfig.xml files.
  
  6) Copy the contents of source.zip into SOLR_HOME/src/java/org/apache/solr/handler
  
@@ -41, +42 @@

  All of the normal methods for [SolrContentStreams uploading content] are supported.
  
  === Example ===
+ These examples assume you have first run {{{ant example}}} and that the example server is up and running via {{{java -jar start.jar}}}.
+ 
  There is a sample PDF file at {{{src/test/test-files/simple.pdf}}} that can be used to add a PDF to the Solr example server.
  
- Example of using HTTP-POST to send the CSV data over the network to the Solr server:
+ Example of using HTTP-POST to send the PDF data over the network to the Solr server:
  {{{
- cd src/test/test-files/simple.pdf
+ cd src/test/test-files/
- curl http://localhost:8983/solr/update/rich --data-binary @simple.pdf -H 'Content-type:text/plain; charset=utf-8'
+ curl 'http://localhost:8983/solr/update/rich?stream.type=pdf' --data-binary @simple.pdf -H 'Content-type:text/plain; charset=utf-8'
  }}}
  
  Having Solr read a local binary file directly can be more efficient than sending the file over the network via HTTP.
@@ -57, +60 @@

  
  The following request will cause Solr to directly read the input file:
  {{{
- curl http://localhost:8983/solr/update/rich?stream.file=src/test/test-files/simple.pdf
+ curl 'http://localhost:8983/solr/update/rich?stream.type=pdf&stream.file=src/test/test-files/simple.pdf&id=100&stream.fieldname=name'
  #NOTE: Use the full path, or a path relative to the CWD of the running Solr server.
  }}}
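
A URL that contains {{{&}}} has to be quoted on the command line, otherwise the shell treats each {{{&}}} as "run the preceding command in the background" and the remaining parameters never reach Solr. The sketch below shows one way to assemble the request URL and pass it to curl safely. The function name {{{rich_update_url}}} is a hypothetical helper, not part of Solr or the SOLR-284 patch; the parameter names and values are the ones used in the example on this page.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): builds the /update/rich URL for a
# file that the Solr server reads from its own disk.
# Arguments: stream.type, stream.file, id, stream.fieldname
rich_update_url() {
  base="http://localhost:8983/solr/update/rich"
  printf '%s?stream.type=%s&stream.file=%s&id=%s&stream.fieldname=%s\n' \
    "$base" "$1" "$2" "$3" "$4"
}

url=$(rich_update_url pdf src/test/test-files/simple.pdf 100 name)
echo "$url"

# Double quotes keep the whole URL as a single argument, so '&' and '?'
# are passed to curl literally instead of being interpreted by the shell.
# Uncomment once the example server is running:
# curl "$url"
```

The helper does no URL encoding, so it is only suitable for paths and values without spaces or other reserved characters.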