Posted to commits@lucene.apache.org by eh...@apache.org on 2014/11/05 10:48:34 UTC

svn commit: r1636841 - in /lucene/cms/branches/solr_6058/content/solr: quickstart.mdtext resources.mdtext tutorials.mdtext

Author: ehatcher
Date: Wed Nov  5 09:48:33 2014
New Revision: 1636841

URL: http://svn.apache.org/r1636841
Log:
Moving quick start tutorial to -e cloud mode to showcase SolrCloud too

Added:
    lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext
Removed:
    lucene/cms/branches/solr_6058/content/solr/tutorials.mdtext
Modified:
    lucene/cms/branches/solr_6058/content/solr/resources.mdtext

Added: lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext?rev=1636841&view=auto
==============================================================================
--- lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext (added)
+++ lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext Wed Nov  5 09:48:33 2014
@@ -0,0 +1,214 @@
+Title: Quick Start
+
+<ul class="breadcrumbs">
+  <li><a href="/solr">Home</a></li>
+  <li><a href="/solr/resources.html">Resources</a></li>
+</ul>
+
+# Solr Quick Start
+
+***
+
+## Overview
+
+<!--
+  TODO: Where to mention (or not?) the Solr version number this is for?   It's intentionally embedded in the examples below, at least.
+
+  4.10.2 was used to write this quick start guide
+-->
+
+This document covers getting Solr up and running, ingesting a variety of data sources into multiple collections, and getting a feel
+for the Solr administrative and search interfaces.
+
+***
+
+## Requirements
+
+<!-- TODO: Replace this section with an include?  Or at least link to a common system requirements page rather than duplicating here. -->
+
+To follow along with this tutorial, you will need...
+
+1. Java 1.7 or greater, available from Oracle or OpenJDK.
+    * Running `java -version` at the command line should report version 1.7 or higher.
+    * GNU's GCJ is not supported and does not work with Solr.
+2. A Solr release.
+    
+***
+
+## Getting Started
+
+Please run the browser showing this tutorial and the Solr server on the same machine so tutorial links will correctly point to your Solr server.
+
+Begin by unzipping the Solr release and changing your working directory to the directory it was extracted into. (Note that the base directory name may vary with the version of Solr downloaded.) For example, with a shell in UNIX, Cygwin, or Mac OS X:
+
+
+    /:$ ls solr*
+    solr-4.10.2.zip
+    /:$ unzip -q solr-4.10.2.zip
+    /:$ cd solr-4.10.2/
+
+To launch Solr, run `bin/solr start -e cloud -noprompt`:
+
+    /solr-4.10.2:$ bin/solr start -e cloud -noprompt 
+    Welcome to the SolrCloud example!
+
+
+    Starting up 2 Solr nodes for your example SolrCloud cluster.
+    ...
+
+    Started Solr server on port 8983 (pid=8404). Happy searching!
+    ...
+
+    Started Solr server on port 7574 (pid=8549). Happy searching!
+    ...
+
+    SolrCloud example running, please visit http://localhost:8983/solr 
+
+    /solr-4.10.2:$ 
+
+Solr will now be running two "nodes", one on port 7574 and one on port 8983.  Two collections are created automatically, "collection1" and "gettingstarted".
+They differ in layout: "collection1" is a single-shard collection with two replicas, and "gettingstarted" is a two-shard
+collection in which each shard has two replicas.  The "Cloud" tab in the admin console diagrams it nicely:
+
+  <!-- TODO: insert cloud diagram -->
+
+You can see that Solr is running by loading <http://localhost:8983/solr/> in your web browser. This is the main starting point for administering Solr.
+
+***
+
+<section class="orange">
+      <h1>That wasn't too hard!</h1>
+      <p>
+        You nailed step 1. Take a deep breath, relax a bit before round 2 below.
+      </p>
+      <div class="down-arrow"><a data-scroll href="#indexing-data"><i class="fa fa-angle-down fa-2x red"></i></a></div>
+</section>
+
+## Indexing Data
+
+Your Solr server is up and running, but it doesn't contain any data. You can modify a Solr index by POSTing commands to Solr to add (or update) documents, delete documents, and commit pending adds and deletes. These commands can be in a [variety of formats]().
+
+The install includes sample files, under `example/exampledocs`, demonstrating the types of commands and formats Solr accepts. Also included is a Java utility for posting them from the command line.
+
+Let's first index local "rich" files (HTML, PDF, text, and many other supported formats).  The command-line is a bit hairy, and it will be described in detail below.  The command we'll use is:
+
+    java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/
+
+Here's what it'll look like:
+
+<!--
+    # TODO: does this command-line sound too hairy to put in here?   What's easier?   I like it, and will make at least Solr 5.x have it be 
+    # as simple `bin/post docs/` to do this same thing (see SOLR-6435)
+-->
+
+    /solr-4.10.2:$ java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/
+    SimplePostTool version 1.5
+    Posting files to base url http://localhost:8983/solr/update..
+    Entering auto mode. File endings considered are xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
+    Entering recursive mode, max depth=999, delay=0s
+    Indexing directory docs (3 files, depth=0)
+    POSTing file index.html (text/html)
+    POSTing file SYSTEM_REQUIREMENTS.html (text/html)
+    POSTing file tutorial.html (text/html)
+    Indexing directory docs/changes (1 files, depth=1)
+    POSTing file Changes.html (text/html)
+    Indexing directory docs/solr-analysis-extras (8 files, depth=1)
+    ...
+    2945 files indexed.
+    COMMITting Solr index changes to http://localhost:8983/solr/update..
+    Time spent: 0:00:37.537
+
+<!-- TODO: Should we break down this command-line like this?  Why not?  Maybe make it blocked off so it can be readily skipped for the copy/pasters. -->
+
+The command-line breaks down as follows:
+
+   * `-classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar`: the JAR file containing Solr's SimplePostTool
+   * `-Ddata=files -Dauto -Drecursive`: Settings for directory recursing with automatic content type detection
+   * `org.apache.solr.util.SimplePostTool`: The tool we are invoking here
+   * `docs/`: a relative path of the Solr install docs/ directory
+
+You have now indexed thousands of documents into the "collection1" collection in Solr and committed these changes. You can now search for "solr" by loading the "[Query]()" tab in the Admin interface, and entering "solr" in the "q" text box. Clicking the "Execute Query" button should display the following URL containing one result...
+
+<http://localhost:8983/solr/collection1/select?q=solr&wt=xml>
+
+You can index the sample XML files with the following command (assuming your command-line shell supports the `*.xml` notation). This time the command line is simpler: open a terminal in the `example/exampledocs` directory and use `post.jar`.  Note: `post.jar` is a simple JAR file containing only the SimplePostTool used above.
+
+    /solr-4.10.2:$ cd example/exampledocs/
+    /solr-4.10.2/example/exampledocs:$ java -jar post.jar *.xml
+    SimplePostTool version 1.5
+    Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
+    POSTing file gb18030-example.xml
+    POSTing file hd.xml
+    ...
+    14 files indexed.
+    COMMITting Solr index changes to http://localhost:8983/solr/update..
+    Time spent: 0:00:00.187
+
+...and now you can search for all sorts of things using the default [Solr Query Syntax]() (a superset of the Lucene query syntax)...
+
+* [video]()
+* [name:video]()
+* [+video +price:[* TO 400]]()
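The three example searches above can also be run from a shell. This is a minimal sketch rather than part of the Solr distribution: it assumes `curl` is installed and that the "collection1" collection is reachable on localhost port 8983 as in this quick start. The helper name `solr_query` and the variable `SOLR_COLLECTION_URL` are made up for illustration.

```shell
# Assumption: the "collection1" collection from this quick start, default port.
SOLR_COLLECTION_URL="http://localhost:8983/solr/collection1"

# solr_query is a hypothetical helper that sends one query string to /select.
# -G makes curl append the --data-urlencode pairs to the URL as query
# parameters, so spaces, "+" and brackets in the query are encoded for us.
solr_query() {
  curl -sS -G "$SOLR_COLLECTION_URL/select" \
    --data-urlencode "q=$1" \
    --data-urlencode "wt=json"
}
```

Once the function is defined, `solr_query 'video'`, `solr_query 'name:video'`, and `solr_query '+video +price:[* TO 400]'` send the three searches. Letting curl do the encoding avoids escaping the `+`, spaces, and brackets by hand.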
+
+There are many other ways to import your data into Solr. You can:
+
+* Import records from a database using the [Data Import Handler (DIH)]().
+    
+* [Load a CSV file]() (comma separated values), including those exported by Excel or MySQL.
+
+* [POST JSON documents]()
+
+* Index binary documents such as Word and PDF with [Solr Cell]() (ExtractingRequestHandler).
+
+* Use [SolrJ]() for Java or other Solr clients to programmatically create documents to send to Solr.
+
+***
+
+## Updating Data
+
+You may have noticed that even though the file `solr.xml` has now been POSTed to the server twice, you still only get 1 result when searching for "solr". This is because the example `schema.xml` specifies a "`uniqueKey`" field called "id". Whenever you POST commands to Solr to add a document with the same value for the uniqueKey as an existing document, it automatically replaces it for you. You can see that that has happened by looking at the values for numDocs and maxDoc in the "CORE"/searcher section of the statistics page...
+
+<http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher>
+
+numDocs represents the number of searchable documents in the index (and will be larger than the number of XML files since some files contained more than one `<doc>`). maxDoc may be larger, as the maxDoc count includes logically deleted documents that have not yet been removed from the index. You can re-post the sample XML files as often as you want and numDocs will never increase, because the new documents will constantly be replacing the old.
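One way to watch this from a shell is to count the searchable documents before and after re-posting. This is a minimal sketch, assuming `curl` is installed and "collection1" is served on localhost port 8983 as above; `solr_count_all` is a hypothetical helper name. For a single-shard collection such as "collection1", the `numFound` value of a match-all query should correspond to numDocs.

```shell
# Assumption: the "collection1" collection from this quick start, default port.
SOLR_COLLECTION_URL="http://localhost:8983/solr/collection1"

# Hypothetical helper: run a match-all query but ask for zero rows, so the
# response carries only the header and the "numFound" count.
solr_count_all() {
  curl -sS -G "$SOLR_COLLECTION_URL/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "rows=0" \
    --data-urlencode "wt=json"
}
```

Run `solr_count_all`, re-post the files with `java -jar post.jar *.xml`, and run it again; `numFound` is expected to stay the same.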
+
+Go ahead and edit the existing XML files to change some of the data, then re-run the `java -jar post.jar` command; you'll see your changes reflected in subsequent searches.
+
+## Deleting Data
+
+You can delete data by POSTing a delete command to the update URL and specifying the value of the document's unique key field, or a query that matches multiple documents (be careful with that one!). Since these commands are smaller, we will specify them right on the command line rather than reference an XML file.
+
+Execute the following command to delete a specific document:
+
+    java -Ddata=args -Dcommit=false -jar post.jar "<delete><id>SP2514N</id></delete>"
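Because a delete is just an XML command POSTed to the update URL, any HTTP client can send one. The following is a sketch under assumptions, not the tutorial's own method: it assumes `curl` is installed, uses the update URL printed by post.jar above, and expects ids and queries that need no XML escaping. All helper names are hypothetical.

```shell
# Assumption: the update URL printed in the post.jar output above.
SOLR_UPDATE_URL="http://localhost:8983/solr/update"

# Hypothetical helper: POST one XML update command as the request body.
solr_update() {
  curl -sS "$SOLR_UPDATE_URL" \
    -H 'Content-Type: application/xml' \
    --data-binary "$1"
}

# Thin wrappers around the XML commands; the argument is inserted as-is,
# so it must not contain characters such as "<" or "&".
solr_delete_by_id()    { solr_update "<delete><id>$1</id></delete>"; }
solr_delete_by_query() { solr_update "<delete><query>$1</query></delete>"; }
solr_commit()          { solr_update "<commit/>"; }
```

With these defined, `solr_delete_by_id SP2514N` followed by `solr_commit` would delete the document and then commit the change. Keeping the commit separate mirrors the `-Dcommit=false` setting in the command above.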
+
+***
+
+<section class="orange">
+      <h1>Way to go!!!</h1>
+      <p>
+        Round 2, check. Now get up and do some jumping jacks. Heck, go for a run and leave your house, you deserve it.
+      </p>
+      <div class="down-arrow"><a data-scroll href="#indexing-data"><i class="fa fa-angle-down fa-2x red"></i></a></div>
+</section>
+
+
+Cleanup:
+
+    bin/solr stop -all ; rm -Rf node1/ node2/
+
+Full script and then console output:
+
+    date ;
+    bin/solr start -e cloud -noprompt ;
+    open http://localhost:8983/solr ;
+    java -classpath example/solr-webapp/webapp/WEB-INF/lib/solr-core-*.jar -Ddata=files -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ ;
+    open http://localhost:8983/solr/collection1/browse ;
+    date ;
+
+
+
+
+
+
+

Modified: lucene/cms/branches/solr_6058/content/solr/resources.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/branches/solr_6058/content/solr/resources.mdtext?rev=1636841&r1=1636840&r2=1636841&view=diff
==============================================================================
--- lucene/cms/branches/solr_6058/content/solr/resources.mdtext (original)
+++ lucene/cms/branches/solr_6058/content/solr/resources.mdtext Wed Nov  5 09:48:33 2014
@@ -1,11 +1,13 @@
 Title: Resources
-## Tutorial ##
+## Tutorials ##
 
-A copy of the tutorial for each version of Solr is included in the documentation for that release.
+<!-- 
+   TODO: this was previously mentioned.  do we retain something like this?  or...?
+   A copy of the tutorial for each version of Solr is included in the documentation for that release.
+-->
 
-Copies of the tutorial for the most recent release of each major branch under active development can also be found online:
-
-* [Solr Tutorial](/solr/tutorials.html)
+* [Solr Quick Start](/solr/quickstart.html)
+* More to come: Ideas include "Solr in a Day", "Solr and JSON", "Solr and CSV", "Solr and XML"
 
 Users who have completed the tutorial are encouraged to review the [other documentation available](#documentation).