Posted to commits@lucene.apache.org by sa...@apache.org on 2014/11/11 15:02:32 UTC

svn commit: r1638105 - /lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext

Author: sarowe
Date: Tue Nov 11 14:02:31 2014
New Revision: 1638105

URL: http://svn.apache.org/r1638105
Log:
SOLR-6058: added basic searching subsection

Modified:
    lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext

Modified: lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext
URL: http://svn.apache.org/viewvc/lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext?rev=1638105&r1=1638104&r2=1638105&view=diff
==============================================================================
--- lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext (original)
+++ lucene/cms/branches/solr_6058/content/solr/quickstart.mdtext Tue Nov 11 14:02:31 2014
@@ -120,13 +120,13 @@ The command-line breaks down as follows:
    * `org.apache.solr.util.SimplePostTool`: Our easy to use POSTing friend in this tutorial
    * `docs/`: a relative path of the Solr install `docs/` directory
 
-You have now indexed thousands of documents into the `collection1` collection in Solr and committed these changes.  You can search for "solr" by loading the Admin UI [Query tab](http://localhost:8983/solr/#/collection1/query), and enter "solr" in the "q" text box (replacing `*:*`, which matches all documents).  See the [Searching](#searching) section below for more information. 
+You have now indexed thousands of documents into the `collection1` collection in Solr and committed these changes.  You can search for "solr" by loading the Admin UI [Query tab](http://localhost:8983/solr/#/collection1/query) and entering "solr" in the `q` param (replacing `*:*`, which matches all documents).  See the [Searching](#searching) section below for more information. 
 
 To index your own data, re-run the directory indexing command pointed to your own directory of documents.  For example, on a Mac instead of `docs/` try `~/Documents/` or `~/Desktop/` !   You may want to start from a clean, empty system again, rather than have your content in addition to the Solr `docs/` directory; see the Cleanup section [below](#cleanup) for how to get back to a clean starting point.
 
 ### Indexing Solr XML
 
-Solr supports indexing structured content in a variety of incoming formats.  The historically predominant format for getting structured content into Solr has been [Solr XML](https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-XMLFormattedIndexUpdates).  Many Solr indexers have been coded to process domain content into Solr XML output, generally HTTP POSTed directly to Solr's /update endpoint.
+Solr supports indexing structured content in a variety of incoming formats.  The historically predominant format for getting structured content into Solr has been [Solr XML](https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-XMLFormattedIndexUpdates).  Many Solr indexers have been coded to process domain content into Solr XML output, generally HTTP POSTed directly to Solr's `/update` endpoint.
 
 Solr's install includes a handful of Solr XML formatted files with example data (mostly mocked tech product data).  
 
@@ -160,11 +160,7 @@ Here's what you'll see:
 ...and now you can search for all sorts of things using the default [Solr Query Syntax](https://cwiki.apache.org/confluence/display/solr/The+Standard+Query+Parser#TheStandardQueryParser-SpecifyingTermsfortheStandardQueryParser) (a superset of the Lucene query syntax)...
 
 NOTE:
-You can browse the documents indexed at <http://localhost:8983/solr/collection1/browse>.
-The `/browse` UI allows getting a feel for how Solr's technical capabilities can be
-worked with in a familiar, though a bit rough* and prototypical, interactive HTML view.  *The /browse view defaults to assuming the
-"collection1" schema and data are a catch-all mix of structured XML, JSON, CSV example data, and unstructured rich documents.
-Your own data may not look ideal at first, though the /browse templates are customizable.
+You can browse the documents indexed at <http://localhost:8983/solr/collection1/browse>.  The `/browse` UI allows getting a feel for how Solr's technical capabilities can be worked with in a familiar, though a bit rough and prototypical, interactive HTML view.  (The `/browse` view defaults to assuming the `collection1` schema and data are a catch-all mix of structured XML, JSON, CSV example data, and unstructured rich documents.  Your own data may not look ideal at first, though the `/browse` templates are customizable.)
 
 ### Indexing JSON
 
@@ -195,6 +191,10 @@ A great conduit of data into Solr is via
 
 Using SimplePostTool and the included example CSV data file, index it:
 
+    java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.csv
+
+In your terminal you'll see:
+
     /solr-4.10.2:$ java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.csv
     SimplePostTool version 1.5
     Posting files to base url http://localhost:8983/solr/update..
@@ -210,15 +210,17 @@ Using SimplePostTool and the included ex
     
 * Use [SolrJ](https://cwiki.apache.org/confluence/display/solr/Using+SolrJ) for Java or other Solr clients to programmatically create documents to send to Solr.
 
+* Use the Admin UI [Documents tab](http://localhost:8983/solr/#/collection1/documents) to paste in a document to be indexed, or select `Document Builder` from the `Document Type` dropdown to build a document one field at a time.  Click on the `Submit Document` button below the form to index your document.
+
 ***
 
 ## Updating Data
 
-You may notice that even if you index content in this guide more than once, it does not duplicate the results found. This is because the example `schema.xml` specifies a "`uniqueKey`" field called "id". Whenever you POST commands to Solr to add a document with the same value for the uniqueKey as an existing document, it automatically replaces it for you. You can see that that has happened by looking at the values for `numDocs` and `maxDoc` in the "CORE"/searcher section of the statistics page...
+You may notice that even if you index content in this guide more than once, it does not duplicate the results found. This is because the example `schema.xml` specifies a "`uniqueKey`" field called "`id`". Whenever you POST commands to Solr to add a document with the same value for the `uniqueKey` as an existing document, it automatically replaces it for you. You can see that this has happened by looking at the values for `numDocs` and `maxDoc` in the "CORE"/searcher section of the statistics page...
 
 <http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher>
 
-numDocs represents the number of searchable documents in the index (and will be larger than the number of XML, JSON, or CSV files since some files contained more than one document).  The maxDoc value may be larger as the maxDoc count includes logically deleted documents that have not yet been removed from the index. You can re-post the sample files over and over again as much as you want and numDocs will never increase, because the new documents will constantly be replacing the old.
+`numDocs` represents the number of searchable documents in the index (and will be larger than the number of XML, JSON, or CSV files since some files contained more than one document).  The `maxDoc` value may be larger, as the `maxDoc` count includes logically deleted documents that have not yet been removed from the index. You can re-post the sample files as often as you want and `numDocs` will never increase, because the new documents will constantly be replacing the old.
 
 Go ahead and edit any of the existing example data files, change some of the data, and re-run the SimplePostTool command.  You'll see your changes reflected in subsequent searches.
 
@@ -243,20 +245,110 @@ Execute the following command to delete 
 
 ## Searching
 
-Solr can be queried via REST calls using cURL, wget, Chrome POSTMAN REST client, etc., in addition to native clients available.
+Solr can be queried via REST clients (cURL, wget, Chrome POSTMAN, etc.), as well as via the native clients available for many programming languages.
 
-The Solr Admin UI includes a query builder interface see the `collection1` query screen at <http://localhost:8983/solr/#/collection1/query>.  If you click the `Execute Query` button without changing anything in the form, you'll get 10 random documents in JSON format:
+The Solr Admin UI includes a query builder interface - see the `collection1` query tab at <http://localhost:8983/solr/#/collection1/query>.  If you click the `Execute Query` button without changing anything in the form, you'll get 10 random documents in JSON format (`*:*` in the `q` param matches all documents):
 
-<img style="border:1px solid #ccc" src="/solr/assets/images/quickstart-query-screen.png" alt="Solr Quick Start: collection1 Query screen" class="float-right"/>
+<img style="border:1px solid #ccc" src="/solr/assets/images/quickstart-query-screen.png" alt="Solr Quick Start: collection1 Query tab" class="float-right"/>
 
-The URL sent by the Admin UI to Solr is shown in the top right of the above screenshot - if you click on it, your browser will show you the raw response.  To get the same response via cURL, just give the same URL in quotes on the `curl` command line:
+The URL sent by the Admin UI to Solr is shown in light grey near the top right of the above screenshot - if you click on it, your browser will show you the raw response.  To use cURL, just give the same URL in quotes on the `curl` command line:
 
     curl "http://localhost:8983/solr/collection1/select?q=*%3A*&wt=json&indent=true"
 
+In the above URL, the "`:`" in "`q=*:*`" has been URL-encoded as "`%3A`", but since "`:`" has no reserved purpose in the query component of the URL (after the "`?`"), you don't need to URL encode it.  So the following also works:
+
+    curl "http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true"
 
 ### Basics
 
-TODO: terms, phrases, boolean, learn more
+#### Search for a single term
+
+To search for a term, give it as the `q` param value - in the Admin UI [Query tab](http://localhost:8983/solr/#/collection1/query), replace `*:*` with the term you want to find.  To search for "foundation":
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=foundation"
+
+You'll see:
+
+    /solr-4.10.2$ curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=foundation"
+    {
+      "responseHeader":{
+        "status":0,
+        "QTime":0,
+        "params":{
+          "indent":"true",
+          "q":"foundation",
+          "wt":"json"}},
+      "response":{"numFound":2812,"start":0,"docs":[
+          {
+            "id":"0553293354",
+            "cat":["book"],
+            "name":"Foundation",
+    ...
+
+The response indicates that there are 2,812 hits (`"numFound":2812`), of which the first 10 were returned, since by default `start`=`0` and `rows`=`10`.  You can specify these params to page through results, where `start` is the position of the first result to return, and `rows` is the page size.
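As a rough illustration of the `start`/`rows` paging described above, here is a minimal sketch that turns a 1-based page number into a select URL. The `page_url` function name and the `BASE` variable are hypothetical helpers introduced only for this example; the URL itself is the `collection1` select endpoint used throughout this guide, and the query term is assumed to need no URL encoding.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr): build a select URL for one page.
# BASE is the collection1 select endpoint used elsewhere in this guide.
BASE="http://localhost:8983/solr/collection1/select"

# page_url QUERY PAGE [ROWS]
#   QUERY: an already URL-safe query term (assumption: no encoding needed)
#   PAGE:  1-based page number
#   ROWS:  page size, defaulting to Solr's default of 10
page_url() {
  query=$1
  page=$2
  rows=${3:-10}
  # start is the zero-based position of the first result on the page
  start=$(( (page - 1) * rows ))
  printf '%s?wt=json&indent=true&q=%s&start=%s&rows=%s\n' \
    "$BASE" "$query" "$start" "$rows"
}

# Third page of 10 results: start=20, rows=10
page_url foundation 3
```

With Solr running, the result could be passed straight to cURL, e.g. `curl "$(page_url foundation 3)"`.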
+
+To restrict fields returned in the response, use the `fl` param, which takes a comma-separated list of field names.  E.g. to only return the `id` field:
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=foundation&fl=id" 
+
+`q=foundation` matches nearly all of the docs we've indexed, since most of the files under `docs/` contain "The Apache Software Foundation".  To restrict search to a particular field, use the syntax "`q=field:value`", e.g. to search for `foundation` only in the `name` field:
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=name:foundation" 
+
+The above request returns only one document (`"numFound":1`) - from the response:
+
+    ...
+      "response":{"numFound":1,"start":0,"docs":[
+          {
+            "id":"0553293354",
+            "cat":["book"],
+            "name":"Foundation",
+    ...
+
+#### Phrase search
+
+To search for a multi-term phrase, enclose it in double quotes: `q="multiple terms here"`.  E.g. to search for "CAS latency" - note that the space between terms must be converted to "`+`" in a URL (the Admin UI will handle URL encoding for you automatically):
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=\"CAS+latency\"" 
+    
+You'll get back:
+
+    {
+      "responseHeader":{
+        "status":0,
+        "QTime":0,
+        "params":{
+          "indent":"true",
+          "q":"\"CAS latency\"",
+          "wt":"json"}},
+      "response":{"numFound":2,"start":0,"docs":[
+          {
+            "id":"VDBDB1A16",
+            "name":"A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM",
+            "manu":"A-DATA Technology Inc.",
+            "manu_id_s":"corsair",
+            "cat":["electronics", "memory"],
+            "features":["CAS latency 3,\t 2.7v"],
+    ...
+
+#### Combining searches
+
+By default, when you search for multiple terms and/or phrases in a single query, Solr will only require that one of them is present in order for a document to match.  Documents containing more terms will be sorted higher in the results list.
+ 
+You can require that a term or phrase is present by prefixing it with a "`+`"; conversely, to disallow the presence of a term or phrase, prefix it with a "`-`".
+
+To find documents that contain both terms "`one`" and "`three`", enter `+one +three` in the `q` param in the Admin UI [Query tab](http://localhost:8983/solr/#/collection1/query).  Because the "`+`" character has a reserved purpose in URLs (encoding the space character), you must URL encode it for `curl` as "`%2B`": 
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=%2Bone+%2Bthree"
+
+To search for documents that contain the term "`two`" but **don't** contain the term "`one`", enter `+two -one` in the `q` param in the Admin UI.  Again, URL encode "`+`" as "`%2B`":
+
+    curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=%2Btwo+-one"
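The hand encoding used in the `curl` examples above (space as "`+`", "`+`" as "`%2B`") can be sketched as a small shell function. This is only an illustration under stated assumptions: `urlencode` is a hypothetical helper name, it handles ASCII input only, and it is deliberately conservative in that it percent-encodes every character other than letters, digits, and `. ~ _ -`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Solr or cURL): encode a query string
# value for use after "q=" in a URL. ASCII input only.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                 # safe as-is
      ' ') out=$out+ ;;                              # space becomes "+"
      *) out=$out$(printf '%%%02X' "'$c") ;;         # e.g. "+" becomes %2B
    esac
  done
  printf '%s\n' "$out"
}

urlencode '+one +three'     # prints %2Bone+%2Bthree
urlencode '+two -one'       # prints %2Btwo+-one
urlencode '"CAS latency"'   # prints %22CAS+latency%22
```

With Solr running, the encoded value could be spliced into a request, e.g. `curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=$(urlencode '+two -one')"`. Note that the function encodes the double quotes of a phrase query as `%22`, whereas the phrase search example above passes them through with shell escaping.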
+
+#### In depth
+
+For more Solr search options, see the Solr Reference Guide's [Searching](https://cwiki.apache.org/confluence/display/solr/Searching) section.
+
 
 ### Faceting
 
@@ -278,20 +370,28 @@ If you've run the full set of commands i
 * Opened the admin console, used its query interface to get JSON formatted results
 * Opened the /browse interface to explore Solr's features in a more friendly and familiar interface
 
-Nice work!   The script (see below) to run all of these items took under two minutes! (your run time may vary, depending on your computers power and resources available)
+Nice work!   The script (see below) to run all of these items took under two minutes! (Your run time may vary, depending on your computer's power and resources available.)
 
 Here's a full Unix script for convenient copying and pasting in order to run all of the commands for this quick start guide:
 
     export CLASSPATH=dist/solr-core-4.10.2.jar
     date ;
     bin/solr start -e cloud -noprompt ; 
-       open http://localhost:8983/solr ;
-       java -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ ; 
-       open http://localhost:8983/solr/collection1/browse ;
-       java org.apache.solr.util.SimplePostTool example/exampledocs/*.xml ;
-       java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.json ;
-       java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.csv ;
-       java -Ddata=args org.apache.solr.util.SimplePostTool "<delete><id>SP2514N</id></delete>" ;
+      open http://localhost:8983/solr ;
+      java -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ ; 
+      open http://localhost:8983/solr/collection1/browse ;
+      java org.apache.solr.util.SimplePostTool example/exampledocs/*.xml ;
+      java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.json ;
+      java -Dauto org.apache.solr.util.SimplePostTool example/exampledocs/books.csv ;
+      open "http://localhost:8983/solr/#/collection1/plugins/core?entry=searcher" ;
+      java -Ddata=args org.apache.solr.util.SimplePostTool "<delete><id>SP2514N</id></delete>" ;
+      curl "http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=foundation" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=foundation&fl=id" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=name:foundation" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=\"CAS+latency\"" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=%2Bone+%2Bthree" ;
+      curl "http://localhost:8983/solr/collection1/select?wt=json&indent=true&q=%2Btwo+-one" ;
     date ;
 
 ## Where to next?