Posted to commits@lucene.apache.org by da...@apache.org on 2018/09/14 03:30:55 UTC

[38/43] lucene-solr:jira/http2: SOLR-12361: ref guide changes & CHANGES.txt organization

SOLR-12361: ref guide changes & CHANGES.txt organization


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/6e8c05f6
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/6e8c05f6
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/6e8c05f6

Branch: refs/heads/jira/http2
Commit: 6e8c05f6fe083544fb7f8fdd01df08ac54d7742e
Parents: 41e972e
Author: David Smiley <ds...@apache.org>
Authored: Wed Sep 12 17:34:28 2018 -0400
Committer: David Smiley <ds...@apache.org>
Committed: Wed Sep 12 17:34:28 2018 -0400

----------------------------------------------------------------------
 solr/CHANGES.txt                                | 45 ++++++------
 .../src/uploading-data-with-index-handlers.adoc | 75 +++++++++++++-------
 2 files changed, 74 insertions(+), 46 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/6e8c05f6/solr/CHANGES.txt
----------------------------------------------------------------------
diff --git a/solr/CHANGES.txt b/solr/CHANGES.txt
index b0da693..113ca13 100644
--- a/solr/CHANGES.txt
+++ b/solr/CHANGES.txt
@@ -131,9 +131,30 @@ New Features
 
 * SOLR-12474: Add an UpdateRequest Object that implements RequestWriter.ContentWriter (noble)
 
+* SOLR-12361: Allow nested child documents to be in field values of a SolrInputDocument as an alternative to
+  add/get ChildDocuments off to the side.  The latter is now referred to as "anonymous" child documents as opposed to
+  "labelled" (by the field name).  Anonymous child docs might be deprecated in the future.  This is an internal change
+  that should work for javabin/SolrJ; separate issues will address XML & JSON formats populating nested docs in this way.
+  AddUpdateCommand and its relationship with DirectUpdateHandler2 was reworked substantially. (Moshe Bla, David Smiley)
+
 * SOLR-12362: Uploading docs in JSON now supports child documents as field values, thus providing a label to the
-  relationship instead of the current "anonymous" relationship.  Use of this experimental feature requires
-  anonChildDocs=false parameter.  (Moshe Bla, David Smiley)
+  relationship instead of the current "anonymous" relationship.  Use of this experimental feature sometimes requires an
+  anonChildDocs=false parameter until Solr 8 due to syntax ambiguities.  (Moshe Bla, David Smiley)
+
+* SOLR-12485: Uploading docs in XML now supports child documents as field values, thus providing a label to the
+  relationship instead of the current "anonymous" relationship. (Moshe Bla, David Smiley)
+
+* SOLR-12441: (EXPERIMENTAL) New NestedUpdateProcessorFactory (URP) to populate special fields _nest_parent_ and
+  _nest_path_ of nested (child) documents. It will generate a uniqueKey of nested docs if they were blank too.
+  (Moshe Bla, David Smiley)
+
+* SOLR-12519: The [child] transformer now returns a nested child doc structure (attached as fields if provided this way)
+  provided the schema has the _nest_path_ field.  This is part of a broader enhancement of nested docs.
+  (Moshe Bla, David Smiley)
+
+* SOLR-12722: The [child] transformer now takes an 'fl' param to specify which fields to return.  It will evaluate
+  doc transformers if present.  In 7.5 a missing 'fl' defaults to the current behavior of all fields, but in 8.0
+  defaults to the top/request "fl". (Moshe Bla, David Smiley)
 
 * SOLR-11578: Solr 7 Admin UI (Cloud > Graph) should reflect the Replica type to give a more accurate representation
   of the cluster. (Rhoit Singh via Erick Erickson)
@@ -155,9 +176,6 @@ New Features
 
 * SOLR-12495: An #EQUAL function for replica in autoscaling policy to equally distribute replicas (noble)
 
-* SOLR-12441: New NestedUpdateProcessorFactory (URP) to populate special fields _nest_parent_ and _nest_path_ of nested
-  (child) documents.  It will generate a uniqueKey of nested docs if they were blank too. (Moshe Bla, David Smiley)
-
 * SOLR-11986: Allow percentage in freedisk attribute in autoscaling policy rules (noble)
 
 * SOLR-12522: Support a runtime function `#ALL` for 'replica' in autoscaling policies (noble)
@@ -185,16 +203,9 @@ New Features
 
 * SOLR-12592: support #EQUAL function, range operator, decimal and percentage in cores in autoscaling policies (noble)
 
-* SOLR-12485: Uploading docs in XML now supports child documents as field values, thus providing a label to the
-  relationship instead of the current "anonymous" relationship. (Moshe Bla, David Smiley)
-
 * SOLR-12655: Add Korean morphological analyzer ("nori") to default distribution. This also adds examples
   for configuration in Solr's schema.  (Uwe Schindler)
 
-* SOLR-12519: The [child] transformer now returns a nested child doc structure (attached as fields if provided this way)
-  provided the schema is enabled for nested documents.  This is part of a broader enhancement of nested docs.
-  (Moshe Bla, David Smiley)
-
 * SOLR-11863: Add knnRegress Stream Evaluator to support nearest neighbor regression (Joel Bernstein)
 
 * SOLR-12702: Add zscores Stream Evaluator (Joel Bernstein)
@@ -219,10 +230,6 @@ New Features
 * SOLR-12716: NodeLostTrigger should support deleting replicas from lost nodes by setting preferredOperation=deletenode.
   (shalin)
 
-* SOLR-12722: The [child] transformer now takes an 'fl' param to specify which fields to return.  It will evaluate
-  doc transformers if present.  In 7.5 a missing 'fl' defaults to the current behavior of all fields, but in 8.0
-  defaults to the top/request "fl". (Moshe Bla, David Smiley)
-
 * SOLR-9418: Added a new (experimental) PhrasesIdentificationComponent for identifying potential phrases
   in query input based on overlapping shingles in the index. (Akash Mehta, Trey Grainger, hossman)
 
@@ -381,12 +388,6 @@ Optimizations
 Other Changes
 ----------------------
 
-* SOLR-12361: Allow nested child documents to be in field values of a SolrInputDocument as an alternative to
-  add/get ChildDocuments off to the side.  The latter is now referred to as "anonymous" child documents as opposed to
-  "labelled" (by the field name).  Anonymous child docs might be deprecated in the future.
-  This is an internal change not yet plumbed into /update formats.
-  AddUpdateCommand and it's relationship with DirectUpdateHandler2 was reworked substantially. (Moshe Bla, David Smiley)
-
 * SOLR-12208: Renamed the autoscaling variable 'INDEX.sizeInBytes' to 'INDEX.sizeInGB' (noble)
 
 * SOLR-12523: Improve error reporting and docs regarding Collection backup feature shared-fs requirement (janhoy)

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/6e8c05f6/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc b/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc
index 0f523d8..93ffdc2 100644
--- a/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc
+++ b/solr/solr-ref-guide/src/uploading-data-with-index-handlers.adoc
@@ -541,27 +541,57 @@ The `/update/csv` path may be useful for clients sending in CSV formatted update
 
 == Nested Child Documents
 
-Solr indexes nested documents in blocks as a way to model documents containing other documents, such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents. At query time, the <<other-parsers.adoc#block-join-query-parsers,Block Join Query Parsers>> can search these relationships. In terms of performance, indexing the relationships between documents may be more efficient than attempting to do joins only at query time, since the relationships are already stored in the index and do not need to be computed.
+Solr supports indexing nested documents such as a blog post parent document and comments as child documents -- or products as parent documents and sizes, colors, or other variations as child documents.
+The parent with all its children is referred to as a "block", which explains some of the nomenclature of related features.
+At query time, the <<other-parsers.adoc#block-join-query-parsers,Block Join Query Parsers>> can search these relationships,
+ and the `[child]` <<transforming-result-documents.adoc#transforming-result-documents,Document Transformer>> can attach child documents to the result documents.
+In terms of performance, indexing the relationships between documents usually yields much faster queries than an equivalent "query time join",
+ since the relationships are already stored in the index and do not need to be computed.
+However, nested documents are less flexible than query time joins because they impose rules that some applications may not be able to accept.
 
-Nested documents may be indexed via either the XML or JSON data syntax (or using <<using-solrj.adoc#using-solrj,SolrJ)>> - but regardless of syntax, you must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <<other-parsers.adoc#block-join-query-parsers,block join query parsers>>.
+.Note
+[NOTE]
+====
+A big limitation is that the whole block of parent-children documents must be updated or deleted together, not separately.
+In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together.
+_Solr does not enforce this rule_; if it's violated, you may get sporadic query failures or incorrect results.
+====
+
+Nested documents may be indexed via either the XML or JSON data syntax, and are also supported by <<using-solrj.adoc#using-solrj,SolrJ>> with javabin.
 
-To support nested documents, the schema must include an indexed/non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth.
+=== Schema Notes
+
+ * The schema must include an indexed, non-stored field `\_root_`. The value of that field is populated automatically and is the same for all documents in the block, regardless of the inheritance depth.
+ * Nested documents are very much documents in their own right even if certain nested documents hold different information from the parent.
+   Therefore:
+ ** the schema must be able to represent the fields of any document
+ ** it may be infeasible to use `required`
+ ** even child documents need a unique `id`
+ * You must include a field that identifies the parent document as a parent; it can be any field that suits this purpose, and it will be used as input for the <<other-parsers.adoc#block-join-query-parsers,block join query parsers>>.
+ * If you associate a child document as a field (e.g. comment), that field need not be defined in the schema, and probably
+   shouldn't be as it would be confusing.  There is no child document field type.
 
 === XML Examples
 
-For example, here are two documents and their child documents:
+For example, here are two documents and their child documents.
+The example illustrates two styles of adding child documents; the first is associated via a field "comment" (preferred),
+and the second is done in the classic way now referred to as an "anonymous" or "unlabelled" child document.
+This field label relationship is available to the URP chain in Solr but is ultimately discarded.
+Solr 8 will save the relationship.
 
 [source,xml]
 ----
 <add>
   <doc>
-  <field name="id">1</field>
-  <field name="title">Solr adds block join support</field>
-  <field name="content_type">parentDocument</field>
-    <doc>
-      <field name="id">2</field>
-      <field name="comments">SolrCloud supports it too!</field>
-    </doc>
+    <field name="id">1</field>
+    <field name="title">Solr adds block join support</field>
+    <field name="content_type">parentDocument</field>
+    <field name="comment">
+      <doc>
+        <field name="id">2</field>
+        <field name="comments">SolrCloud supports it too!</field>
+      </doc>
+    </field>
   </doc>
   <doc>
     <field name="id">3</field>
@@ -575,11 +605,15 @@ For example, here are two documents and their child documents:
 </add>
 ----
 
-In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument". We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach.
+In this example, we have indexed the parent documents with the field `content_type`, which has the value "parentDocument".
+We could have also used a boolean field, such as `isParent`, with a value of "true", or any other similar approach.
 
 === JSON Examples
 
-This example is equivalent to the XML example above, note the special `\_childDocuments_` key need to indicate the nested documents in JSON.
+This example is equivalent to the XML example above.
+Again, the field labelled relationship is preferred.
+The labelled relationship here is one child document but could have been wrapped in array brackets.
+For the anonymous relationship, note the special `\_childDocuments_` key whose contents must be an array of child documents.
 
 [source,json]
 ----
@@ -588,12 +622,10 @@ This example is equivalent to the XML example above, note the special `\_childDo
     "id": "1",
     "title": "Solr adds block join support",
     "content_type": "parentDocument",
-    "_childDocuments_": [
-      {
-        "id": "2",
-        "comments": "SolrCloud supports it too!"
-      }
-    ]
+    "comment": {
+      "id": "2",
+      "comments": "SolrCloud supports it too!"
+    }
   },
   {
     "id": "3",
@@ -609,8 +641,3 @@ This example is equivalent to the XML example above, note the special `\_childDo
 ]
 ----
 
-.Note
-[NOTE]
-====
-One limitation of indexing nested documents is that the whole block of parent-children documents must be updated together whenever any changes are required. In other words, even if a single child document or the parent document is changed, the whole block of parent-child documents must be indexed together.
-====
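
The schema notes in the ref guide change say that even child documents need a unique `id`, and the JSON example shows both the labelled and the anonymous style. As a rough illustration only (not part of this commit), the following Python sketch builds such a payload and checks the id rule before anything is sent. The helper names `iter_block` and `collect_unique_ids` are hypothetical, and the field values of the second block are placeholders because the diff elides them. The sketch assumes every dict found in a field value is a child document; a real client would need to distinguish other dict-valued JSON syntax.

```python
import json


def iter_block(doc):
    """Yield a document followed by every child document nested inside it.

    Assumption: any dict found as a field value (or inside a list value) is a
    child document. Real Solr JSON also uses dict values for other purposes
    (for example atomic-update operations), which this sketch ignores.
    """
    yield doc
    for value in doc.values():
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if isinstance(candidate, dict):
                yield from iter_block(candidate)


def collect_unique_ids(docs):
    """Return the set of ids in the given blocks; raise on a missing or repeated id."""
    seen = set()
    for top_level in docs:
        for doc in iter_block(top_level):
            doc_id = doc.get("id")
            if doc_id is None:
                raise ValueError(f"document without an id: {doc!r}")
            if doc_id in seen:
                raise ValueError(f"duplicate id: {doc_id!r}")
            seen.add(doc_id)
    return seen


docs = [
    {
        # Labelled child: attached through the "comment" field, as in the diff.
        "id": "1",
        "title": "Solr adds block join support",
        "content_type": "parentDocument",
        "comment": {"id": "2", "comments": "SolrCloud supports it too!"},
    },
    {
        # Anonymous child: attached through the special _childDocuments_ key.
        # All values except the parent id are placeholders, not from the commit.
        "id": "3",
        "title": "placeholder title",
        "content_type": "parentDocument",
        "_childDocuments_": [{"id": "4", "comments": "placeholder comment"}],
    },
]

ids = collect_unique_ids(docs)
payload = json.dumps(docs, indent=2)  # request body for a JSON update
```

Because a whole block must be indexed together, a check like this would run over the complete parent-plus-children structure each time any document in the block changes.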