Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/07/29 18:23:03 UTC

[GitHub] [solr] ctargett commented on a change in pull request #239: SOLR-15567: Document Schema Designer screen in ref guide.

ctargett commented on a change in pull request #239:
URL: https://github.com/apache/solr/pull/239#discussion_r679380240



##########
File path: solr/solr-ref-guide/src/schema-designer.adoc
##########
@@ -0,0 +1,229 @@
+= Schema Designer
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The Schema Designer screen lets you interactively design a new schema using sample data.
+
+.Schema Designer screen
+image::images/solr-admin-ui/schema-designer.png[image]
+
+There are a number of panels on the Schema Designer screen to provide immediate feedback when you make changes to the schema, including:
+
+* Upload / paste sample documents to find fields and guess the correct field type and indexing strategy
+* Schema Editor tree to edit Fields, Dynamic Fields, Field Types, and supporting Files
+* Text Analysis panel to show the text analysis pipeline for sample text based on the selected field
+* Query Tester panel to see how schema changes impact query matching, sorting, faceting, and hit highlighting
+* Show Changes dialog to view a report of all changes made by the designer before publishing
+
+The Schema Designer allows you to edit an existing schema; however, its main purpose is to help you safely design a new schema from sample data.
+You can safely experiment with changes and see the impact on query results immediately.
+Once data is indexed using a published schema, there are severe restrictions on the type of changes you can make to the schema without needing a full re-index.
+When designing a new schema, the Schema Designer re-indexes your sample data automatically when you make changes. However, the designer does not re-index data in collections using a published schema.
+
+.Security Requirements
+[NOTE]
+====
+If the <<rule-based-authorization-plugin.adoc#,Rule-based Authorization Plugin>> is enabled for your Solr installation, then users need to have the `config-edit` and `config-read` permissions to use the Schema Designer.
+====
+
+== Getting Started
+
+Upon entering the Schema Designer for the first time, you'll be prompted to create a New Schema.
+
+image::images/schema-designer/new-schema.png[image,width=600]
+
+Choose a short name that reflects the intended use case for your new schema. You'll need to choose a source schema to copy as the starting point for your new schema.
+Solr includes a `_default` schema which provides a good starting place for building a custom schema for your search application.
+Once a schema is published, it can be used to create new schemas and will be listed in the *Copy from* drop-down list in the dialog.
+
+Once you create the new schema, the next step is to upload or paste a sample of the data you intend to index into Solr.
+The Schema Designer supports JSON, CSV, TSV, XML, and JSON lines (jsonl).
+
+image::images/schema-designer/analyze-sample-docs.png[image,width=500]
+
+The advantage of pasting sample documents into the text area is that you can edit the sample and see the impact of your changes immediately in the analyzed schema.
+The upload feature is useful if you have large or many sample documents; the Schema Designer API allows up to 1,000 sample documents or a max of 5MB upload, but in most cases you only need a handful of documents to get started.
+
+Click on the *Analyze Documents* button to submit the sample documents to the Schema Designer API to generate your new schema.
+
+=== Temporary Configset and Collection
+
+Behind the scenes, the Schema Designer API creates a temporary <<config-sets.adoc#,Configset>> (schema + solrconfig.xml + supporting files) in ZooKeeper.
+In addition, the Schema Designer API creates a temporary collection with a single shard and replica to hold sample documents.
+These temporary resources are persisted to disk and exist until the schema is published or manually deleted using the Schema Designer API cleanup endpoint (`/api/schema-designer/cleanup`).
+
+If you close your browser screen while designing a new schema, it will be available when you return.
+Simply choose the name of the schema you created previously in the select box and your schema will load into the designer UI.
+
+image::images/schema-designer/reload-schema.png[image,width=400]
+
+Previously uploaded sample documents are indexed in the temporary collection even though they do not display in the text area.
+
+[TIP]
+====
+Click on the *Edit Documents* button on the *Query Results* panel to load a JSON representation of indexed documents into the text area.
+====
+
+=== Iteratively Post Sample Documents
+
+If you have sample documents spread across multiple files, you can POST them to the Schema Designer API and then load your schema in the Designer UI to design your schema.
+Here's an example of how to use the API to "prep" a new schema and then iteratively post Solr's techproducts example files to the Schema Designer:
+
+[source,bash]
+----
+#!/bin/bash
+
+SOLR_INSTALL_DIR="path/to/solr/install"
+
+DIR_WITH_SAMPLE_FILES="$SOLR_INSTALL_DIR/example/exampledocs"
+
+SOLR_URL=http://localhost:8983
+
+MY_NEW_SCHEMA="myNewSchema"
+
+echo "Preparing new schema: ${MY_NEW_SCHEMA}"
+curl -s -o /dev/null -w "%{http_code}" -XPOST \
+  "$SOLR_URL/api/schema-designer/prep?configSet=${MY_NEW_SCHEMA}&copyFrom=_default"
+echo ""
+
+# Expand the globs directly instead of parsing `ls` output, so file names
+# containing spaces survive and unmatched patterns expand to nothing.
+shopt -s nullglob
+SAMPLE_FILES=( "${DIR_WITH_SAMPLE_FILES}"/*.{xml,csv,json,jsonl} )
+for f in "${SAMPLE_FILES[@]}"
+do
+  echo "POST'ing contents of $f to Schema Designer analyze endpoint ..."
+  # --data-binary sends the file as-is; -d would strip newlines, which
+  # corrupts line-oriented formats such as CSV and JSON lines.
+  curl -s -o /dev/null -w "%{http_code}" -XPOST \
+    "$SOLR_URL/api/schema-designer/analyze?configSet=${MY_NEW_SCHEMA}" --data-binary @"$f"
+  echo ""
+done
+----
+
+After sending the sample documents to the Schema Designer `/analyze` endpoint, you can open the schema in the UI in your browser.
+
+== Schema Editor
+
+After analyzing your sample documents, the Schema Designer loads the schema in the *Schema Editor* in the middle panel.
+The editor renders the schema as a tree component composed of Fields, Dynamic Fields, Field Types, and Files.
+For more information about schema objects, see <<fields-and-schema-design.adoc#,Fields and Schema Design>>
+
+image::images/schema-designer/schema-editor-root.png[image,width=700]
+
+.Schema vs. Configset
+[NOTE]
+====
+A Configset includes a schema, so technically the Schema Designer works with a Configset behind the scenes.
+However, Configset is more of a technical implementation detail and your primary focus when designing a new search application should be on the fields and their types.
+Consequently, the Schema Designer focuses primarily on the schema aspects of a Configset vs. exposing complexities of a Configset in the UI.
+====
+
+When you click on the root node of the Schema Editor tree, you can refine top-level schema properties, including:
+
+* Languages: The `_default` schema includes text fields for a number of common languages. You can include all text analyzers in your schema or select a subset based on the languages your search application needs to support. The designer will remove all the unnecessary field types for languages you don't need. For more information about text analysis and languages, see: <<language-analysis.adoc#,Language Analysis>>
+* Dynamic fields allow Solr to index fields that you did not explicitly define in your schema. Dynamic fields can make your application less brittle by providing some flexibility in the documents you can add to Solr. It is recommended to keep the default set of dynamic fields enabled for your schema. Unchecking this option removes all dynamic fields from your schema. For more information about dynamic fields, see: <<dynamic-fields.adoc#,Dynamic Fields>>
+* Field guessing (aka "schemaless mode") allows Solr to detect the "best" field type for unknown fields encountered during indexing. Field guessing also performs some field transformations, such as removing spaces from field names. If you use the schema designer to create your schema based on sample documents, you may not need to enable this feature. However, with this feature disabled, you need to make sure the incoming data matches the schema exactly or indexing errors may occur. For more information about schemaless mode, see: <<schemaless-mode.adoc#,Schemaless Mode>>
+* Enabling this feature adds the _root_ and _nest_path_ fields to your schema. For more information about indexing nested child documents, see: <<indexing-nested-documents.adoc#,Indexing Nested Documents>>

Review comment:
       The `_root_` and `_nest_path_` fields here are getting rendered in HTML in italics. I think if you add backticks to make those monospace you won't need to do any escaping of the underscores.
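Other spots with the same rendering problem could be found with a quick check. This is a minimal sketch, not part of the PR: the helper name `find_unquoted_underscore_names` is hypothetical, and the pattern is only a heuristic that also flags intentional single-word italics.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): list lines where an
# underscore-wrapped name such as _root_ is not enclosed in backticks,
# which AsciiDoc would otherwise render in italics.
# Reads the files given as arguments, or stdin when none are given.
find_unquoted_underscore_names() {
  grep -nE '(^|[^`[:alnum:]_])_[A-Za-z][A-Za-z_]*_([^`[:alnum:]_]|$)' "$@"
}

# Example on two sample lines; only the first one lacks backticks.
printf '%s\n' \
  'adds the _root_ and _nest_path_ fields' \
  'adds the `_root_` and `_nest_path_` fields' \
  | find_unquoted_underscore_names
```

Against the real file it would be run as `find_unquoted_underscore_names solr/solr-ref-guide/src/schema-designer.adoc`.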

##########
File path: solr/solr-ref-guide/src/solr-admin-ui.adoc
##########
@@ -85,6 +85,19 @@ This server resides at https://issues.apache.org/jira/browse/SOLR.
 
 These links cannot be modified without editing the `index.html` in the `server/solr/solr-webapp` directory that contains the Admin UI files.
 
+== Schema Designer
+
+The Schema Designer screen provides an interactive process to create a schema using sample data.

Review comment:
       Should there be a link in here to the main docs page about the schema designer?

##########
File path: solr/solr-ref-guide/src/schema-designer.adoc
##########
@@ -0,0 +1,229 @@
[...]
+Click on the *Analyze Documents* button to submit the sample documents to the Schema Designer API to generate your new schema.

Review comment:
       You could change this to use Asciidoctor's button macro if you want (syntax: `btn:[Analyze Documents]`). Right now it only adds some CSS and HTML elements that would output: **[ Analyze Documents ]**, which is kind of bland today but someday I'd like to jazz that up a little more. But no one's done this documentation consistently yet, so that's just an idea, and the way you have it is also totally fine.

##########
File path: solr/solr-ref-guide/src/schema-designer.adoc
##########
@@ -0,0 +1,229 @@
[...]
+For more information about schema objects, see <<fields-and-schema-design.adoc#,Fields and Schema Design>>

Review comment:
       Need a period to end the sentence.

##########
File path: solr/solr-ref-guide/src/schema-designer.adoc
##########
@@ -0,0 +1,229 @@
[...]
+If you have sample documents spread across multiple files, you can POST them to the Schema Designer API and then load your schema in the Designer UI to design your schema.

Review comment:
       This API is mentioned a couple of times, most prominently here, but it doesn't seem to be documented in full anywhere?
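Until the API is documented in full, the calls the patch does name can at least be summarized. This is a minimal sketch, assuming the `prep` and `analyze` paths and the `configSet` and `copyFrom` parameters exactly as they appear in the script quoted in this PR. The helper name `schema_designer_url` is hypothetical, and the default `SOLR_URL` is taken from the same script. It only builds URLs and sends nothing.

```shell
#!/bin/bash
# Sketch only: builds Schema Designer API URLs from the endpoint paths
# quoted in this patch. The helper name is an assumption for illustration.
SOLR_URL="${SOLR_URL:-http://localhost:8983}"

# Usage: schema_designer_url ENDPOINT CONFIGSET [EXTRA_QUERY]
schema_designer_url() {
  local endpoint="$1" config_set="$2" extra="${3:-}"
  local url="${SOLR_URL}/api/schema-designer/${endpoint}?configSet=${config_set}"
  # Append any additional query string, e.g. copyFrom=_default for prep.
  if [ -n "$extra" ]; then
    url="${url}&${extra}"
  fi
  printf '%s\n' "$url"
}

# The two calls used by the script in the patch.
schema_designer_url prep myNewSchema "copyFrom=_default"
schema_designer_url analyze myNewSchema
```

The patch also mentions `/api/schema-designer/cleanup`, but it shows neither the HTTP method nor the parameters for that endpoint, so it is left out here.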






