You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/01/18 12:27:41 UTC

[GitHub] [lucene-solr] epugh opened a new pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

epugh opened a new pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215


   <!--
   _(If you are a project committer then you may remove some/all of the following template.)_
   
   Before creating a pull request, please file an issue in the ASF Jira system for Lucene or Solr:
   
   * https://issues.apache.org/jira/projects/LUCENE
   * https://issues.apache.org/jira/projects/SOLR
   
   You will need to create an account in Jira in order to create an issue.
   
   The title of the PR should reference the Jira issue number in the form:
   
   * LUCENE-####: <short description of problem or changes>
   * SOLR-####: <short description of problem or changes>
   
   LUCENE and SOLR must be fully capitalized. A short description helps people scanning pull requests for items they can work on.
   
   Properly referencing the issue in the title ensures that Jira is correctly updated with code review comments and commits. -->
   
   
   # Description
   
   This PR supersecedes the work done in #2016, as it doesn't drag all the commits made to master.   I followed the steps recommend by Joel Bernstein in another PR to clean up the commit history in creating this PR.
   
   To improve our security posture, this moves the ScriptingUpdateProcessor to a new contrib module that isn't installed in Solr by default.   This is also a chance to clean up the name of the processor from the old slightly awkward name "StatelessScriptingUpdateProcessor" to a simpler name.
   
   # Solution
   
   * Created a new `/contrib/scripting` module, and move the code and tests related to scripting under it.   
   * Updated all the references to `StatelessScriptingUpdateProcessor` to `ScriptingUpdateProcessor` in code and ref guide.
   
   
   
   # Tests
   
   Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [ X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [ X] I have created a Jira issue and added the issue ID to my pull request title.
   - [ X] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [ X] I have developed this patch against the `master` branch.
   - [ X] I have run `./gradlew check`.
   - [ X] I have added tests for my changes.
   - [ X] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559696636



##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor

Review comment:
       Changes made!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559588597



##########
File path: solr/contrib/scripting/README.md
##########
@@ -0,0 +1,14 @@
+Welcome to Apache Solr Scripting!
+===============================
+
+# Introduction
+
+The Scripting contrib module pulls together various scripting related functions.  
+
+Today, the ScriptUpdateProcessorFactory allows Java scripting engines to be used during the Solr document update processing, allowing dramatic flexibility in expressing custom document processing before being indexed.  It also allows hooks to commit, delete, etc, but add is the most common usage.  It is implemented as an UpdateProcessor to be placed in an UpdateChain.

Review comment:
       Lets list a few popular options here -- I'm thinking JavaScript, Ruby, Python, Groovy

##########
File path: solr/contrib/scripting/src/java/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.java
##########
@@ -58,34 +60,34 @@
 
 /**
  * <p>
- * An update request processor factory that enables the use of update 
- * processors implemented as scripts which can be loaded by the 
- * {@link SolrResourceLoader} (usually via the <code>conf</code> dir for 
- * the SolrCore).
+ * An update request processor factory that enables the use of update
+ * processors implemented as scripts which can be loaded by the
+ * {@link SolrResourceLoader} (usually via the <code>conf</code> dir for
+ * the SolrCore).  Previously known as the StatelessScriptUpdateProcessor.

Review comment:
       ```suggestion
    * processors implemented as scripts which can be loaded from the
    * configSet.  Previously known as the StatelessScriptUpdateProcessor.
   ```

##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor

Review comment:
       ```suggestion
   = Script Update Processor
   ```
   
   And Can we rename this file to remove the "ing"?
   The PR shows this file as new; did you just write all this?

##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The {solr-javadocs}/contrib/scripting/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.html[ScriptUpdateProcessor] allows Java scripting engines to be used
+during Solr document update processing, allowing dramatic flexibility in
+expressing custom document processing logic before being indexed.  It has hooks to the
+commit, delete, rollback, etc indexing actions, however add is the most common usage.
+It is implemented as an UpdateProcessor to be placed in an UpdateChain.
+
+TIP: This used to be known as the _StatelessScriptingUpdateProcessor_ and was renamed to clarify the key aspect of this update processor is it enables scripting.
+
+The script can be written in any scripting language supported by your JVM (such
+as JavaScript), and executed dynamically so no pre-compilation is necessary.
+
+WARNING: Being able to run a script of your choice as part of the indexing pipeline is a really powerful tool, that I sometimes call the
+_Get out of jail free_ card because you can solve some problems this way that you can't in any other way.  However, you are introducing some
+potential security vulnerabilities.
+
+== Installing the ScriptingUpdateProcessor and Scripting Engines
+
+The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup.
+
+Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine.   Other supported scripting engines like
+JRuby, Jython, Groovy, all require you to add JAR files.
+
+
+You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`:
+
+[source,xml]
+----
+<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-scripting-\d.*\.jar" />
+----
+
+Likewise you will need to add some JAR files depending on which scripting engines you choose.
+
+
+== Configuration
+
+[source,xml]
+----
+<updateRequestProcessorChain name="script">
+   <processor class="org.apache.solr.scripting.update.ScriptUpdateProcessorFactory">
+     <str name="script">update-script.js</str>
+   </processor>
+   <!--  optional parameters passed to script
+     <lst name="params">
+       <str name="config_param">example config parameter</str>
+     </lst>
+   -->
+   <processor class="solr.LogUpdateProcessorFactory" />
+   <processor class="solr.RunUpdateProcessorFactory" />
+ </updateRequestProcessorChain>
+----
+
+NOTE: The processor supports the defaults/appends/invariants concept for its config.
+However, it is also possible to skip this level and configure the parameters directly underneath the `<processor>` tag.
+
+Below follows a list of each configuration parameters and their meaning:
+
+`script`::
+The script file name. The script file must be placed in the `conf/ directory.
+There can be one or more "script" parameters specified; multiple scripts are executed in the order specified.
+
+`engine`::
+Optionally specifies the scripting engine to use. This is only needed if the extension
+of the script file is not a standard mapping to the scripting engine. For example, if your
+script file was coded in JavaScript but the file name was called `update-script.foo`,
+use "javascript" as the engine name.
+
+`params`::
+Optional parameters that are passed into the script execution context. This is
+specified as a named list (`<lst>`) structure with nested typed parameters. If
+specified, the script context will get a "params" object, otherwise there will be no "params" object available.
+
+
+== Script execution context
+
+Every script has some variables provided to it.
+
+`logger`::
+Logger (org.slf4j.Logger) instance. This is useful for logging information from the script.
+
+`req`::
+{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryRequest] instance.
+
+`rsp`::
+{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryResponse] instance.
+
+`params`::
+The "params" object, if any specified, from the configuration.
+
+== Examples
+
+The `processAdd()` and the other script methods can return false to skip further
+processing of the document. All methods must be defined, though generally the
+`processAdd()` method is where the action is.
+
+Here's a URL that works with the techproducts example setup demonstrating specifying
+the "script" update chain: `http://localhost:8983/solr/techproducts/update?commit=true&stream.contentType=text/csv&fieldnames=id,description&stream.body=1,foo&update.chain=script`
+which logs the following:
+
+[source,text]
+----
+INFO: update-script#processAdd: id=1
+----
+
+You can see the message recorded in the Solr logging UI.
+
+=== Javascript
+
+Note: There is a JavaScript example `update-script.js` as part of the `techproducts` configset.
+Check `solrconfig.xml` and uncomment the update request processor definition to enable this feature.
+
+[source,javascript]
+----
+function processAdd(cmd) {
+
+  doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument
+  id = doc.getFieldValue("id");
+  logger.info("update-script#processAdd: id=" + id);
+
+// Set a field value:
+//  doc.setField("foo_s", "whatever");
+
+// Get a configuration parameter:
+//  config_param = params.get('config_param');  // "params" only exists if processor configured with <lst name="params">
+
+// Get a request parameter:
+// some_param = req.getParams().get("some_param")
+
+// Add a field of field names that match a pattern:
+//   - Potentially useful to determine the fields/attributes represented in a result set, via faceting on field_name_ss
+//  field_names = doc.getFieldNames().toArray();
+//  for(i=0; i < field_names.length; i++) {
+//    field_name = field_names[i];
+//    if (/attr_.*/.test(field_name)) { doc.addField("attribute_ss", field_names[i]); }
+//  }
+
+}
+
+function processDelete(cmd) {
+  // no-op
+}
+
+function processMergeIndexes(cmd) {
+  // no-op
+}
+
+function processCommit(cmd) {
+  // no-op
+}
+
+function processRollback(cmd) {
+  // no-op
+}
+
+function finish() {
+  // no-op
+}
+----
+
+=== JRuby
+
+To use JRuby as the scripting engine, add `jruby.jar` to Solr's resource loader.

Review comment:
       I think the term "resource loader" ought to be internal, and we should avoid using in documentation when we can be clearer.  We can just say add to a lib directory?  We could also link to the existing ref guide page on installing libs.

##########
File path: solr/contrib/scripting/src/java/org/apache/solr/scripting/update/package-info.java
##########
@@ -0,0 +1,21 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ *  Support for scripting during document updates.

Review comment:
       ```suggestion
    * Support for scripting during document updates.
   ```

##########
File path: solr/CHANGES.txt
##########
@@ -186,6 +186,9 @@ Other Changes
 
 * SOLR-14034: Remove deprecated min_rf references (Tim Dillon)
 
+* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead
+ of shipping as part of Solr due to security concerns.  Renamed to ScriptingUpdateProcessor. (Eric Pugh)

Review comment:
       ```suggestion
    of shipping as part of Solr due to security concerns.  Renamed to ScriptUpdateProcessorFactory. (Eric Pugh)
   ```

##########
File path: solr/server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
##########
@@ -674,12 +679,12 @@
          *** WARNING ***
          Before enabling remote streaming, you should make sure your
          system has authentication enabled.
-
-    <requestParsers enableRemoteStreaming="false"
+       -->
+    <requestParsers enableRemoteStreaming="true"

Review comment:
       Why?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559661956



##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor

Review comment:
       Will update the name and the file.   I added this file to the Ref Guide, however much (most?) of the content was sourced from the old Solr cwiki.   Part of my goal in this work is to raise the profile of this powerful feature, so I wanted the great content to be visible.  I did manually test all of this stuff (jython, groovy etc) when I first started working on it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559696904



##########
File path: solr/solr-ref-guide/src/scripting-update-processor.adoc
##########
@@ -0,0 +1,295 @@
+= Scripting Update Processor
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+The {solr-javadocs}/contrib/scripting/org/apache/solr/scripting/update/ScriptUpdateProcessorFactory.html[ScriptUpdateProcessor] allows Java scripting engines to be used
+during Solr document update processing, allowing dramatic flexibility in
+expressing custom document processing logic before being indexed.  It has hooks to the
+commit, delete, rollback, etc indexing actions, however add is the most common usage.
+It is implemented as an UpdateProcessor to be placed in an UpdateChain.
+
+TIP: This used to be known as the _StatelessScriptingUpdateProcessor_ and was renamed to clarify the key aspect of this update processor is it enables scripting.
+
+The script can be written in any scripting language supported by your JVM (such
+as JavaScript), and executed dynamically so no pre-compilation is necessary.
+
+WARNING: Being able to run a script of your choice as part of the indexing pipeline is a really powerful tool, that I sometimes call the
+_Get out of jail free_ card because you can solve some problems this way that you can't in any other way.  However, you are introducing some
+potential security vulnerabilities.
+
+== Installing the ScriptingUpdateProcessor and Scripting Engines
+
+The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup.
+
+Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine.   Other supported scripting engines like
+JRuby, Jython, Groovy, all require you to add JAR files.
+
+
+You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`:
+
+[source,xml]
+----
+<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-scripting-\d.*\.jar" />
+----
+
+Likewise you will need to add some JAR files depending on which scripting engines you choose.
+
+
+== Configuration
+
+[source,xml]
+----
+<updateRequestProcessorChain name="script">
+   <processor class="org.apache.solr.scripting.update.ScriptUpdateProcessorFactory">
+     <str name="script">update-script.js</str>
+   </processor>
+   <!--  optional parameters passed to script
+     <lst name="params">
+       <str name="config_param">example config parameter</str>
+     </lst>
+   -->
+   <processor class="solr.LogUpdateProcessorFactory" />
+   <processor class="solr.RunUpdateProcessorFactory" />
+ </updateRequestProcessorChain>
+----
+
+NOTE: The processor supports the defaults/appends/invariants concept for its config.
+However, it is also possible to skip this level and configure the parameters directly underneath the `<processor>` tag.
+
+Below follows a list of each configuration parameters and their meaning:
+
+`script`::
+The script file name. The script file must be placed in the `conf/ directory.
+There can be one or more "script" parameters specified; multiple scripts are executed in the order specified.
+
+`engine`::
+Optionally specifies the scripting engine to use. This is only needed if the extension
+of the script file is not a standard mapping to the scripting engine. For example, if your
+script file was coded in JavaScript but the file name was called `update-script.foo`,
+use "javascript" as the engine name.
+
+`params`::
+Optional parameters that are passed into the script execution context. This is
+specified as a named list (`<lst>`) structure with nested typed parameters. If
+specified, the script context will get a "params" object, otherwise there will be no "params" object available.
+
+
+== Script execution context
+
+Every script has some variables provided to it.
+
+`logger`::
+Logger (org.slf4j.Logger) instance. This is useful for logging information from the script.
+
+`req`::
+{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryRequest] instance.
+
+`rsp`::
+{solr-javadocs}/core/org/apache/solr/response/SolrQueryResponse.html[SolrQueryResponse] instance.
+
+`params`::
+The "params" object, if any specified, from the configuration.
+
+== Examples
+
+The `processAdd()` and the other script methods can return false to skip further
+processing of the document. All methods must be defined, though generally the
+`processAdd()` method is where the action is.
+
+Here's a URL that works with the techproducts example setup demonstrating specifying
+the "script" update chain: `http://localhost:8983/solr/techproducts/update?commit=true&stream.contentType=text/csv&fieldnames=id,description&stream.body=1,foo&update.chain=script`
+which logs the following:
+
+[source,text]
+----
+INFO: update-script#processAdd: id=1
+----
+
+You can see the message recorded in the Solr logging UI.
+
+=== Javascript
+
+Note: There is a JavaScript example `update-script.js` as part of the `techproducts` configset.
+Check `solrconfig.xml` and uncomment the update request processor definition to enable this feature.
+
+[source,javascript]
+----
+function processAdd(cmd) {
+
+  doc = cmd.solrDoc;  // org.apache.solr.common.SolrInputDocument
+  id = doc.getFieldValue("id");
+  logger.info("update-script#processAdd: id=" + id);
+
+// Set a field value:
+//  doc.setField("foo_s", "whatever");
+
+// Get a configuration parameter:
+//  config_param = params.get('config_param');  // "params" only exists if processor configured with <lst name="params">
+
+// Get a request parameter:
+// some_param = req.getParams().get("some_param")
+
+// Add a field of field names that match a pattern:
+//   - Potentially useful to determine the fields/attributes represented in a result set, via faceting on field_name_ss
+//  field_names = doc.getFieldNames().toArray();
+//  for(i=0; i < field_names.length; i++) {
+//    field_name = field_names[i];
+//    if (/attr_.*/.test(field_name)) { doc.addField("attribute_ss", field_names[i]); }
+//  }
+
+}
+
+function processDelete(cmd) {
+  // no-op
+}
+
+function processMergeIndexes(cmd) {
+  // no-op
+}
+
+function processCommit(cmd) {
+  // no-op
+}
+
+function processRollback(cmd) {
+  // no-op
+}
+
+function finish() {
+  // no-op
+}
+----
+
+=== JRuby
+
+To use JRuby as the scripting engine, add `jruby.jar` to Solr's resource loader.

Review comment:
       Reworked it, and provide a link at the top to pint to the resource loader ref guide page.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#issuecomment-764831689


   :+1: ship it!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559735328



##########
File path: solr/solr-ref-guide/src/script-update-processor.adoc
##########
@@ -35,19 +35,19 @@ potential security vulnerabilities.
 
 The scripting update processor lives in the contrib module `/contrib/scripting`, and you need to explicitly add it to your Solr setup.
 
-Java 11 and previous versions come with a JavaScript engine called Nashorn, but Java 12 will require you to add your own JavaScript engine.   Other supported scripting engines like
-JRuby, Jython, Groovy, all require you to add JAR files.
-
-
-You can either add the `dist/solr-scripting-*.jar` file into Solr’s resource loader in a core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`:
+You can either add the `dist/solr-scripting-*.jar` file into Solr’s core `lib/` directory, or via `<lib>` directives in `solrconfig.xml`:

Review comment:
       It's more probable someone would use SOLR_HOME/lib than a lib directory on a core.  And FYI `<lib>` directives may be going away or be discouraged.  Let's link to `<<libs.adoc#lib-directories,Lib Directories>>` instead of enumerating how to do this, so we can just maintain this sort of info in one place in the ref guide.

##########
File path: solr/solr-ref-guide/src/script-update-processor.adoc
##########
@@ -267,8 +267,8 @@ def finish() {
 }
 ----
 
-=== Jython
-
+=== Python
+Python support is implemented via the https://www.jython.org/[Jython] project.
 Put the *standalone* `jython.jar` (the JAR that contains all the dependencies) into Solr's resource loader.

Review comment:
       Please remove "resource loader" from this page.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh merged pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh merged pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#issuecomment-764831689


   :+1: ship it!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh merged pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh merged pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559731569



##########
File path: solr/CHANGES.txt
##########
@@ -186,6 +186,9 @@ Other Changes
 
 * SOLR-14034: Remove deprecated min_rf references (Tim Dillon)
 
+* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead
+ of shipping as part of Solr due to security concerns.  Renamed to ScriptingUpdateProcessor. (Eric Pugh)

Review comment:
       I agree 100% on seeing \*Factory all over the config being poor for the reason you gave.  It could also be argued that even the "UpdateProcessor" part is quite redundant based on where we declare it.  Have you noticed changes in Lucene to how schema analysis components are resolved, affecting the Solr schema (master only)?  See solr/server/solr/configsets/_default/conf/managed-schema -- `<filter name="lowercase"/>` it's beautiful.    No "FilterFactory" suffix.  Eventually I hope we can take the same approach throughout Solr.  Lucene uses an SPI approach which means a special file listing each implementation.  Something like that could be embraced.  No need to separate the factory from inner class over this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559695973



##########
File path: solr/server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
##########
@@ -674,12 +679,12 @@
          *** WARNING ***
          Before enabling remote streaming, you should make sure your
          system has authentication enabled.
-
-    <requestParsers enableRemoteStreaming="false"
+       -->
+    <requestParsers enableRemoteStreaming="true"

Review comment:
       This was required for one of the demos..   Line 113 of script-update-processor shows a curl command:
   
   ```
   http://localhost:8983/solr/techproducts/update?commit=true&stream.contentType=text/csv&fieldnames=id,description&stream.body=1,foo&update.chain=script
   ```
   
   If this is too dangerous, I could revert this change and document the need to make the change in the directions.  It's just one more barrier to easily trying the feature with the tech products example.   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] ctargett commented on pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
ctargett commented on pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#issuecomment-763721063


   I gave the docs a rather quick once-over and the changes look good to me. One more update processor documented, only a few dozen left to go! ;-) 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559674364



##########
File path: solr/CHANGES.txt
##########
@@ -186,6 +186,9 @@ Other Changes
 
 * SOLR-14034: Remove deprecated min_rf references (Tim Dillon)
 
+* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead
+ of shipping as part of Solr due to security concerns.  Renamed to ScriptingUpdateProcessor. (Eric Pugh)

Review comment:
       I don't love that we have the `Factory` suffix, as that feels like an implementation detail of how update processors work.
   
   I wish we could refer to this as the `ScriptUpdateProcessor`, even though this specific class is buried in the file `ScriptUpdateProcessorFactory.java` as:
   
   ```
   private static class ScriptUpdateProcessor extends UpdateRequestProcessor
   ```
   
   Would it be worth pulling the inner class to it's own `ScriptUpdateProcessor.java` file?   
   
   Then we could just link to that file everywhere.  Thoughts????




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559672654



##########
File path: solr/contrib/scripting/README.md
##########
@@ -0,0 +1,14 @@
+Welcome to Apache Solr Scripting!
+===============================
+
+# Introduction
+
+The Scripting contrib module pulls together various scripting related functions.  
+
+Today, the ScriptUpdateProcessorFactory allows Java scripting engines to be used during the Solr document update processing, allowing dramatic flexibility in expressing custom document processing before being indexed.  It also allows hooks to commit, delete, etc, but add is the most common usage.  It is implemented as an UpdateProcessor to be placed in an UpdateChain.

Review comment:
       Thanks, rewroked this.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] epugh commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
epugh commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r560967903



##########
File path: solr/CHANGES.txt
##########
@@ -186,6 +186,9 @@ Other Changes
 
 * SOLR-14034: Remove deprecated min_rf references (Tim Dillon)
 
+* SOLR-14067: StatelessScriptUpdateProcessor moved to it's own /contrib/scripting/ package instead
+ of shipping as part of Solr due to security concerns.  Renamed to ScriptingUpdateProcessor. (Eric Pugh)

Review comment:
       Yep.   I didn't know about that feature.   So, as far as this specific PR goes, can we leave it as ScriptUpdateProcessor, but of course the Javadoc links go to a ScriptUpdateProcessorFactory.html file?   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene-solr] dsmiley commented on a change in pull request #2215: SOLR-14067: v3 Create /contrib/scripting module with ScriptingUpdateProcessor

Posted by GitBox <gi...@apache.org>.
dsmiley commented on a change in pull request #2215:
URL: https://github.com/apache/lucene-solr/pull/2215#discussion_r559733095



##########
File path: solr/server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
##########
@@ -674,12 +679,12 @@
          *** WARNING ***
          Before enabling remote streaming, you should make sure your
          system has authentication enabled.
-
-    <requestParsers enableRemoteStreaming="false"
+       -->
+    <requestParsers enableRemoteStreaming="true"

Review comment:
       Ok.... "sample_techproducts_configs" being something that would be just for demo/experimentation locally and never a shipping config.  I don't see why anyone would foolishly copy this particular configSet (in entirety) as the basis for their own either.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org