Posted to commits@lucene.apache.org by sa...@apache.org on 2018/04/20 20:06:52 UTC

[2/2] lucene-solr:branch_7x: SOLR-4793: Document usage of ZooKeeper's jute.maxbuffer sysprop for increasing the file size limit above 1MB

SOLR-4793: Document usage of ZooKeeper's jute.maxbuffer sysprop for increasing the file size limit above 1MB


Project: http://git-wip-us.apache.org/repos/asf/lucene-solr/repo
Commit: http://git-wip-us.apache.org/repos/asf/lucene-solr/commit/95922211
Tree: http://git-wip-us.apache.org/repos/asf/lucene-solr/tree/95922211
Diff: http://git-wip-us.apache.org/repos/asf/lucene-solr/diff/95922211

Branch: refs/heads/branch_7x
Commit: 9592221193971732b8d2c4b2c2994417bd7a3072
Parents: e1ccb49
Author: Steve Rowe <sa...@apache.org>
Authored: Fri Apr 20 16:06:22 2018 -0400
Committer: Steve Rowe <sa...@apache.org>
Committed: Fri Apr 20 16:06:43 2018 -0400

----------------------------------------------------------------------
 ...ractNamedEntitiesUpdateProcessorFactory.java |  4 ++
 solr/solr-ref-guide/src/learning-to-rank.adoc   |  2 +
 ...tting-up-an-external-zookeeper-ensemble.adoc | 70 ++++++++++++++++++++
 .../src/update-request-processors.adoc          |  2 +-
 4 files changed, 77 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95922211/solr/contrib/analysis-extras/src/java/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.java
----------------------------------------------------------------------
diff --git a/solr/contrib/analysis-extras/src/java/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.java b/solr/contrib/analysis-extras/src/java/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.java
index aa6a97b..2a7514d 100644
--- a/solr/contrib/analysis-extras/src/java/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.java
+++ b/solr/contrib/analysis-extras/src/java/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.java
@@ -77,6 +77,10 @@ import org.slf4j.LoggerFactory;
  * <p>See the <a href="http://opennlp.apache.org/models.html">OpenNLP website</a>
  * for information on downloading pre-trained models.</p>
  *
+ * Note that in order to use model files larger than 1MB on SolrCloud, 
+ * <a href="https://lucene.apache.org/solr/guide/setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit"
+ * >ZooKeeper server and client configuration is required</a>.
+ * 
  * <p>
  * The <code>source</code> field(s) can be configured as either:
  * </p>
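
The note above about model files larger than 1MB can be checked before uploading a configset. The following is a minimal sketch, not part of Solr or ZooKeeper: the helper names are hypothetical, and the default of `0xfffff` bytes is an assumption about ZooKeeper's built-in `jute.maxbuffer` value. A file just under the limit may still be rejected, because the request that carries it adds some overhead.

```python
import os

# Assumed ZooKeeper default for jute.maxbuffer (one byte less than 1 MiB).
ASSUMED_DEFAULT_LIMIT = 0xFFFFF


def oversized(sizes, limit=ASSUMED_DEFAULT_LIMIT):
    """Return the names whose size in bytes is greater than the limit."""
    return sorted(name for name, size in sizes.items() if size > limit)


def scan_config_dir(root):
    """Map each file path under root to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            sizes[path] = os.path.getsize(path)
    return sizes
```

Calling `oversized(scan_config_dir("/path/to/configset"))` with a real configset directory (the path here is a placeholder) lists the files that would need a larger `jute.maxbuffer`.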

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95922211/solr/solr-ref-guide/src/learning-to-rank.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/learning-to-rank.adoc b/solr/solr-ref-guide/src/learning-to-rank.adoc
index 4e79a7a..938fd44 100644
--- a/solr/solr-ref-guide/src/learning-to-rank.adoc
+++ b/solr/solr-ref-guide/src/learning-to-rank.adoc
@@ -560,6 +560,8 @@ NOTE: No `"features"` are configured in `myWrapperModel` because the features of
 
 CAUTION: `<lib dir="/path/to/models" regex=".*\.json" />` doesn't work as expected in this case, because `SolrResourceLoader` considers given resources as JAR if `<lib />` indicates files.
 
+As an alternative to the above-described `DefaultWrapperModel`, it is possible to <<setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit,increase ZooKeeper's file size limit>>.
+
 === Applying Changes
 
 The feature store and the model store are both <<managed-resources.adoc#managed-resources,Managed Resources>>. Changes made to managed resources are not applied to the active Solr components until the Solr collection (or Solr core in single server mode) is reloaded.

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95922211/solr/solr-ref-guide/src/setting-up-an-external-zookeeper-ensemble.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/setting-up-an-external-zookeeper-ensemble.adoc b/solr/solr-ref-guide/src/setting-up-an-external-zookeeper-ensemble.adoc
index f6bc525..ed4cfef 100644
--- a/solr/solr-ref-guide/src/setting-up-an-external-zookeeper-ensemble.adoc
+++ b/solr/solr-ref-guide/src/setting-up-an-external-zookeeper-ensemble.adoc
@@ -349,6 +349,76 @@ set ZK_HOST=zk1:2181,zk2:2181,zk3:2181/solr
 
 Now you will not have to enter the connection string when starting Solr.
 
+== Increasing ZooKeeper's 1MB File Size Limit
+
+ZooKeeper is designed to hold small files, on the order of kilobytes.  By default, ZooKeeper's file size limit is 1MB.  Attempting to write or read files larger than this will cause errors. 
+
+Some Solr features, e.g. text analysis synonyms, LTR, and OpenNLP named entity recognition, require configuration resources that can be larger than the default limit.  ZooKeeper can be configured, via Java system property https://zookeeper.apache.org/doc/r{ivy-zookeeper-version}/zookeeperAdmin.html#Unsafe+Options[`jute.maxbuffer`], to increase this limit.  Note that this configuration, which is required both for ZooKeeper server(s) and for all clients that connect to the server(s), must be the same everywhere it is specified.
+
+=== Configuring jute.maxbuffer on ZooKeeper nodes
+
+`jute.maxbuffer` must be configured on each external ZooKeeper node.  Use any of the following methods; note that only the first one works on Windows:
+
+. In `<ZOOKEEPER_HOME>/conf/zoo.cfg`, e.g. to increase the file size limit to one byte less than 10MiB, add this line:
++
+[source,properties]
+jute.maxbuffer=0x9fffff
+. In `<ZOOKEEPER_HOME>/conf/zookeeper-env.sh`, e.g. to increase the file size limit to 50MB (50,000,000 bytes), add this line:
++
+[source,bash]
+JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=50000000"
+. In `<ZOOKEEPER_HOME>/bin/zkServer.sh`, add a `JVMFLAGS` environment variable assignment near the top of the script, e.g. to increase the file size limit to 5MB (5,000,000 bytes):
++
+[source,bash]
+JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=5000000"
+
+=== Configuring jute.maxbuffer for ZooKeeper clients
+
+The `bin/solr` script invokes Java programs that act as ZooKeeper clients.  (When you use Solr's bundled ZooKeeper server instead of setting up an external ZooKeeper ensemble, the configuration described below will also configure the ZooKeeper server.) 
+  
+Add the setting to the `SOLR_OPTS` environment variable in Solr's include file (`bin/solr.in.sh` or `bin/solr.in.cmd`):
+
+[.dynamic-tabs]
+--
+[example.tab-pane#linux2]
+====
+[.tab-label]*Linux: solr.in.sh*
+
+The section to look for will start:
+
+[source,properties]
+----
+# Anything you add to the SOLR_OPTS variable will be included in the java
+# start command line as-is, in ADDITION to other options. If you specify the
+# -a option on start script, those options will be appended as well. Examples:
+----
+
+Add the following line to increase the file size limit to 2MiB:
+
+[source,bash]
+SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=0x200000"
+====
+
+[example.tab-pane#zkwindows2]
+====
+[.tab-label]*Windows: solr.in.cmd*
+
+The section to look for will start:
+
+[source,bat]
+----
+REM Anything you add to the SOLR_OPTS variable will be included in the java
+REM start command line as-is, in ADDITION to other options. If you specify the
+REM -a option on start script, those options will be appended as well. Examples:
+----
+
+Add the following line to increase the file size limit to 2MiB:
+
+[source,bat]
+set SOLR_OPTS=%SOLR_OPTS% -Djute.maxbuffer=0x200000
+====
+--
+
 == Securing the ZooKeeper Connection
 
 You may also want to secure the communication between ZooKeeper and Solr.
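
The `jute.maxbuffer` examples in the section added above mix hexadecimal and decimal byte counts, and the value must match on every server and client. The sketch below shows one way to compare such settings. It is illustrative only: the helper names are hypothetical, and it assumes values are written either as plain decimal or with a `0x` prefix, as in the examples.

```python
# Hypothetical helpers, not part of Solr or ZooKeeper.
MIB = 1024 * 1024


def parse_maxbuffer(value):
    """Parse a decimal or 0x-prefixed hexadecimal byte count."""
    return int(value.strip(), 0)


def describe(value):
    """Render a setting as bytes and as MiB for easier comparison."""
    size = parse_maxbuffer(value)
    return f"{value} = {size} bytes ({size / MIB:.2f} MiB)"


def all_equal(settings):
    """True when every configured value resolves to the same byte count."""
    return len({parse_maxbuffer(v) for v in settings.values()}) <= 1


if __name__ == "__main__":
    # The four values used in the documentation examples.
    for setting in ("0x9fffff", "50000000", "5000000", "0x200000"):
        print(describe(setting))
```

For example, `all_equal({"zk1": "0x200000", "solr": "2097152"})` treats the two spellings as the same limit, while a server set to `0x200000` and a client set to `2000000` would be flagged as a mismatch.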

http://git-wip-us.apache.org/repos/asf/lucene-solr/blob/95922211/solr/solr-ref-guide/src/update-request-processors.adoc
----------------------------------------------------------------------
diff --git a/solr/solr-ref-guide/src/update-request-processors.adoc b/solr/solr-ref-guide/src/update-request-processors.adoc
index f394b08..6958628 100644
--- a/solr/solr-ref-guide/src/update-request-processors.adoc
+++ b/solr/solr-ref-guide/src/update-request-processors.adoc
@@ -355,7 +355,7 @@ The {solr-javadocs}/solr-uima/index.html[`uima`] contrib provides::
 
 The {solr-javadocs}/solr-analysis-extras/index.html[`analysis-extras`] contrib provides::
 
-{solr-javadocs}/solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html[OpenNLPExtractNamedEntitiesUpdateProcessorFactory]::: Update document(s) to be indexed with named entities extracted using an OpenNLP NER model.
+{solr-javadocs}/solr-analysis-extras/org/apache/solr/update/processor/OpenNLPExtractNamedEntitiesUpdateProcessorFactory.html[OpenNLPExtractNamedEntitiesUpdateProcessorFactory]::: Update document(s) to be indexed with named entities extracted using an OpenNLP NER model.  Note that in order to use model files larger than 1MB on SolrCloud, <<setting-up-an-external-zookeeper-ensemble#increasing-zookeeper-s-1mb-file-size-limit,ZooKeeper server and client configuration is required>>.  
 
 === Update Processor Factories You Should _Not_ Modify or Remove