You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2022/09/23 20:51:05 UTC

[GitHub] [solr] dsmiley commented on a diff in pull request #1040: SOLR-16428: Add "permissive" mode to IgnoreLargeDocumentsProcessorFactory

dsmiley commented on code in PR #1040:
URL: https://github.com/apache/solr/pull/1040#discussion_r979055239


##########
solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:
##########
@@ -59,15 +71,25 @@ public void init(NamedList<?> args) {
   public UpdateRequestProcessor getInstance(
       SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
     return new UpdateRequestProcessor(next) {
+
       @Override
       public void processAdd(AddUpdateCommand cmd) throws IOException {
         long docSize = ObjectSizeEstimator.estimate(cmd.getSolrInputDocument());
         if (docSize / 1024 > maxDocumentSize) {
+          handleViolatingDoc(cmd, docSize);
+        } else {
+          super.processAdd(cmd);
+        }
+      }
+
+      private void handleViolatingDoc(AddUpdateCommand cmd, long estimatedSizeBytes) {
+        if (onlyLogErrors) {
+          log.warn("Skipping doc {} bc size {} exceeds limit {}", cmd.getPrintableId(), estimatedSizeBytes / 1024, maxDocumentSize);

Review Comment:
   Are both numbers logged in the same units?  As a reader of this log message; I'm not sure what the unit is.



##########
solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:
##########
@@ -59,15 +71,25 @@ public void init(NamedList<?> args) {
   public UpdateRequestProcessor getInstance(
       SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
     return new UpdateRequestProcessor(next) {
+
       @Override
       public void processAdd(AddUpdateCommand cmd) throws IOException {
         long docSize = ObjectSizeEstimator.estimate(cmd.getSolrInputDocument());
         if (docSize / 1024 > maxDocumentSize) {
+          handleViolatingDoc(cmd, docSize);
+        } else {
+          super.processAdd(cmd);
+        }
+      }
+
+      private void handleViolatingDoc(AddUpdateCommand cmd, long estimatedSizeBytes) {
+        if (onlyLogErrors) {
+          log.warn("Skipping doc {} bc size {} exceeds limit {}", cmd.getPrintableId(), estimatedSizeBytes / 1024, maxDocumentSize);
+        } else {
           throw new SolrException(
-              BAD_REQUEST,
-              "Size of the document " + cmd.getPrintableId() + " is too large, around:" + docSize);
+                  BAD_REQUEST,
+                  "Size of the document " + cmd.getPrintableId() + " is too large, around:" + estimatedSizeBytes);

Review Comment:
   If necessary, convert to the same units as that we'd log?



##########
solr/core/src/java/org/apache/solr/update/processor/IgnoreLargeDocumentProcessorFactory.java:
##########
@@ -41,14 +46,21 @@
  */
 public class IgnoreLargeDocumentProcessorFactory extends UpdateRequestProcessorFactory {
   public static final String LIMIT_SIZE_PARAM = "limit";
+  public static final String ONLY_LOG_ERRORS_PARAM = "onlyLogErrors";
+  private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
 
   // limit of a SolrInputDocument size (in kb)
   private long maxDocumentSize = 1024 * 1024;
+  private boolean onlyLogErrors = false;
 
   @Override
   public void init(NamedList<?> args) {
-    maxDocumentSize = args.toSolrParams().required().getLong(LIMIT_SIZE_PARAM);
+    final SolrParams params = args.toSolrParams();
+    maxDocumentSize = params.required().getLong(LIMIT_SIZE_PARAM);

Review Comment:
   Can't be null because it doesn't notice the impact of `required()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org