You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by GitBox <gi...@apache.org> on 2021/09/01 21:47:17 UTC

[GitHub] [solr] HoustonPutman opened a new pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

HoustonPutman opened a new pull request #283:
URL: https://github.com/apache/solr/pull/283


   https://issues.apache.org/jira/browse/SOLR-14457
   
   Cleaned up the way that `GZIPInputStream` objects are created, since I think the current manner doesn't work when `GzipDecompressingEntity.getContent()` is called multiple times after having read parts of the data.
   
   I plan on adding a test before committing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r701947681



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       Are you talking about the HttpEntity contract or the GzipDecompressingEntity contract? Because I'm just following what HttpEntities do. If you have an InputStreamEntity, there's really no way you can replay that stuff without buffering the entire thing. While a StringEntity is very easy to repeat. 
   
   I agree the contract is strange, it should probably just always be one-time-readable.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] samuelgmartinez commented on pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
samuelgmartinez commented on pull request #283:
URL: https://github.com/apache/solr/pull/283#issuecomment-913095892


   Sorry for joining late to the PR but I think that removing the interceptor entirely would be a better solution. The current HTTP Client version handles the compression itself when the right headers are present in the response, handling also the errors properly. 
   
   What do you think if I open a new improvement issue to remove that piece of code completely? (less code to maintain 😀)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r702030306



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       For more context, a [StringEntity will return a whole new InputStream every time](https://github.com/apache/httpcomponents-core/blob/master/httpcore5/src/main/java/org/apache/hc/core5/http/io/entity/StringEntity.java#L122) and an [InputStreamEntity will return the same one every time](https://github.com/apache/httpcomponents-core/blob/master/httpcore5/src/main/java/org/apache/hc/core5/http/io/entity/InputStreamEntity.java#L77).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r701380560



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       Tests for failing & successful, repeatable & non-repeatable enumerated.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] madrob commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r701264180



##########
File path: solr/solrj/src/test/org/apache/solr/client/solrj/impl/HttpClientUtilTest.java
##########
@@ -109,4 +111,21 @@ public void testToBooleanObject() throws Exception {
     assertEquals(null, HttpClientUtil.toBooleanObject("foo"));
     assertEquals(null, HttpClientUtil.toBooleanObject(null));
   }
+
+  @Test
+  public void testNonRepeatableMalformedGzipEntityAutoClosed() throws IOException {
+    HttpEntity baseEntity = new InputStreamEntity(new ByteArrayInputStream("this is not compressed".getBytes(StandardCharsets.UTF_8)));
+    HttpClientUtil.GzipDecompressingEntity gzipDecompressingEntity = new HttpClientUtil.GzipDecompressingEntity(baseEntity);
+    expectThrows(IOException.class, "An IOException wrapping a ZIPException should be thrown when loading a malformed GZIP Entity Content", gzipDecompressingEntity::getContent);

Review comment:
       Save the exception and make an assertion about the cause?

##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       do we have a test that covers this? what is the use case for repeatable entities?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] madrob commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r701920369



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       The more I think about this, the stranger this API contract seems. The caller has now way of really knowing whether they are getting an in-progress stream that is somewhere in the middle, or a new repeatable stream set to the beginning.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] madrob commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
madrob commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r702072134



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -468,7 +494,11 @@ public long getContentLength() {
     public DeflateDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * InflaterInputStream does not throw a ZipException in the constructor,
+     * so it does not need the same checks as the GZIPInputStream.
+     */

Review comment:
       Move this comment into the method, it's an implementation note, not javadoc.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman merged pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman merged pull request #283:
URL: https://github.com/apache/solr/pull/283


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on pull request #283:
URL: https://github.com/apache/solr/pull/283#issuecomment-914393331


   @samuelgmartinez , Thanks for the input! I think that would be great to go as a new improvement ticket, and we will probably shoot for 9.0 (not 8.x) if we decide to go that direction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr] HoustonPutman commented on a change in pull request #283: SOLR-14457: Fix closing of malformed GZIPInputStream

Posted by GitBox <gi...@apache.org>.
HoustonPutman commented on a change in pull request #283:
URL: https://github.com/apache/solr/pull/283#discussion_r701356957



##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClientUtil.java
##########
@@ -447,14 +448,39 @@ public void process(final HttpResponse response, final HttpContext context)
     }
   }
   
-  private static class GzipDecompressingEntity extends HttpEntityWrapper {
+  protected static class GzipDecompressingEntity extends HttpEntityWrapper {
+    private boolean gzipInputStreamCreated = false;
+    private InputStream gzipInputStream = null;
+
     public GzipDecompressingEntity(final HttpEntity entity) {
       super(entity);
     }
-    
+
+    /**
+     * Return a InputStream of uncompressed data.
+     * If there is an issue with the compression of the data, a null InputStream will be returned,
+     * and the underlying compressed InputStream will be closed.
+     *
+     * The same input stream will be returned if the underlying entity is not repeatable.
+     * If the underlying entity is repeatable, then a new input stream will be created.

Review comment:
       I'm not exactly sure. I just included the option so that I don't have to confirm that we can expect a repeatable entity every time. I can add a better test for it, instead of just testing the failures




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org