You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/11 09:58:18 UTC

[GitHub] [druid] FrankChen021 opened a new pull request #10980: Fix Smile encoding for HTTP response

FrankChen021 opened a new pull request #10980:
URL: https://github.com/apache/druid/pull/10980


   This PR fixes #10945
   
   ### Description
   
   /druid/v2 endpoint supports jackson smile encoding both for request and response. But there's a bug there when processing a request with 'Accept:application/x-jackson-smile' header.
   
   Current implementation takes 'Accept' parameter, which indicates server that smile encoding is accepted by client to decode the HTTP response, to decode a JSON request. According to HTTP protocol on 'Content-Type' and 'Accept', server should alway decode the body according to 'Content-Type' header, while encode response by 'Accept' header( if not provided, 'Accept' defaults to 'Content-Type')
   
   The reason why current unit tests fails to detect this bug is because test cases in `QueryResourceTest` wrongly inject a Json ObjectMapper instead of Smile ObjectMapper for Smile decoding. All related test cases are fixed in this PR too.
   
   ### Smile encoding test
   
   I have never heard of Smile encoding before, and there's little doc in Druid to describe if we should exploit it. So I also make some investigation how it saves response sizes in Druid to see if this feature could be used in our environment.
   
   ### Test data
   
   Wikipedia data shipped by Druid
   
   ### Tests
   
   Sending following scan query by CURL with following parameters
   - `limit` parameter range in [100,200,300,400,500]
   - Accept:application/x-jackson-smile turned on and off
   - Content-Encoding:gzip turned on and off
   
   observe 'size_download' for each response outputted by CURL 
   
   ```
   {
     "queryType": "scan",
     "dataSource": {
       "type": "table",
       "name": "wikipedia"
     },
     "intervals": {
       "type": "intervals",
       "intervals": [
         "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
       ]
     },
     "virtualColumns": [],
     "batchSize": 20480,
     "limit": 100,
     "legacy": false,
     "descending": false,
     "granularity": {
       "type": "all"
     }
   }
   ```
   
   #### Result
   
   Limit |Response Bytes in JSON(GZip) | Response Bytes in JSON(No-GZip) | Response Bytes in Smile(GZip) | Response Bytes in Smile(No-GZip)
   -- | -- | -- | -- | --
   100 | 6753 | 43084 | 6550 | 17874
   200 | 12485 | 86427 | 11646 | 35781
   300 | 18879 | 129798 | 17316 | 53797
   400 | 24494 | 172290 | 22208 | 70896
   500 | 30886 | 215331 | 27920 | 88595
   
   The tests above do not cover all queries in druid, but the data in the table generally reflects advantages of Smile encoding in one aspect.
   
   From the table we can see that:
   - when gzip is not enabled in response, Smile encoding(column 5) takes about only 41% space over JSON(column 3)
   - when gzip is enabled, Smile encoding performs a little improvement(column 4 vs columns 2)
   
   In contrast to gzip which requires more resources on server side, Smile saves serialization time compared to JSON according to [some tests](http://www.theotherian.com/2013/12/kryo-vs-smile-vs-json-shootout-part-1.html). And there's no too much work to do expect replacing default object mapper to smile object mapper which usually is one line of work in many dependency injection system, I think Smile encoding is worthy to try.
   
   ### Note
   
   SQL endpoint does not support Smile encoding yet(#10931), this PR does not add Smile encoding to SQL endpoint. I think we could have a discussion. Based on the tests shown above, IMO, it's reasonable to provide such support because SQL query is more popular than native query, at least in our environment.
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `ResourceIOWriter` is extracted from `ResourceIOReaderWriter` to hold response content-type and right ObjectMapper while original ResourceIOReaderWriter holds right ObjectMapper to deserialize input request.
   
   <hr>
   
   
   This PR has:
   - [X] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [X] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [X] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [ ] added integration tests.
   - [X] been tested in a test Druid cluster.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson merged pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
jihoonson merged pull request #10980:
URL: https://github.com/apache/druid/pull/10980


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10980:
URL: https://github.com/apache/druid/pull/10980#issuecomment-821818165


   @jihoonson In the new commit, I didn't add new IT cases but made some changes to current `ITWikipediaQueryTest` to cover all supported value combinations of `Content-Type` and `Accept` headers.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10980:
URL: https://github.com/apache/druid/pull/10980#discussion_r616595628



##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/AbstractQueryResourceTestClient.java
##########
@@ -21,67 +21,157 @@
 
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
 import com.google.inject.Inject;
+import org.apache.druid.guice.annotations.Smile;
+import org.apache.druid.java.util.common.IAE;
 import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.http.client.HttpClient;
 import org.apache.druid.java.util.http.client.Request;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHandler;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHolder;
 import org.apache.druid.java.util.http.client.response.StatusResponseHandler;
 import org.apache.druid.java.util.http.client.response.StatusResponseHolder;
-import org.apache.druid.testing.IntegrationTestingConfig;
 import org.apache.druid.testing.guice.TestClient;
+import org.jboss.netty.handler.codec.http.HttpHeaders;
 import org.jboss.netty.handler.codec.http.HttpMethod;
 import org.jboss.netty.handler.codec.http.HttpResponseStatus;
 
+import javax.ws.rs.core.MediaType;
+import java.io.IOException;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Future;
 
 public abstract class AbstractQueryResourceTestClient<QueryType>
 {
-  private final ObjectMapper jsonMapper;
-  private final HttpClient httpClient;
+  private String contentTypeHeader = MediaType.APPLICATION_JSON;
+
+  /**
+   * a 'null' means the Content-Type in response defaults to Content-Type of request
+   */
+  private String acceptHeader = null;
+
+  final ObjectMapper jsonMapper;
+  final ObjectMapper smileMapper;
+  final HttpClient httpClient;
   final String routerUrl;
+  final Map<String, EncoderDecoder<QueryType>> encoderDecoderMap;
+
+  protected void setContentTypeHeader(String contentTypeHeader)

Review comment:
       Fixed

##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/QueryResourceTestClient.java
##########
@@ -50,4 +62,23 @@ public String getBrokerURL()
     );
   }
 
+  /**
+   * clone a new instance of current object with given encoding.
+   * Note: For {@link AbstractQueryResourceTestClient#queryAsync(String, Object)} operation, contentType could only be application/json
+   *
+   * @param contentType Content-Type header of request. Cannot be NULL. Both application/json and application/x-jackson-smile are allowed
+   * @param accept      Accept header of request. Both application/json and application/x-jackson-smile are allowed
+   */
+  public QueryResourceTestClient withEncoding(String contentType, String accept)

Review comment:
       Fixed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on pull request #10980:
URL: https://github.com/apache/druid/pull/10980#issuecomment-816356252


   > The fix LGTM, but I think we need some integration tests for this sort of bug. @FrankChen021 could you please add some?
   
   Sure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #10980:
URL: https://github.com/apache/druid/pull/10980#discussion_r616360076



##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/QueryResourceTestClient.java
##########
@@ -50,4 +62,23 @@ public String getBrokerURL()
     );
   }
 
+  /**
+   * clone a new instance of current object with given encoding.
+   * Note: For {@link AbstractQueryResourceTestClient#queryAsync(String, Object)} operation, contentType could only be application/json
+   *
+   * @param contentType Content-Type header of request. Cannot be NULL. Both application/json and application/x-jackson-smile are allowed
+   * @param accept      Accept header of request. Both application/json and application/x-jackson-smile are allowed
+   */
+  public QueryResourceTestClient withEncoding(String contentType, String accept)

Review comment:
       nit: please add `@Nullable` for `accept`.

##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/AbstractQueryResourceTestClient.java
##########
@@ -21,67 +21,157 @@
 
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
 import com.google.inject.Inject;
+import org.apache.druid.guice.annotations.Smile;
+import org.apache.druid.java.util.common.IAE;
 import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.http.client.HttpClient;
 import org.apache.druid.java.util.http.client.Request;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHandler;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHolder;
 import org.apache.druid.java.util.http.client.response.StatusResponseHandler;
 import org.apache.druid.java.util.http.client.response.StatusResponseHolder;
-import org.apache.druid.testing.IntegrationTestingConfig;
 import org.apache.druid.testing.guice.TestClient;
+import org.jboss.netty.handler.codec.http.HttpHeaders;
 import org.jboss.netty.handler.codec.http.HttpMethod;
 import org.jboss.netty.handler.codec.http.HttpResponseStatus;
 
+import javax.ws.rs.core.MediaType;
+import java.io.IOException;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Future;
 
 public abstract class AbstractQueryResourceTestClient<QueryType>
 {
-  private final ObjectMapper jsonMapper;
-  private final HttpClient httpClient;
+  private String contentTypeHeader = MediaType.APPLICATION_JSON;
+
+  /**
+   * a 'null' means the Content-Type in response defaults to Content-Type of request
+   */
+  private String acceptHeader = null;
+
+  final ObjectMapper jsonMapper;
+  final ObjectMapper smileMapper;
+  final HttpClient httpClient;
   final String routerUrl;
+  final Map<String, EncoderDecoder<QueryType>> encoderDecoderMap;
+
+  protected void setContentTypeHeader(String contentTypeHeader)

Review comment:
       I find mutable variables confusing because it makes the class to be stateful. This method is called only in `QueryResourceTestClient.withEncoding()`. Can we set `contentTypeHeader` and `acceptHeader` in the constructor and make them final? If you did this because of `SqlResourceTestClient` not supporting smile format, I think it's OK to make `SqlResourceTestClient` ugly (like its constructor accepts `contentTypeHeader` and `acceptHeader` but their values must be json) because we will eventually want to support smile for sqls.

##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/AbstractQueryResourceTestClient.java
##########
@@ -21,67 +21,157 @@
 
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
 import com.google.inject.Inject;
+import org.apache.druid.guice.annotations.Smile;
+import org.apache.druid.java.util.common.IAE;
 import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.http.client.HttpClient;
 import org.apache.druid.java.util.http.client.Request;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHandler;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHolder;
 import org.apache.druid.java.util.http.client.response.StatusResponseHandler;
 import org.apache.druid.java.util.http.client.response.StatusResponseHolder;
-import org.apache.druid.testing.IntegrationTestingConfig;
 import org.apache.druid.testing.guice.TestClient;
+import org.jboss.netty.handler.codec.http.HttpHeaders;
 import org.jboss.netty.handler.codec.http.HttpMethod;
 import org.jboss.netty.handler.codec.http.HttpResponseStatus;
 
+import javax.ws.rs.core.MediaType;
+import java.io.IOException;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Future;
 
 public abstract class AbstractQueryResourceTestClient<QueryType>
 {
-  private final ObjectMapper jsonMapper;
-  private final HttpClient httpClient;
+  private String contentTypeHeader = MediaType.APPLICATION_JSON;
+
+  /**
+   * a 'null' means the Content-Type in response defaults to Content-Type of request
+   */
+  private String acceptHeader = null;
+
+  final ObjectMapper jsonMapper;
+  final ObjectMapper smileMapper;
+  final HttpClient httpClient;
   final String routerUrl;
+  final Map<String, EncoderDecoder<QueryType>> encoderDecoderMap;

Review comment:
       nit: out of curiosity, are this map and the fancy `EncoderDecoder` interface for more extendable structure where we want to support new formats in the future?

##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/AbstractQueryResourceTestClient.java
##########
@@ -21,67 +21,157 @@
 
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
 import com.google.inject.Inject;
+import org.apache.druid.guice.annotations.Smile;
+import org.apache.druid.java.util.common.IAE;
 import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.http.client.HttpClient;
 import org.apache.druid.java.util.http.client.Request;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHandler;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHolder;
 import org.apache.druid.java.util.http.client.response.StatusResponseHandler;
 import org.apache.druid.java.util.http.client.response.StatusResponseHolder;
-import org.apache.druid.testing.IntegrationTestingConfig;
 import org.apache.druid.testing.guice.TestClient;
+import org.jboss.netty.handler.codec.http.HttpHeaders;
 import org.jboss.netty.handler.codec.http.HttpMethod;
 import org.jboss.netty.handler.codec.http.HttpResponseStatus;
 
+import javax.ws.rs.core.MediaType;
+import java.io.IOException;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Future;
 
 public abstract class AbstractQueryResourceTestClient<QueryType>
 {
-  private final ObjectMapper jsonMapper;
-  private final HttpClient httpClient;
+  private String contentTypeHeader = MediaType.APPLICATION_JSON;
+
+  /**
+   * a 'null' means the Content-Type in response defaults to Content-Type of request
+   */
+  private String acceptHeader = null;
+
+  final ObjectMapper jsonMapper;
+  final ObjectMapper smileMapper;
+  final HttpClient httpClient;
   final String routerUrl;
+  final Map<String, EncoderDecoder<QueryType>> encoderDecoderMap;
+
+  protected void setContentTypeHeader(String contentTypeHeader)
+  {
+    if (!this.encoderDecoderMap.containsKey(contentTypeHeader)) {
+      throw new IAE("Invalid Content-Type[%s]", contentTypeHeader);
+    }
+    this.contentTypeHeader = contentTypeHeader;
+  }
+
+  protected void setAcceptHeader(String acceptHeader)
+  {
+    if (acceptHeader != null) {
+      if (!this.encoderDecoderMap.containsKey(acceptHeader)) {
+        throw new IAE("Invalid Accept[%s]", acceptHeader);
+      }
+    }
+    this.acceptHeader = acceptHeader;
+  }
+
+  /**
+   * A encoder/decoder encodes/decods request/response based on value of Content-Type

Review comment:
       ```suggestion
      * A encoder/decoder that encodes/decodes requests/responses based on Content-Type.
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] FrankChen021 commented on a change in pull request #10980: Fix Smile encoding for HTTP response

Posted by GitBox <gi...@apache.org>.
FrankChen021 commented on a change in pull request #10980:
URL: https://github.com/apache/druid/pull/10980#discussion_r616595374



##########
File path: integration-tests/src/main/java/org/apache/druid/testing/clients/AbstractQueryResourceTestClient.java
##########
@@ -21,67 +21,157 @@
 
 import com.fasterxml.jackson.core.type.TypeReference;
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.jaxrs.smile.SmileMediaTypes;
 import com.google.inject.Inject;
+import org.apache.druid.guice.annotations.Smile;
+import org.apache.druid.java.util.common.IAE;
 import org.apache.druid.java.util.common.ISE;
 import org.apache.druid.java.util.http.client.HttpClient;
 import org.apache.druid.java.util.http.client.Request;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHandler;
+import org.apache.druid.java.util.http.client.response.BytesFullResponseHolder;
 import org.apache.druid.java.util.http.client.response.StatusResponseHandler;
 import org.apache.druid.java.util.http.client.response.StatusResponseHolder;
-import org.apache.druid.testing.IntegrationTestingConfig;
 import org.apache.druid.testing.guice.TestClient;
+import org.jboss.netty.handler.codec.http.HttpHeaders;
 import org.jboss.netty.handler.codec.http.HttpMethod;
 import org.jboss.netty.handler.codec.http.HttpResponseStatus;
 
+import javax.ws.rs.core.MediaType;
+import java.io.IOException;
 import java.net.URL;
+import java.nio.charset.StandardCharsets;
+import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.concurrent.Future;
 
 public abstract class AbstractQueryResourceTestClient<QueryType>
 {
-  private final ObjectMapper jsonMapper;
-  private final HttpClient httpClient;
+  private String contentTypeHeader = MediaType.APPLICATION_JSON;
+
+  /**
+   * a 'null' means the Content-Type in response defaults to Content-Type of request
+   */
+  private String acceptHeader = null;
+
+  final ObjectMapper jsonMapper;
+  final ObjectMapper smileMapper;
+  final HttpClient httpClient;
   final String routerUrl;
+  final Map<String, EncoderDecoder<QueryType>> encoderDecoderMap;

Review comment:
       It's possible since the generic type 'QueryType' is defined on outer class and used by `query` function which is the caller of this `EncoderDecoder`. 
   
   This generic contract has been removed in the new commit, because I found that its implementation has nothing to do with type info. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org