Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2020/07/06 10:18:00 UTC

[jira] [Commented] (SOLR-14354) HttpShardHandler send requests in async

    [ https://issues.apache.org/jira/browse/SOLR-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151939#comment-17151939 ] 

ASF subversion and git services commented on SOLR-14354:
--------------------------------------------------------

Commit 6a92804f8b7e307045e977a0d92d705665b0c8c1 in lucene-solr's branch refs/heads/jira/SOLR-14354 from Cao Manh Dat
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6a92804 ]

Merge branch 'master' into jira/SOLR-14354


> HttpShardHandler send requests in async
> ---------------------------------------
>
>                 Key: SOLR-14354
>                 URL: https://issues.apache.org/jira/browse/SOLR-14354
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Cao Manh Dat
>            Assignee: Cao Manh Dat
>            Priority: Major
>         Attachments: image-2020-03-23-10-04-08-399.png, image-2020-03-23-10-09-10-221.png, image-2020-03-23-10-12-00-661.png
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> h2. 1. Current approach (problem) of Solr
> Below is a diagram describing how a request is currently handled.
> !image-2020-03-23-10-04-08-399.png!
> The main thread that handles a search request submits n requests (n = the number of shards) to an executor, so each shard request corresponds to one thread. After sending its request, that thread does basically nothing but wait for the response from the other side. It gets swapped out and the CPU moves on to another thread (this is a context switch: the CPU saves the context of the current thread and switches to another one). When some data (not all of it) comes back, the thread is woken up to parse that data, then it waits again for more. So there is a lot of context switching in the CPU, which is a very inefficient use of threads. Basically we want fewer threads, with most of them busy all the time, because threads are not free and neither are context switches. That is the main idea behind constructs like executors.
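> To make the cost concrete, here is a minimal sketch of the thread-per-request pattern described above ({{Shard}}, {{ShardResponse}} and {{client.request}} are placeholder names for illustration, not actual Solr classes):
> {code:java}
> // One task per shard; each thread blocks while its response is in flight.
> List<Future<ShardResponse>> futures = new ArrayList<>();
> for (Shard shard : shards) {
>   futures.add(executor.submit(() -> {
>     // Blocking call: this thread sleeps (and is context-switched out)
>     // until the complete response has arrived.
>     return client.request(shard, req);
>   }));
> }
> for (Future<ShardResponse> future : futures) {
>   ShardResponse rsp = future.get(); // the merging thread blocks here too
> } {code}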
> h2. 2. Async call of Jetty HttpClient
> Jetty HttpClient offers an async API like this:
> {code:java}
> httpClient.newRequest("http://domain.com/path")
>         // Add request hooks
>         .onRequestQueued(request -> { ... })
>         .onRequestBegin(request -> { ... })
>         // Add response hooks
>         .onResponseBegin(response -> { ... })
>         .onResponseHeaders(response -> { ... })
>         .onResponseContent((response, buffer) -> { ... })
>         .send(result -> { ... }); {code}
> Therefore, after calling {{send()}} the thread returns immediately without blocking. Then, when the client receives the headers from the other side, it calls the {{onHeaders()}} listeners. When the client receives some {{byte[]}} of content (not the whole response), it calls the {{onContent(buffer)}} listeners. When everything is finished, it calls the {{onComplete}} listeners. One important thing to notice here is that all listeners must finish quickly: if a listener blocks, no further data of that request is handled until the listener finishes.
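> For example, a content listener can stay quick by only copying the bytes out and deferring the real work to an application thread pool. The {{chunks}} queue, {{appExecutor}} and {{handle()}} below are assumptions for illustration, not Jetty API:
> {code:java}
> Queue<byte[]> chunks = new ConcurrentLinkedQueue<>();
> httpClient.newRequest("http://domain.com/path")
>         .onResponseContent((response, buffer) -> {
>           // Must return fast: just copy the bytes, never parse here.
>           byte[] chunk = new byte[buffer.remaining()];
>           buffer.get(chunk);
>           chunks.add(chunk);
>         })
>         .send(result -> appExecutor.execute(() -> handle(chunks, result))); {code}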
> h2. 3. Solution 1: Sending requests async but spinning up one thread per response
> Jetty HttpClient already provides several listeners, one of which is InputStreamResponseListener. This is how it is used:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener();
> client.newRequest(...).send(listener);
> // Wait for the response headers to arrive
> Response response = listener.get(5, TimeUnit.SECONDS);
> if (response.getStatus() == 200) {
>   // Obtain the input stream on the response content
>   try (InputStream input = listener.getInputStream()) {
>     // Read the response content
>   }
> } {code}
> In this case, there will be 2 threads:
>  * one thread trying to read the response content from the InputStream
>  * one thread (a short-lived task) feeding content to the above InputStream whenever some byte[] is available. Note that if this thread is unable to feed data into the InputStream, it will wait.
> By using this, the model of HttpShardHandler can be rewritten into something like this:
> {code:java}
> handler.sendReq(req, is -> {
>   executor.submit(() -> {
>     try (is) {
>       // Read the content from the InputStream
>     }
>   });
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-09-10-221.png!
> Notice that although “sending req to shard1” looks wide in the diagram, it does not take long: sending a request is a very quick operation. With this model, handling threads are not spun up until the first bytes come back (a sketch of how that can be wired up follows below). Notice also that in this approach we still have active threads waiting for more data from the InputStream.
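> One possible way to defer the handling thread until the response starts arriving is to trigger the submission from the listener’s {{onHeaders}} callback. This is a sketch under assumptions, not necessarily what the patch does:
> {code:java}
> InputStreamResponseListener listener = new InputStreamResponseListener() {
>   @Override
>   public void onHeaders(Response response) {
>     super.onHeaders(response); // let the listener wire up its InputStream
>     // Only now spin up the handling thread: data has started to arrive.
>     executor.submit(() -> {
>       try (InputStream is = getInputStream()) {
>         // Read and parse the response content
>       }
>       return null;
>     });
>   }
> };
> client.newRequest(...).send(listener); {code}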
> h2. 4. Solution 2: Buffering data and handling it inside Jetty’s thread
> Jetty has another listener called BufferingResponseListener. This is how it is used:
> {code:java}
> client.newRequest(...).send(new BufferingResponseListener() {
>   public void onComplete(Result result) {
>     try {
>       byte[] response = getContent();
>       //handling response
>     }
>   }
> }); {code}
> On receiving data, Jetty (one of its threads) calls the listener with the given data (the data here is just a byte[] representing part of the response). The listener then buffers that byte[] into an internal buffer. When all the data has been received, Jetty calls the listener’s onComplete, and inside that method we have the whole response.
> By using this, the model of HttpShardHandler can be rewritten into something like this:
> {code:java}
> handler.send(req, bytes -> {
>   // handle the data here
> }); {code}
> The first diagram then changes into this:
> !image-2020-03-23-10-12-00-661.png!
> Pros:
>  * We don’t need an additional thread for each request → fewer threads
>  * No threads are actively waiting on data from an InputStream → threads are busier
> Cons:
>  * All data must be buffered before parsing can begin → roughly double the memory is used while parsing a response.
> h2. 5. Solution 3: Why not both?
> Solution 1 is good for parsing very large responses, which are sometimes _unbounded_ (as with StreamingExpression).
> Solution 2 is good for parsing small responses (maybe < 10KB), since the overhead is small.
> Should we combine both solutions above? After all, what HttpSolrClient has returned so far for all requests is a NamedList<>, so as long as we return a NamedList<> it does not matter to users whether Solution 1 or Solution 2 was used.
> Therefore the idea here is to decide based on the “CONTENT_LENGTH” header of the response: if the response body is smaller than a certain size we go with Solution 2, otherwise with Solution 1 (a sketch of such a dispatching listener follows below).
> _Note:_ Solr does not always seem to return content-length accurately; this needs more investigation.
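> As an illustration of that dispatch, here is a minimal sketch; {{SmallResponseListener}} and {{LargeResponseListener}} are hypothetical stand-ins for the Solution 2 and Solution 1 paths:
> {code:java}
> public class AdaptiveResponseListener extends Response.Listener.Adapter {
>   private static final long THRESHOLD = 10 * 1024; // e.g. 10KB, tunable
>   private Response.Listener delegate;
> 
>   @Override
>   public void onHeaders(Response response) {
>     // getLongField returns -1 when the header is absent, in which case
>     // we fall back to the streaming path.
>     long length = response.getHeaders().getLongField("Content-Length");
>     delegate = (length >= 0 && length <= THRESHOLD)
>         ? new SmallResponseListener()   // Solution 2: buffer, parse on complete
>         : new LargeResponseListener();  // Solution 1: stream via an InputStream
>     delegate.onHeaders(response);
>   }
> 
>   @Override
>   public void onContent(Response response, ByteBuffer content) {
>     delegate.onContent(response, content);
>   }
> 
>   @Override
>   public void onComplete(Result result) {
>     delegate.onComplete(result);
>   }
> } {code}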
> h2. 6. Further improvement
> The best approach to solve this problem would be, instead of converting an InputStream into a NamedList, to convert byte by byte and make the parsing resumable. Like this:
> {code:java}
> Parser parser = new Parser();
> 
> public void onContent(ByteBuffer buffer) {
>   parser.parse(buffer);
> }
> 
> public void onComplete() {
>   NamedList<Object> result = parser.getResult();
> } {code}
> Therefore there would be no blocking operation inside the parser, making for a very efficient model. But doing this requires a ton of changes in Solr: all ResponseParsers would have to be rewritten, not to mention the flow here. It is not clear that it is worth doing. For illustration, the contract of such a resumable parser is sketched below.
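> This is a hypothetical interface, nothing like it exists in Solr today; it just makes the “parse byte by byte and make it resumable” idea concrete:
> {code:java}
> public interface ResumableParser {
>   /** Consume whatever bytes are currently available; must never block. */
>   void parse(ByteBuffer buffer);
> 
>   /** Called after all content has arrived; returns the assembled response. */
>   NamedList<Object> getResult();
> } {code}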



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org