Posted to issues@lucy.apache.org by "Nick Wellnhofer (Created) (JIRA)" <ji...@apache.org> on 2011/12/23 01:15:30 UTC

[lucy-issues] [jira] [Created] (LUCY-204) Process ClusterSearcher RPCs in parallel

Process ClusterSearcher RPCs in parallel
----------------------------------------

                 Key: LUCY-204
                 URL: https://issues.apache.org/jira/browse/LUCY-204
             Project: Lucy
          Issue Type: Improvement
          Components: Search
            Reporter: Nick Wellnhofer
            Assignee: Nick Wellnhofer
         Attachments: LUCY-204.patch

The ClusterSearcher should process "multi" RPCs to the shards in parallel. This should speed things up and release connections to the SearchServers earlier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[lucy-issues] [jira] [Updated] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Nick Wellnhofer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Wellnhofer updated LUCY-204:
---------------------------------

    Fix Version/s: 0.3.0 (incubating)
    

[lucy-issues] [jira] [Commented] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Marvin Humphrey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176327#comment-13176327 ] 

Marvin Humphrey commented on LUCY-204:
--------------------------------------

OK, then -- +1 to commit!

(And nice work on that earlier commit triggering incomplete read/write
within the test file.)
                

[lucy-issues] [jira] [Commented] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Nick Wellnhofer (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175429#comment-13175429 ] 

Nick Wellnhofer commented on LUCY-204:
--------------------------------------

bq. The "Blocking" argument to IO::Socket::INET#new is still commented out. If I uncomment "Blocking", the tests still pass. However, we had a situation before where the tests passed yet users encountered errors – see <http://markmail.org/message/66iwazc6fho3tvy2>. Our current tests are not sufficient as they do not trigger partial writes and reads.

The new code should work with both blocking and non-blocking sockets. When receiving data, the mode shouldn't make a difference. When sending, non-blocking sockets should make the requests even more parallel.

To test partial reads, we could add a test mode to SearchServer and arbitrarily delay sending parts of the response.
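A minimal sketch of what such a delayed-response test might look like. It does not use SearchServer: a forked child over a socketpair is a hypothetical stand-in that sends its reply in pieces, and read_exactly is an illustrative helper name, not something from the patch.

```perl
use strict;
use warnings;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical helper: read exactly $len bytes, looping over short reads.
sub read_exactly {
    my ($sock, $len) = @_;
    my $buf = '';
    while (length($buf) < $len) {
        # Append at the current end of the buffer.
        my $got = sysread($sock, $buf, $len - length($buf), length($buf));
        die "sysread failed: $!" unless defined $got;
        die "unexpected EOF after " . length($buf) . " bytes" if $got == 0;
    }
    return $buf;
}

socketpair(my $reader, my $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    # Child: stand-in for a server that delays parts of its response.
    close $reader;
    for my $chunk ('hel', 'lo ', 'world') {
        syswrite($writer, $chunk);
        select(undef, undef, undef, 0.05);    # sleep 50 ms between chunks
    }
    POSIX::_exit(0);
}

close $writer;
my $msg = read_exactly($reader, 11);    # 11 bytes sent across three writes
waitpid($pid, 0);
```

A reader that issued a single sysread and assumed a complete message would typically see only the first chunk here, which is the kind of failure the existing tests do not trigger.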

bq. I believe that send/recv are for UDP packets and that we should be using syswrite/sysread. Tests still pass when they are swapped out.

You generally use send/recv to pass additional flags. Some of these flags can also be used with TCP. syswrite/sysread should behave exactly like send/recv without flags. I prefer send/recv because it makes clear that we're working with sockets, but we can also switch to syswrite/sysread.
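As a small illustration of that point, the sketch below mixes the two families on a stream socket with flags set to 0. The local socketpair and the variable names are assumptions made for the example; it does not touch the ClusterSearcher code.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

socketpair(my $left, my $right, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Write with send (flags = 0), read with sysread.
defined(send($left, "ping", 0)) or die "send failed: $!";
my $via_sysread = '';
defined(sysread($right, $via_sysread, 4)) or die "sysread failed: $!";

# Write with syswrite, read with recv (flags = 0).
defined(syswrite($left, "pong")) or die "syswrite failed: $!";
my $via_recv = '';
defined(recv($right, $via_recv, 4, 0)) or die "recv failed: $!";

close $left;
close $right;
```

The messages are tiny and the socket is local, so each arrives in one read; real code would still need to loop over short reads and writes with either family.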

bq. I'd advocate using confess() instead of die, so that we get a full stack trace to work with.

OK, I'll address that.
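For reference, a short sketch of the difference being discussed. The sub names are made up for the example; only die and Carp's confess are real.

```perl
use strict;
use warnings;
use Carp qw(confess);

# Hypothetical call chain: outer() invokes whichever failing sub it is given.
sub inner_die     { die "request failed\n" }
sub inner_confess { confess "request failed" }
sub outer         { $_[0]->() }

# die reports only the message (the trailing newline suppresses "at ... line").
eval { outer(\&inner_die) };
my $die_msg = $@;

# confess appends a full stack trace, one "called at" line per frame.
eval { outer(\&inner_confess) };
my $confess_msg = $@;

print "die:\n$die_msg\nconfess:\n$confess_msg";
```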

                

[lucy-issues] [jira] [Updated] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Nick Wellnhofer (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Wellnhofer updated LUCY-204:
---------------------------------

    Attachment: LUCY-204.patch

Patch LUCY-204.patch implements a select loop with callbacks to process multiple requests in parallel.
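The patch is not reproduced in this thread. As a rough sketch of the general technique (a select loop that dispatches a callback per connection), assuming forked children over socketpairs as hypothetical stand-ins for shards, something like the following would work. None of the names come from the patch, and for brevity the requests are sent with a plain syswrite rather than through the loop.

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# Hypothetical stand-in for a shard: echoes the request after a delay.
sub spawn_fake_shard {
    my ($id, $delay) = @_;
    socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        close $client;
        my $request = '';
        sysread($server, $request, 1024);       # wait for the request
        select(undef, undef, undef, $delay);    # simulate search time
        syswrite($server, "shard $id: $request");
        POSIX::_exit(0);                        # closing the socket signals EOF
    }
    close $server;
    return ($client, $pid);
}

my (%pending, %response, @pids);
for my $id (0 .. 2) {
    my ($sock, $pid) = spawn_fake_shard($id, 0.05 * (3 - $id));
    push @pids, $pid;
    # Per-connection state: socket, partial buffer, completion callback.
    $pending{ fileno($sock) } = {
        sock     => $sock,
        buf      => '',
        callback => sub { $response{$id} = shift },
    };
    defined(syswrite($sock, "ping")) or die "syswrite failed: $!";
}

# All requests are in flight; collect responses as they become readable.
my $select = IO::Select->new(map { $_->{sock} } values %pending);
while ($select->count) {
    for my $sock ($select->can_read) {
        my $state = $pending{ fileno($sock) };
        my $got = sysread($sock, $state->{buf}, 4096, length($state->{buf}));
        die "sysread failed: $!" unless defined $got;
        next if $got > 0;                       # more data may follow
        $select->remove($sock);                 # EOF: response is complete
        $state->{callback}->($state->{buf});
        close $sock;
    }
}
waitpid($_, 0) for @pids;
```

Because every request is written before any response is read, the total wait is roughly that of the slowest shard rather than the sum of all delays.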
                

[lucy-issues] [jira] [Resolved] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Nick Wellnhofer (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Wellnhofer resolved LUCY-204.
----------------------------------

    Resolution: Fixed

Committed in r1225239
                

[lucy-issues] [jira] [Commented] (LUCY-204) Process ClusterSearcher RPCs in parallel

Posted by "Marvin Humphrey (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCY-204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175237#comment-13175237 ] 

Marvin Humphrey commented on LUCY-204:
--------------------------------------

It's great to see this!

Some comments:

  * Very nice work integrating the callbacks with the select.
  * I like the refactoring you've done around the shards.  It's clear how
    you're handling the incomplete buffers (and it's better than what I'd had
    in mind).
  * The "Blocking" argument to IO::Socket::INET#new is still commented out.
    If I uncomment "Blocking", the tests still pass.  However, we had a
    situation before where the tests passed yet users encountered
    errors -- see <http://markmail.org/message/66iwazc6fho3tvy2>.  Our current
    tests are not sufficient as they do not trigger partial writes and reads.
  * I believe that send/recv are for UDP packets and that we should be
    using syswrite/sysread.  Tests still pass when they are swapped out.
  * I'd advocate using confess() instead of die, so that we get a full stack
    trace to work with.

+1 to commit this even without tests for incomplete reads/writes (though I
think changing to syswrite/sysread is important).  If it turns out to have
problems, we'll just make a bugfix release.  This patch is a great step
forwards!

                