Posted to dev@giraph.apache.org by "Avery Ching (JIRA)" <ji...@apache.org> on 2012/10/27 06:21:12 UTC

[jira] [Created] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Avery Ching created GIRAPH-389:
----------------------------------

             Summary: Multithreading should intelligently allocate the thread pools
                 Key: GIRAPH-389
                 URL: https://issues.apache.org/jira/browse/GIRAPH-389
             Project: Giraph
          Issue Type: Bug
            Reporter: Avery Ching


Even if the user suggests a very high number of threads, the input split threads should not exceed the number of splits divided by the number of workers.

The number of compute threads should not be greater than the number of partitions on that worker.
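
The two limits above can be sketched as a pair of small helpers. This is only an illustration, not the attached patch: the class and method names are hypothetical, and the ceiling rounding and the floor of one thread are assumptions the issue text does not specify.

```java
/** Hypothetical helper; names are illustrative and not taken from the Giraph source. */
final class ThreadPoolSizing {
    private ThreadPoolSizing() { }

    /**
     * Caps the requested input split threads at this worker's share of the
     * splits (total splits divided by workers).
     * Assumption: the share is rounded up, and at least one thread is kept.
     */
    static int inputSplitThreads(int requested, int totalSplits, int workers) {
        if (workers < 1 || totalSplits < 0) {
            throw new IllegalArgumentException("workers must be >= 1 and totalSplits >= 0");
        }
        // Ceiling division without risking integer overflow.
        int fairShare = totalSplits / workers + (totalSplits % workers == 0 ? 0 : 1);
        return Math.max(1, Math.min(requested, fairShare));
    }

    /**
     * Caps the requested compute threads at the number of partitions held by
     * this worker. Assumption: at least one thread is kept.
     */
    static int computeThreads(int requested, int localPartitions) {
        if (localPartitions < 0) {
            throw new IllegalArgumentException("localPartitions must be >= 0");
        }
        return Math.max(1, Math.min(requested, localPartitions));
    }

    public static void main(String[] args) {
        // 100 splits over 10 workers: a request for 64 threads is cut to 10.
        System.out.println(inputSplitThreads(64, 100, 10)); // prints 10
        // A worker holding 4 partitions never needs more than 4 compute threads.
        System.out.println(computeThreads(64, 4)); // prints 4
    }
}
```

A request below the cap is left alone, so a user asking for 4 input split threads in the example above would still get 4.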

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Avery Ching updated GIRAPH-389:
-------------------------------

    Attachment: GIRAPH-389.patch
    

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490484#comment-13490484 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

We've run around 360 workers pretty reliably, didn't try to test more...
                
> Multithreading should intelligently allocate the thread pools
> -------------------------------------------------------------
>
>                 Key: GIRAPH-389
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-389
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Avery Ching
>            Assignee: Avery Ching
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-389.patch
>
>
> Even if the user suggests a very high number of threads, the input split threads should not exceed the number of splits divided by the number of workers.
> The number of compute threads should not be greater than the number of partitions on that worker.


[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486161#comment-13486161 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

Thanks Eli for the review.

We already read splits in parallel with multithreading, but this ensures that not all the inputs go to the same worker when multithreading is turned on.

This also makes each worker spin up only enough compute threads to cover the partitions it has, no more.
                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486210#comment-13486210 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

Yes, multiple threads per worker are claiming splits.  But as you mentioned before, the split list is read once and passed to all the input split threads.  Each one will then only read the zknodes to claim the split.

No, you don't need to do anything, Jenkins is happy.  Thanks again.
                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13485451#comment-13485451 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

https://reviews.apache.org/r/7754/

 Description:

Do not create more input split loading threads than input splits to allow workers to equally load up input splits
Do not create more compute threads than partitions on the worker
Removed duplicate reserveInputSplit method in BspServiceWorker
Minor optimization to only get the input split zknodes once and pass to all threads.
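
As a rough illustration of the first and last items (not the code in the patch), the sketch below fetches the split list once, shares one read-only copy with every loader thread, and never starts more threads than there are splits. All names are hypothetical, and the ConcurrentHashMap is only an in-memory stand-in for the per-split claim in ZooKeeper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; not the BspServiceWorker or InputSplitsCallable code. */
final class SharedSplitListSketch {
    private SharedSplitListSketch() { }

    /** Returns how many splits were claimed in total across all threads. */
    static int loadSplits(List<String> splitPaths, int requestedThreads) {
        // The list is obtained once by the caller; every thread sees the same read-only copy.
        final List<String> shared =
            Collections.unmodifiableList(new ArrayList<>(splitPaths));
        // Never create more loader threads than there are splits (at least one).
        int threads = Math.max(1, Math.min(requestedThreads, shared.size()));
        // Assumption: stands in for the claim znodes; putIfAbsent models an atomic claim.
        final ConcurrentHashMap<String, Integer> claims = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int threadId = t;
                results.add(pool.submit(() -> {
                    int claimed = 0;
                    for (String path : shared) {
                        // Only the first thread to claim a split gets to load it.
                        if (claims.putIfAbsent(path, threadId) == null) {
                            claimed++;
                        }
                    }
                    return claimed;
                }));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading splits", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Split loading thread failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("split-0", "split-1", "split-2");
        // 16 threads requested, but only 3 are started; each split is claimed once.
        System.out.println(loadSplits(splits, 16)); // prints 3
    }
}
```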

Testing Done:

passed unittests
ran pagerankbenchmark on a real cluster and observed that the limiting works


                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486424#comment-13486424 ] 

Eli Reisman commented on GIRAPH-389:
------------------------------------

Last time I instrumented that part of the code on cluster runs, I found that with lots of workers, each claim attempt starts to slow down the syncing of the ZK quorum because those claims don't just read, they try to write. This forces a sync(). Our quorum was not very large, and even then the response time of the quorum to reads and writes would drag to a crawl during this phase of input reads. These were all problems that start small and get much worse as you scale out to more workers.

Perhaps the speed increase from multithreading comes from threads simply getting a split quickly, so the whole list is covered and very few second iterations over the split list are ever needed? Has anyone run big jobs with lots of machines using the multithreaded input split phase yet? Does it speed up the load-in? Is there a scale worker-wise where the ZK starts to bog down?

Come to think of it, if each thread group on a worker gets the same split list ordering from the input split organizer, we might want to modify it to help: the split organizer tries to keep workers from iterating on identical orderings of the split list. This eliminated a "mirroring" behavior where groups of workers would iterate the list from the same start index, and they would slow down the ZK response time for everyone by making it sync competing claims for the same splits all the time. The hashed index distributed the workers so that they don't tend to compete as often for the same split.

Anyway: the hashing is done with host:port so all the locality blocks and the rest of the original split list will be in identical order for all threads on the same worker, and the competition is back on again, especially (I bet) when you get into that nice 1200-1600 worker range where some use cases happen.

For a fix: each thread would need its own "variation" of the split list. The input split organizer could be equipped to generate iterators that build variations on the current split list it generates. This could be done by: 1. calling shuffle() on the locality block before inserting it back into the full split list, and 2. using threadId + host:port for the hash key to make sure they iterate on different parts of the list if they don't get a local block.
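
A minimal sketch of that idea follows. It is not the InputSplitPathOrganizer code: the class and method names are hypothetical, and it assumes the ordering is "local splits first, then the remaining splits walked from a hashed start index". The "#" separator and the use of String.hashCode() are also assumptions. Hashing only spreads the start indexes; two threads can still land on the same offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a per-thread split ordering. */
final class PerThreadSplitOrder {
    private PerThreadSplitOrder() { }

    static List<String> orderFor(List<String> localSplits, List<String> remoteSplits,
                                 String hostPort, int threadId) {
        // Step 2 of the proposal: the hash key includes the thread id, not just host:port.
        String key = hostPort + "#" + threadId;

        // Step 1 of the proposal: shuffle the locality block. A seeded Random keeps
        // the ordering reproducible for a given thread.
        List<String> ordered = new ArrayList<>(localSplits);
        Collections.shuffle(ordered, new Random(key.hashCode()));

        // Walk the remaining splits starting from a per-thread offset, wrapping around.
        int n = remoteSplits.size();
        if (n > 0) {
            int start = Math.floorMod(key.hashCode(), n);
            for (int i = 0; i < n; i++) {
                ordered.add(remoteSplits.get((start + i) % n));
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> local = Arrays.asList("local-0", "local-1", "local-2");
        List<String> remote = Arrays.asList("remote-0", "remote-1", "remote-2", "remote-3");
        // Two threads on the same worker usually get different orderings.
        System.out.println(orderFor(local, remote, "host1:30000", 0));
        System.out.println(orderFor(local, remote, "host1:30000", 1));
    }
}
```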


                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486572#comment-13486572 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

Multithreading the loading is quite nice in practice, at least when using HCatalog.  We don't see the same slowdowns, perhaps since we aren't using a shared ZK instance, just the one for this job.
                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490299#comment-13490299 ] 

Eli Reisman commented on GIRAPH-389:
------------------------------------

That makes sense. I was not able to use the "instantiate our own ZK" option last summer; I was always using the cluster quorum. I wonder if the single instance scales well enough to make it the default. A lot of the stuff I was troubleshooting around ZK back then had to do with slowdowns during frequent writes and quorum syncs that you guys never have to worry about. After all, we don't need a ZK that is available 24-7, just per job. May I ask what's the most worker tasks you've run on a single ZK this way?


                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Eli Reisman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486204#comment-13486204 ] 

Eli Reisman commented on GIRAPH-389:
------------------------------------

Are we talking about multiple threads per worker trying to claim splits? Because that will really, really slow ZK down.

I assumed the worker had an executor service, claimed splits, and passed the split reading job off to a thread after each claim (znode write) is successful. The fewer actors out there contending for splits on the ZK list, the happier ZK will be because each claim attempt involves a ZK write attempt.

Regarding this patch: do I need to do anything else, or now that Jenkins is happy, are we all set on this commit?

Thanks!

                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Avery Ching (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486189#comment-13486189 ] 

Avery Ching commented on GIRAPH-389:
------------------------------------

Btw, a re-run of the tests passed on Hudson: https://builds.apache.org/job/Giraph-trunk-Commit/257/console
                

[jira] [Commented] (GIRAPH-389) Multithreading should intelligently allocate the thread pools

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/GIRAPH-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486095#comment-13486095 ] 

Hudson commented on GIRAPH-389:
-------------------------------

Integrated in Giraph-trunk-Commit #256 (See [https://builds.apache.org/job/Giraph-trunk-Commit/256/])
    GIRAPH-389: Multithreading should intelligently allocate the thread pools (aching via ereisman) (Revision 1403386)

     Result = FAILURE
ereisman : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1403386
Files : 
* /giraph/trunk/CHANGELOG
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitPathOrganizer.java
* /giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java

                