
Posted to dev@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2009/06/02 21:39:08 UTC

[jira] Created: (HBASE-1477) Contention on META stalls mapreduce job execution

Contention on META stalls mapreduce job execution
-------------------------------------------------

Key: HBASE-1477
URL: https://issues.apache.org/jira/browse/HBASE-1477
Project: Hadoop HBase
Issue Type: Bug
Reporter: Andrew Purtell

From Jeremy Pinkham up on hbase-users@:

bq. A typical mapper in the job takes several minutes; how many depends on whether I use the region partitioner and how many I let run concurrently... it's been anywhere from 2 minutes with no partitioner and small concurrency (5 mappers) to 8 minutes with the region partitioner and high concurrency (150 mappers). This seems to directly correlate with how long it takes to do a simple count of .META. while each job is running (2 seconds to 1 minute).

bq. I was able to get past this issue affecting my data load by reorganizing some of my workflow and data structures to force the ordering of keys without the region partitioner. Those changes appear to have sidestepped the problem for me, as I can now load from 100+ mappers without seeing the degradation that I was seeing with 40 when using the partitioner (and getting some sweet numbers in the requests column of the UI). It's still an interesting scaling situation with the region partitioner, but I'm good to go without it.

I have also seen this as the master UI freezing under high load; the UI comes back as soon as load is reduced. In a thread dump, it looks like all IPC handlers on the region server hosting .META. are busy.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HBASE-1477) Contention on META stalls mapreduce job execution

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715644#action_12715644 ] 

Andrew Purtell commented on HBASE-1477:
---------------------------------------

One option is for the master to publish region locations up in ZK via ephemeral nodes. This can be done out of ProcessRegion*.java. Clients then don't have to read .META.; they can iterate a hierarchy in ZK instead. ZK is designed for high-read, modest-write workloads, and this seems like a good case of that.
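
A minimal sketch of the lookup this proposal implies, written in Python as an in-memory stand-in rather than against a real ZooKeeper client. The class name RegionDirectory, its methods, and the per-table layout keyed by region start key are all assumptions made for illustration; they are not taken from HBase or from the issue. The point is only that a client can resolve a row key to a region server from a published, sorted set of start keys without scanning .META.:

```python
import bisect


class RegionDirectory:
    """Hypothetical in-memory model of a hierarchy of region-location nodes.

    In the proposal the master would publish these entries as ephemeral
    nodes in ZK; here a plain dict stands in for that hierarchy.
    """

    def __init__(self):
        # table name -> list of (start_key, server), kept sorted by start_key
        self._tables = {}

    def publish(self, table, start_key, server):
        """Master side: record (or update) the server hosting a region."""
        regions = self._tables.setdefault(table, [])
        keys = [k for k, _ in regions]
        i = bisect.bisect_left(keys, start_key)
        if i < len(regions) and regions[i][0] == start_key:
            regions[i] = (start_key, server)  # region moved to a new server
        else:
            regions.insert(i, (start_key, server))

    def unpublish(self, table, start_key):
        """Models an ephemeral node disappearing when a region goes offline."""
        regions = self._tables.get(table, [])
        self._tables[table] = [r for r in regions if r[0] != start_key]

    def locate(self, table, row_key):
        """Client side: find the region whose start key is <= row_key."""
        regions = self._tables.get(table, [])
        keys = [k for k, _ in regions]
        i = bisect.bisect_right(keys, row_key) - 1
        return regions[i][1] if i >= 0 else None
```

A client would call locate("mytable", row) and cache the answer. Whether a real ZK ensemble sustains that read load from hundreds of mappers is exactly what would need to be measured.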

[jira] Commented: (HBASE-1477) Contention on META stalls mapreduce job execution

Posted by "Nitay Joffe (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715856#action_12715856 ] 

Nitay Joffe commented on HBASE-1477:
------------------------------------

I like this idea, Andrew. We should try that out and see if it can handle the load.

[jira] Commented: (HBASE-1477) Contention on META stalls mapreduce job execution

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715858#action_12715858 ] 

ryan rawson commented on HBASE-1477:
------------------------------------

We need to revisit this after HBASE-1304; scans are getting a whole lot faster...

[jira] Commented: (HBASE-1477) Contention on META stalls mapreduce job execution

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716016#action_12716016 ] 

Andrew Purtell commented on HBASE-1477:
---------------------------------------

Using a nonblocking IPC layer would directly attack the root cause. Until then, scans may be getting faster, but heavy I/O on the cluster can still tie up all available IPC handlers, so in the meantime I'd vote for this (publishing region locations in ZK).
