Posted to issues@tajo.apache.org by Jinho Kim <jh...@apache.org> on 2014/03/04 06:27:36 UTC

Review Request 18728: Work unbalance on disk scheduling of DefaultScheduler

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/
-----------------------------------------------------------

Review request for Tajo.


Bugs: TAJO-647
    https://issues.apache.org/jira/browse/TAJO-647


Repository: tajo


Description
-------

The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
We should control the remote tasks.
Leaf scheduling priorities:
1. tasks in a volume of the host
2. unknown-disk or non-splittable tasks on the local host
3. remote tasks in the rack (considering the remaining tasks)
4. random remote tasks (tail concurrency control of scheduled tasks)
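
As a rough illustration of the priority order above, here is a minimal Java sketch. It is not the code from the patch: the class, enum, and field names (LeafSchedulingSketch, Locality, tasksByVolume, and so on) are hypothetical, tasks are plain string ids, and level 4 simply takes the head of a queue instead of a random task. The "remaining tasks" check for rack-remote tasks and the tail concurrency control are not modeled. It only shows the fall-through from one level to the next.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: all names here are hypothetical, not taken from DefaultTaskScheduler. */
class LeafSchedulingSketch {

    /** Locality levels, declared in the priority order listed in the description. */
    enum Locality { LOCAL_VOLUME, LOCAL_UNKNOWN_DISK, RACK_REMOTE, RANDOM_REMOTE }

    /** A task id paired with the locality level it was served from. */
    static final class Assignment {
        final String taskId;
        final Locality locality;

        Assignment(String taskId, Locality locality) {
            this.taskId = taskId;
            this.locality = locality;
        }
    }

    // Pending leaf tasks, grouped by where their data lives relative to the requesting host.
    final Map<Integer, Deque<String>> tasksByVolume = new TreeMap<>();
    final Deque<String> unknownDiskTasks = new ArrayDeque<>();
    final Deque<String> rackRemoteTasks = new ArrayDeque<>();
    final Deque<String> otherRemoteTasks = new ArrayDeque<>();

    /** Returns the next task for the requesting host, or null if nothing is pending. */
    Assignment assign() {
        // 1. Any volume of this host that still has tasks (scanned in volume-id order here;
        //    a real scheduler would choose the volume by load).
        for (Deque<String> volumeTasks : tasksByVolume.values()) {
            if (!volumeTasks.isEmpty()) {
                return new Assignment(volumeTasks.poll(), Locality.LOCAL_VOLUME);
            }
        }
        // 2. Local tasks whose disk is unknown or whose input cannot be split.
        if (!unknownDiskTasks.isEmpty()) {
            return new Assignment(unknownDiskTasks.poll(), Locality.LOCAL_UNKNOWN_DISK);
        }
        // 3. Tasks whose data lives on another host in the same rack.
        if (!rackRemoteTasks.isEmpty()) {
            return new Assignment(rackRemoteTasks.poll(), Locality.RACK_REMOTE);
        }
        // 4. Any other remote task (head of the queue stands in for a random pick).
        if (!otherRemoteTasks.isEmpty()) {
            return new Assignment(otherRemoteTasks.poll(), Locality.RANDOM_REMOTE);
        }
        return null;
    }
}
```

Calling assign() repeatedly drains the local volumes first and only then moves on to unknown-disk, rack-remote, and other remote tasks.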


Diffs
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b 

Diff: https://reviews.apache.org/r/18728/diff/


Testing
-------

I tested on two clusters.
Cluster-1: (disk * 4) * 4 servers
Cluster-2: (disk * 8) * 10 servers
Test dataset: TPC-H 100GB, 1TB


Thanks,

Jinho Kim


Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/#review36904
-----------------------------------------------------------


+1

I reviewed this patch in depth and discussed it with Jinho offline. I then added the explanation we discussed to the source code. In addition, I renamed some variables and methods.

- Hyunsik Choi




Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/#review37040
-----------------------------------------------------------

Ship it!



- Hyunsik Choi


On March 13, 2014, 5:53 p.m., Jinho Kim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18728/
> -----------------------------------------------------------
> 
> (Updated March 13, 2014, 5:53 p.m.)
> 
> 
> Review request for Tajo.
> 
> 
> Bugs: TAJO-647
>     https://issues.apache.org/jira/browse/TAJO-647
> 
> 
> Repository: tajo
> 
> 
> Description
> -------
> 
> The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
> We should control the remote tasks.
> Leaf scheduling priorities:
> 1. tasks in a volume of the host
> 2. unknown-disk or non-splittable tasks on the local host
> 3. remote tasks in the rack (considering the remaining tasks)
> 4. random remote tasks (tail concurrency control of scheduled tasks)
> 
> 
> Diffs
> -----
> 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac0f996d8f3854265f7d1dd1c58e050e14a 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java 57b3db4730ccaf80bcf61d0ef157908b2f5be2c9 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b652e2d75014cd7ffe2b20527fae2d948e 
> 
> Diff: https://reviews.apache.org/r/18728/diff/
> 
> 
> Testing
> -------
> 
> I tested on two clusters.
> Cluster-1: (disk * 4) * 4 servers
> Cluster-2: (disk * 8) * 10 servers
> Test dataset: TPC-H 100GB, 1TB
> 
> 
> Thanks,
> 
> Jinho Kim
> 
>


Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/#review37038
-----------------------------------------------------------


Hyunsik,
Sorry for the confusion. I had overlooked the cost of remove() on a LinkedList, so I've changed it to a HashMap.
getLowestVolumeId() should return in descending order, so diskVolumeLoads has been changed to ascending order.
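
To make the two points above concrete, here is a minimal Java sketch, not the actual DefaultTaskScheduler code. The class and method names (VolumeLoadSketch, pollLocalTask, cancel, lowestLoadedVolumeWithTasks) are hypothetical, and volume ids are assumed to be non-negative. It assumes pending tasks are kept per volume in a hash-based collection, so removing a task by id does not scan a list the way LinkedList.remove(Object) does. It also picks the least-loaded volume that still has pending tasks, so a drained volume is skipped rather than ending the local search.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Sketch only: hypothetical names, not the actual Tajo scheduler classes. */
class VolumeLoadSketch {
    // volumeId -> number of tasks handed out for that disk so far (never decremented here)
    private final Map<Integer, Integer> volumeLoads = new HashMap<>();
    // volumeId -> pending task ids; a hash-based set removes by id in expected O(1)
    private final Map<Integer, Set<String>> pendingByVolume = new HashMap<>();

    void addTask(int volumeId, String taskId) {
        volumeLoads.putIfAbsent(volumeId, 0);
        pendingByVolume.computeIfAbsent(volumeId, k -> new LinkedHashSet<>()).add(taskId);
    }

    /** Removes a pending task by id without scanning; returns false if it was not pending. */
    boolean cancel(int volumeId, String taskId) {
        Set<String> tasks = pendingByVolume.get(volumeId);
        return tasks != null && tasks.remove(taskId);
    }

    /** Returns the least-loaded volume that still has pending tasks, or -1 if there is none. */
    int lowestLoadedVolumeWithTasks() {
        int best = -1;
        int bestLoad = Integer.MAX_VALUE;
        for (Map.Entry<Integer, Set<String>> e : pendingByVolume.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue; // drained volume: move on to the next lowest instead of giving up
            }
            int volumeId = e.getKey();
            int load = volumeLoads.getOrDefault(volumeId, 0);
            // Ties go to the smaller volume id so the result does not depend on map order.
            if (load < bestLoad || (load == bestLoad && volumeId < best)) {
                best = volumeId;
                bestLoad = load;
            }
        }
        return best;
    }

    /** Hands out one local task and bumps that volume's load; null means nothing local is left. */
    String pollLocalTask() {
        int volumeId = lowestLoadedVolumeWithTasks();
        if (volumeId < 0) {
            return null;
        }
        Set<String> tasks = pendingByVolume.get(volumeId);
        String taskId = tasks.iterator().next();
        tasks.remove(taskId);
        volumeLoads.merge(volumeId, 1, Integer::sum);
        return taskId;
    }
}
```

Only when pollLocalTask() returns null would a caller in this sketch consider rack-local or other remote tasks.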

- Jinho Kim




Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/
-----------------------------------------------------------

(Updated March 13, 2014, 8:53 a.m.)


Review request for Tajo.


Bugs: TAJO-647
    https://issues.apache.org/jira/browse/TAJO-647


Repository: tajo


Description
-------

The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
We should control the remote tasks.
Leaf scheduling priorities:
1. tasks in a volume of the host
2. unknown-disk or non-splittable tasks on the local host
3. remote tasks in the rack (considering the remaining tasks)
4. random remote tasks (tail concurrency control of scheduled tasks)


Diffs (updated)
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac0f996d8f3854265f7d1dd1c58e050e14a 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java 57b3db4730ccaf80bcf61d0ef157908b2f5be2c9 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b652e2d75014cd7ffe2b20527fae2d948e 

Diff: https://reviews.apache.org/r/18728/diff/


Testing
-------

I tested on two clusters.
Cluster-1: (disk * 4) * 4 servers
Cluster-2: (disk * 8) * 10 servers
Test dataset: TPC-H 100GB, 1TB


Thanks,

Jinho Kim


Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/
-----------------------------------------------------------

(Updated March 12, 2014, 2:52 p.m.)


Review request for Tajo.


Bugs: TAJO-647
    https://issues.apache.org/jira/browse/TAJO-647


Repository: tajo


Description
-------

The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
We should control the remote tasks.
Leaf scheduling priorities:
1. tasks in a volume of the host
2. unknown-disk or non-splittable tasks on the local host
3. remote tasks in the rack (considering the remaining tasks)
4. random remote tasks (tail concurrency control of scheduled tasks)


Diffs (updated)
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac0f996d8f3854265f7d1dd1c58e050e14a 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java 57b3db4730ccaf80bcf61d0ef157908b2f5be2c9 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b652e2d75014cd7ffe2b20527fae2d948e 

Diff: https://reviews.apache.org/r/18728/diff/


Testing
-------

I tested on two clusters.
Cluster-1: (disk * 4) * 4 servers
Cluster-2: (disk * 8) * 10 servers
Test dataset: TPC-H 100GB, 1TB


Thanks,

Jinho Kim


Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/
-----------------------------------------------------------

(Updated March 7, 2014, 5:36 a.m.)


Review request for Tajo.


Changes
-------

I have updated some loops for remote scan balancing.


Bugs: TAJO-647
    https://issues.apache.org/jira/browse/TAJO-647


Repository: tajo


Description
-------

The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
We should control the remote tasks.
Leaf scheduling priorities:
1. tasks in a volume of the host
2. unknown-disk or non-splittable tasks on the local host
3. remote tasks in the rack (considering the remaining tasks)
4. random remote tasks (tail concurrency control of scheduled tasks)


Diffs (updated)
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac0f996d8f3854265f7d1dd1c58e050e14a 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java 57b3db4730ccaf80bcf61d0ef157908b2f5be2c9 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b652e2d75014cd7ffe2b20527fae2d948e 

Diff: https://reviews.apache.org/r/18728/diff/


Testing
-------

I tested on two clusters.
Cluster-1: (disk * 4) * 4 servers
Cluster-2: (disk * 8) * 10 servers
Test dataset: TPC-H 100GB, 1TB


Thanks,

Jinho Kim


Re: Review Request 18728: TAJO-647: Work unbalance on disk scheduling of DefaultScheduler

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18728/
-----------------------------------------------------------

(Updated March 4, 2014, 2:34 p.m.)


Review request for Tajo.


Summary (updated)
-----------------

TAJO-647: Work unbalance on disk scheduling of DefaultScheduler


Bugs: TAJO-647
    https://issues.apache.org/jira/browse/TAJO-647


Repository: tajo


Description
-------

The main problem is that localAllocation does not find the next lowest volume, so the task is assigned to a remote host in the rack.
We should control the remote tasks.
Leaf scheduling priorities:
1. tasks in a volume of the host
2. unknown-disk or non-splittable tasks on the local host
3. remote tasks in the rack (considering the remaining tasks)
4. random remote tasks (tail concurrency control of scheduled tasks)


Diffs
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 3ee93ac 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java 790d30b 

Diff: https://reviews.apache.org/r/18728/diff/


Testing
-------

I tested on two clusters.
Cluster-1: (disk * 4) * 4 servers
Cluster-2: (disk * 8) * 10 servers
Test dataset: TPC-H 100GB, 1TB


Thanks,

Jinho Kim