Posted to issues@tajo.apache.org by jihoonson <gi...@git.apache.org> on 2016/04/20 00:46:31 UTC

[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of sibling ...

GitHub user jihoonson opened a pull request:

    https://github.com/apache/tajo/pull/1003

    TAJO-2126: Allow parallel execution of sibling ExecutionBlocks

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jihoonson/tajo-2 peb

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/1003.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1003
    
----
commit 70f6475d271c321e9f2c829e37eec4130e05372e
Author: Jihoon Son <ji...@apache.org>
Date:   2016-04-19T13:01:51Z

    testing

commit 10986c500b161ecf1db4eba24c0d400976101b0c
Author: Jihoon Son <ji...@apache.org>
Date:   2016-04-19T22:45:48Z

    cleanup

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-213009006
  
    Do you mean more shuffle cost caused by heavy load?



[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-213399107
  
    Thanks for your review!



[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/tajo/pull/1003



[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-212704759
  
    I ran basic tests to verify query results, but I have one concern: the shuffle cost. This patch enables parallel execution of non-leaf EBs, so more shuffle operations can run simultaneously. That can increase the shuffle cost, especially for range shuffle, because of its heavy index search.
    So, we need to improve hash shuffle by reusing HTTP connections, and range shuffle by redesigning the index structure.
    
    However, I think the heavy shuffle costs should be addressed in separate JIRA issues. What do you think?



[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-212599170
  
    The change looks straightforward and reasonable. However, I'm concerned about potential side effects. Did you run enough tests in cluster environments?



[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by hyunsik <gi...@git.apache.org>.
Github user hyunsik commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-213398365
  
    Yep, I agree. I think this patch will improve machine resource utilization rather than increase the shuffle cost.
    
    +1 LGTM




[GitHub] tajo pull request: TAJO-2126: Allow parallel execution of non-leaf...

Posted by jihoonson <gi...@git.apache.org>.
Github user jihoonson commented on the pull request:

    https://github.com/apache/tajo/pull/1003#issuecomment-213165892
  
    That is possible in single-query execution. If multiple queries are executed, the shuffle cost of the current implementation will also be huge.

