You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@quickstep.apache.org by navsan <gi...@git.apache.org> on 2016/06/27 16:27:15 UTC

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

GitHub user navsan opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/41

    Remove misleading comment about hash table collector type

    Consulted with @hbdeshmukh who confirmed that these comments are misleading.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-quickstep hashtable_collector

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/41.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #41
    
----
commit 258970235f2b2d62cb4faefeffe1d2a8ccd53bc1
Author: Navneet Potti <na...@gmail.com>
Date:   2016-06-27T16:00:04Z

    Remove misleading comment about hash table collector type

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

Posted by navsan <gi...@git.apache.org>.
Github user navsan commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/41#discussion_r68634878
  
    --- Diff: relational_operators/HashJoinOperator.cpp ---
    @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
                 "If true, use simple vector-based joined tuple collector in "
                 "hash join, with a final sort pass to group joined tuple pairs "
                 "by inner block. If false, use unordered_map based collector that "
    -            "keeps joined pairs grouped by inner block as they are found "
    -            "(this latter option has exhibited performance/scaling problems, "
    -            "particularly in NUMA contexts).");
    +            "keeps joined pairs grouped by inner block as they are found.");
    --- End diff --
    
    Sure, I can change that. 
    Do we actually need the vector_based collector at all? If it\u2019s unclear whether it will ever outperform map_based, maybe we should just get rid of it? 
    
    
    > On Jun 27, 2016, at 13:53, Harshad Deshmukh <no...@github.com> wrote:
    > 
    > In relational_operators/HashJoinOperator.cpp <https://github.com/apache/incubator-quickstep/pull/41#discussion_r68634602>:
    > 
    > > @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
    > >              "If true, use simple vector-based joined tuple collector in "
    > >              "hash join, with a final sort pass to group joined tuple pairs "
    > >              "by inner block. If false, use unordered_map based collector that "
    > > -            "keeps joined pairs grouped by inner block as they are found "
    > > -            "(this latter option has exhibited performance/scaling problems, "
    > > -            "particularly in NUMA contexts).");
    > > +            "keeps joined pairs grouped by inner block as they are found.");
    > Hi @navsan <https://github.com/navsan>
    > How about changing the flag name to map_based_joined_tuple_collector and switch the default to true?
    > 
    > \u2014
    > You are receiving this because you were mentioned.
    > Reply to this email directly, view it on GitHub <https://github.com/apache/incubator-quickstep/pull/41/files/258970235f2b2d62cb4faefeffe1d2a8ccd53bc1#r68634602>, or mute the thread <https://github.com/notifications/unsubscribe/ACZB65jQr1J2bVFXiC7GyiZ3em2eEWQgks5qQBwugaJpZM4I_RVv>.
    > 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

Posted by navsan <gi...@git.apache.org>.
Github user navsan commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/41#discussion_r68635473
  
    --- Diff: relational_operators/HashJoinOperator.cpp ---
    @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
                 "If true, use simple vector-based joined tuple collector in "
                 "hash join, with a final sort pass to group joined tuple pairs "
                 "by inner block. If false, use unordered_map based collector that "
    -            "keeps joined pairs grouped by inner block as they are found "
    -            "(this latter option has exhibited performance/scaling problems, "
    -            "particularly in NUMA contexts).");
    +            "keeps joined pairs grouped by inner block as they are found.");
    --- End diff --
    
    My rudimentary profiling also gave the same results. I\u2019ll wait for others to opine. If there are no objections, I\u2019ll remove it.
    
    
    > On Jun 27, 2016, at 13:57, Harshad Deshmukh <no...@github.com> wrote:
    > 
    > In relational_operators/HashJoinOperator.cpp <https://github.com/apache/incubator-quickstep/pull/41#discussion_r68635258>:
    > 
    > > @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
    > >              "If true, use simple vector-based joined tuple collector in "
    > >              "hash join, with a final sort pass to group joined tuple pairs "
    > >              "by inner block. If false, use unordered_map based collector that "
    > > -            "keeps joined pairs grouped by inner block as they are found "
    > > -            "(this latter option has exhibited performance/scaling problems, "
    > > -            "particularly in NUMA contexts).");
    > > +            "keeps joined pairs grouped by inner block as they are found.");
    > I don't mind removing it. If I recall correctly, an older profiling done by @pateljm <https://github.com/pateljm> revealed that map collector outperformed the vector collector pretty much always.
    > 
    > \u2014
    > You are receiving this because you were mentioned.
    > Reply to this email directly, view it on GitHub <https://github.com/apache/incubator-quickstep/pull/41/files/258970235f2b2d62cb4faefeffe1d2a8ccd53bc1#r68635258>, or mute the thread <https://github.com/notifications/unsubscribe/ACZB62AXjOwQg0eKFvaerORVzB3rYcSrks5qQB0XgaJpZM4I_RVv>.
    > 
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #41: Remove misleading comment about hash table co...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/41
  
    Hi @navsan 
    
    I have one minor comment, otherwise LGTM. Thanks for the cleanup. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

Posted by navsan <gi...@git.apache.org>.
Github user navsan closed the pull request at:

    https://github.com/apache/incubator-quickstep/pull/41


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #41: Remove misleading comment about hash table co...

Posted by navsan <gi...@git.apache.org>.
Github user navsan commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/41
  
    I'm going to close this PR, and remove the vector_based collector type instead.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/41#discussion_r68635258
  
    --- Diff: relational_operators/HashJoinOperator.cpp ---
    @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
                 "If true, use simple vector-based joined tuple collector in "
                 "hash join, with a final sort pass to group joined tuple pairs "
                 "by inner block. If false, use unordered_map based collector that "
    -            "keeps joined pairs grouped by inner block as they are found "
    -            "(this latter option has exhibited performance/scaling problems, "
    -            "particularly in NUMA contexts).");
    +            "keeps joined pairs grouped by inner block as they are found.");
    --- End diff --
    
    I don't mind removing it. If I recall correctly, an older profiling done by @pateljm revealed that map collector outperformed the vector collector pretty much always. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #41: Remove misleading comment about hash t...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/41#discussion_r68634602
  
    --- Diff: relational_operators/HashJoinOperator.cpp ---
    @@ -65,21 +65,11 @@ DEFINE_bool(vector_based_joined_tuple_collector, false,
                 "If true, use simple vector-based joined tuple collector in "
                 "hash join, with a final sort pass to group joined tuple pairs "
                 "by inner block. If false, use unordered_map based collector that "
    -            "keeps joined pairs grouped by inner block as they are found "
    -            "(this latter option has exhibited performance/scaling problems, "
    -            "particularly in NUMA contexts).");
    +            "keeps joined pairs grouped by inner block as they are found.");
    --- End diff --
    
    Hi @navsan 
    
    How about changing the flag name to ``map_based_joined_tuple_collector`` and switch the default to true?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---