Posted to issues@calcite.apache.org by "Atri Sharma (JIRA)" <ji...@apache.org> on 2017/07/05 18:28:00 UTC

[jira] [Commented] (CALCITE-500) Ensure EnumerableJoin hashes the smallest input

    [ https://issues.apache.org/jira/browse/CALCITE-500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075201#comment-16075201 ] 

Atri Sharma commented on CALCITE-500:
-------------------------------------

[~julianhyde] I ran CalciteSuite with tracing on, and it hit JdbcFrontLinqBackTest.testJoin. The trace log looks like:


  private final java.util.List relClasses;
  public final org.apache.calcite.rel.metadata.RelMdCollation provider0;
  public GeneratedMetadataHandler_Collation(java.util.List relClasses,
      org.apache.calcite.rel.metadata.RelMdCollation provider0) {
    this.relClasses = relClasses;
    this.provider0 = provider0;
  }
  public org.apache.calcite.rel.metadata.MetadataDef getDef() {
    return org.apache.calcite.rel.metadata.BuiltInMetadata$Collation.DEF;
  }
  public com.google.common.collect.ImmutableList collations(
      org.apache.calcite.rel.RelNode r,
      org.apache.calcite.rel.metadata.RelMetadataQuery mq) {
    final java.util.List key = org.apache.calcite.runtime.FlatLists.of(org.apache.calcite.rel.metadata.BuiltInMetadata$Collation.DEF, r);
    final Object v = mq.map.get(key);
    if (v != null) {
      if (v == org.apache.calcite.rel.metadata.NullSentinel.ACTIVE) {
        throw org.apache.calcite.rel.metadata.CyclicMetadataException.INSTANCE;
      }
      return (com.google.common.collect.ImmutableList) v;
    }
    mq.map.put(key,org.apache.calcite.rel.metadata.NullSentinel.ACTIVE);
    try {
      final com.google.common.collect.ImmutableList x = collations_(r, mq);
      mq.map.put(key, x);
      return x;
    } catch (java.lang.Exception e) {
      mq.map.remove(key);
      throw e;
    }
  }

  private com.google.common.collect.ImmutableList collations_(
      org.apache.calcite.rel.RelNode r,
      org.apache.calcite.rel.metadata.RelMetadataQuery mq) {
    switch (relClasses.indexOf(r.getClass())) {
    default:
      return provider0.collations((org.apache.calcite.rel.RelNode) r, mq);
    case 2:
      return provider0.collations((org.apache.calcite.plan.volcano.RelSubset) r, mq);
    case 3:
      return collations(((org.apache.calcite.plan.hep.HepRelVertex) r).getCurrentRel(), mq);
    case 10:
    case 25:
    case 34:
      return provider0.collations((org.apache.calcite.rel.core.Filter) r, mq);
    case 14:
    case 26:
    case 38:
    case 67:
      return provider0.collations((org.apache.calcite.rel.core.Project) r, mq);
    case 15:
    case 39:
    case 73:
      return provider0.collations((org.apache.calcite.rel.core.Sort) r, mq);
    case 18:
    case 28:
    case 42:
    case 71:
      return provider0.collations((org.apache.calcite.rel.core.TableScan) r, mq);
    case 20:
    case 44:
      return provider0.collations((org.apache.calcite.rel.core.Values) r, mq);
    case 21:
    case 45:
      return provider0.collations((org.apache.calcite.rel.core.Window) r, mq);
    case 76:
      return provider0.collations((org.apache.calcite.adapter.enumerable.EnumerableMergeJoin) r, mq);
    case -1:
      throw new org.apache.calcite.rel.metadata.JaninoRelMetadataProvider$NoHandler(r.getClass());
    }
  }

I put a breakpoint at EnumerableJoin::implement, and the EnumerableJoin appears to be populated with the correct left and right children. Could you please point me to the physical operator implementation of EnumerableJoin, where the actual hash build and probe happen? I will look there next.

> Ensure EnumerableJoin hashes the smallest input
> -----------------------------------------------
>
>                 Key: CALCITE-500
>                 URL: https://issues.apache.org/jira/browse/CALCITE-500
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.0.0-incubating
>            Reporter: Vladimir Sitnikov
>            Assignee: Atri Sharma
>              Labels: newbie
>
> {{EnumerableJoin}} tries to put the smallest input first; however, at execution time Calcite creates the lookup for the _second_ input of the join.
> It would be nice to ensure the lookup is created on the smallest input.
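
For illustration, the behaviour the issue asks for can be sketched as a tiny in-memory inner hash join that builds its lookup on whichever input is smaller and probes with the other. This is not Calcite's implementation: the class HashJoinSketch, its Pair type, and the join method are hypothetical names invented for this sketch, and it assumes both inputs are fully materialized lists with non-null keys so their sizes are known up front (a real planner would rely on row-count estimates instead).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; not part of Calcite. */
public class HashJoinSketch {

  /** One joined row. Left/right positions do not depend on the build side. */
  public static final class Pair<L, R> {
    public final L left;
    public final R right;

    Pair(L left, R right) {
      this.left = left;
      this.right = right;
    }

    @Override public String toString() {
      return "(" + left + ", " + right + ")";
    }
  }

  /**
   * Inner equi-join. The hash lookup is built on the smaller input and the
   * larger input is streamed past it (the probe side). Keys are assumed to
   * be non-null.
   */
  public static <L, R, K> List<Pair<L, R>> join(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey) {
    List<Pair<L, R>> out = new ArrayList<>();
    if (left.size() <= right.size()) {
      // Build on the left input, probe with the right input.
      Map<K, List<L>> lookup = new HashMap<>();
      for (L l : left) {
        lookup.computeIfAbsent(leftKey.apply(l), k -> new ArrayList<>()).add(l);
      }
      for (R r : right) {
        List<L> matches =
            lookup.getOrDefault(rightKey.apply(r), Collections.<L>emptyList());
        for (L l : matches) {
          out.add(new Pair<>(l, r));
        }
      }
    } else {
      // Build on the right input, probe with the left input.
      Map<K, List<R>> lookup = new HashMap<>();
      for (R r : right) {
        lookup.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
      }
      for (L l : left) {
        List<R> matches =
            lookup.getOrDefault(leftKey.apply(l), Collections.<R>emptyList());
        for (R r : matches) {
          out.add(new Pair<>(l, r));
        }
      }
    }
    return out;
  }
}
```

Note that in this sketch the order of the output rows follows the probe side, so it can change with the choice of build side; the set of joined rows does not.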


