Posted to dev@calcite.apache.org by "Claus Stadler (JIRA)" <ji...@apache.org> on 2017/07/12 10:16:00 UTC

[jira] [Created] (CALCITE-1887) Detect transitive join conditions via expressions

Claus Stadler created CALCITE-1887:
--------------------------------------

             Summary: Detect transitive join conditions via expressions
                 Key: CALCITE-1887
                 URL: https://issues.apache.org/jira/browse/CALCITE-1887
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.13.0
            Reporter: Claus Stadler
            Assignee: Julian Hyde


Given table aliases ta and tb, column names ca and cb, and an arbitrary deterministic expression expr, Calcite should be able to infer join conditions by transitivity:

{noformat}
ta.ca = expr AND tb.cb = expr -> ta.ca = tb.cb
{noformat}
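
To make the rule concrete, here is a minimal standalone sketch in plain Java. It does not use any Calcite API: the class TransitiveJoinSketch and its methods are hypothetical names invented for illustration, expressions are compared as plain strings (a real implementation would presumably compare normalized expression trees and check determinism), and every column is assumed to be written as alias.column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Calcite.
public class TransitiveJoinSketch {

    /** One conjunct of the form "column = expr". */
    public static final class Equality {
        final String column; // qualified column, e.g. "ta.ca"
        final String expr;   // expression text, e.g. "'dbr:Leipzig'"

        public Equality(String column, String expr) {
            this.column = column;
            this.expr = expr;
        }
    }

    /**
     * Groups columns by the expression they are equated to and returns
     * "x = y" for every pair of columns that belong to different aliases.
     */
    public static List<String> inferJoinConditions(List<Equality> conjuncts) {
        Map<String, List<String>> columnsByExpr = new LinkedHashMap<>();
        for (Equality eq : conjuncts) {
            columnsByExpr.computeIfAbsent(eq.expr, k -> new ArrayList<>()).add(eq.column);
        }
        List<String> inferred = new ArrayList<>();
        for (List<String> columns : columnsByExpr.values()) {
            for (int i = 0; i < columns.size(); i++) {
                for (int j = i + 1; j < columns.size(); j++) {
                    String left = columns.get(i);
                    String right = columns.get(j);
                    // Only an equality that spans two aliases is a join condition.
                    if (!alias(left).equals(alias(right))) {
                        inferred.add(left + " = " + right);
                    }
                }
            }
        }
        return inferred;
    }

    /** Assumes the column is written as alias.column. */
    private static String alias(String column) {
        return column.substring(0, column.indexOf('.'));
    }

    public static void main(String[] args) {
        List<Equality> where = new ArrayList<>();
        where.add(new Equality("ta.ca", "expr"));
        where.add(new Equality("tb.cb", "expr"));
        System.out.println(inferJoinConditions(where)); // prints [ta.ca = tb.cb]
    }
}
```

The original conjuncts stay in place as filters; the inferred equality is an additional condition that a planner could attach to the join.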

The use case for us stems from SPARQL to SQL rewriting, where SPARQL queries such as

{code:java}
SELECT * {
  dbr:Leipzig a ?type .
  dbr:Leipzig dbo:mayor ?mayor
}
{code}
result in an SQL query similar to

{noformat}
SELECT * FROM s.rdf a, s.rdf b WHERE a.s = 'dbr:Leipzig' AND b.s = 'dbr:Leipzig'
{noformat}

Because the join condition is not recognized, Apache Flink does not find an executable plan for the query.

Self-contained example:
{code:java}
package my.pkg; // "package" is a reserved word and cannot be used in a package name

import org.apache.calcite.adapter.java.ReflectiveSchema;
import org.apache.calcite.plan.RelOptUtil;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.RelRoot;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParser;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;
import org.junit.Test;


public class TestCalciteJoin {
    public static class Triple {
        public String s;
        public String p;
        public String o;

        public Triple(String s, String p, String o) {
            super();
            this.s = s;
            this.p = p;
            this.o = o;
        }

    }

    public static class TestSchema {
        public final Triple[] rdf = {new Triple("s", "p", "o")};
    }


    @Test
    public void testCalciteJoin() throws Exception {
        SchemaPlus rootSchema = Frameworks.createRootSchema(true);

        rootSchema.add("s", new ReflectiveSchema(new TestSchema()));

        // Build the parser config first so that it is actually passed to the planner.
        SqlParser.Config parserConfig = SqlParser.configBuilder()
            .setCaseSensitive(false)
            .build();

        FrameworkConfig frameworkConfig = Frameworks.newConfigBuilder()
            .defaultSchema(rootSchema)
            .parserConfig(parserConfig)
            .build();

        Planner planner = Frameworks.getPlanner(frameworkConfig);

        // SELECT * FROM s.rdf a, s.rdf b WHERE a.s = 5 AND b.s = 5
        SqlNode sqlNode = planner.parse("SELECT * FROM \"s\".\"rdf\" \"a\", \"s\".\"rdf\" \"b\" WHERE \"a\".\"s\" = 5 AND \"b\".\"s\" = 5");
        planner.validate(sqlNode);
        RelRoot relRoot = planner.rel(sqlNode);
        RelNode relNode = relRoot.project();
        System.out.println(RelOptUtil.toString(relNode));
    }
}
{code}



Actual plan:
{code:java}
LogicalProject(s=[$0], p=[$1], o=[$2], s0=[$3], p0=[$4], o0=[$5])
  LogicalFilter(condition=[AND(=($0, 5), =($3, 5))])
    LogicalJoin(condition=[true], joinType=[inner])
      EnumerableTableScan(table=[[s, rdf]])
      EnumerableTableScan(table=[[s, rdf]])
{code}

Expected plan fragment:
{code:java}
    LogicalJoin(condition=[=($0, $3)], joinType=[inner])
{code}




