You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Wenhai Li (Code Review)" <do...@asterixdb.incubator.apache.org> on 2016/10/01 15:57:14 UTC

Change in asterixdb[master]: Applied the multiway fuzzyjoin based on the prefix-based joi...

Hello Jenkins,

I'd like you to reexamine a change.  Please visit

    https://asterix-gerrit.ics.uci.edu/1076

to look at the new patch set (#16).

Change subject: Applied the multiway fuzzyjoin based on the prefix-based join and the selectFuzzyJoin testCases.
......................................................................

Applied the multiway fuzzyjoin based on the prefix-based join and the selectFuzzyJoin testCases.

- Enabled the fuzzyjoin rule.
- Introduced six existing rules in FuzzyJoinRuleCollections after applied the fuzzyjoin rule.
  - Extract the common expressions in the star-like fuzzyjoin substitution of ~=.
  - Translate the generated subplan from the ~= substitution into join.
  - Remove the unused assign/vars after the new select-push-down and inlineSubplanInpurForNestedTupleSource are applied.
- Add three new optimization Cases for multi-fuzzyjoin.
- Add a running Cases for select-fuzzyjoin.
- Change the inverted-based fuzzyjoin onto prefix-based join due to the efficiency considerations.

Change-Id: I8736f104905eeda763d39709e002c2b9629278cc
---
M asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/base/FuzzyUtils.java
M asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/base/RuleCollections.java
M asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/FuzzyJoinRule.java
M asterixdb/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/subplan/InlineSubplanInputForNestedTupleSourceRule.java
A asterixdb/asterix-app/data/dblp-small/csx-small-multi-id.txt
A asterixdb/asterix-app/data/dblp-small/dblp-small-multi-id.txt
A asterixdb/asterix-app/data/pub-small/csx-small-multi-id.txt
A asterixdb/asterix-app/data/pub-small/dblp-small-multi-id.txt
M asterixdb/asterix-app/src/main/java/org/apache/asterix/api/common/APIFramework.java
A asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-hybrid.aql
A asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-selflink.aql
A asterixdb/asterix-app/src/test/resources/optimizerts/queries/fj-dblp-csx-star.aql
A asterixdb/asterix-app/src/test/resources/optimizerts/results/fj-dblp-csx-hybrid.plan
A asterixdb/asterix-app/src/test/resources/optimizerts/results/fj-dblp-csx-selflink.plan
A asterixdb/asterix-app/src/test/resources/optimizerts/results/fj-dblp-csx-star.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join-noeqjoin/ngram-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join-noeqjoin/word-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/issue741.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/leftouterjoin-probe-pidx-with-join-jaccard-check-idx_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/ngram-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard-check-after-btree-access.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/inverted-index-join/word-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/leftouterjoin-probe-pidx-with-join-jaccard-check-idx_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/ngram-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/ngram-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/ngram-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/ngram-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/word-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/word-jaccard-check-after-btree-access.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/word-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/word-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-index/inverted-index-join/word-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-fuzzyeq-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/ngram-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-fuzzyeq-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-check-after-btree-access.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard-inline.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/nested-open-index/inverted-index-join/word-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-fuzzyeq-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard-check_inline_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/ngram-jaccard_inline_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-fuzzyeq-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-fuzzyeq-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-fuzzyeq-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-fuzzyeq-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check-after-btree-access.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard-check_inline_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard_01.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard_02.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard_04.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/open-index-enforced/inverted-index-join/word-jaccard_inline_03.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/split-materialization-above-join.plan
M asterixdb/asterix-app/src/test/resources/optimizerts/results/tpch/q12_shipping.plan
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.1/word-jaccard.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.2/ngram-jaccard-inline.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.2/ngram-jaccard-inline.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.2/ngram-jaccard-inline.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.1.2/ngram-jaccard-inline.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.1/word-jaccard.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.1/word-jaccard.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.1/word-jaccard.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.1/word-jaccard.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.2/ngram-jaccard-inline.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.2/ngram-jaccard-inline.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.2/ngram-jaccard-inline.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.2.2/ngram-jaccard-inline.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.1/word-jaccard.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.1/word-jaccard.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.1/word-jaccard.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.1/word-jaccard.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.2/ngram-jaccard-inline.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.2/ngram-jaccard-inline.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.2/ngram-jaccard-inline.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.3.2/ngram-jaccard-inline.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.1/word-jaccard.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.1/word-jaccard.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.1/word-jaccard.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.1/word-jaccard.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.2/word-jaccard.1.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.2/word-jaccard.2.update.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.2/word-jaccard.3.ddl.aql
A asterixdb/asterix-app/src/test/resources/runtimets/queries/fuzzyjoin/dblp-csx-4.4.2/word-jaccard.4.query.aql
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.1.1/dblp-csx-4.1.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.1.2/dblp-csx-4.1.2.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.2.1/dblp-csx-4.2.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.2.2/dblp-csx-4.2.2.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.3.1/dblp-csx-4.3.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.3.2/dblp-csx-4.3.2.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.4.1/dblp-csx-4.4.1.adm
A asterixdb/asterix-app/src/test/resources/runtimets/results/fuzzyjoin/dblp-csx-4.4.2/dblp-csx-4.4.2.adm
M asterixdb/asterix-app/src/test/resources/runtimets/testsuite.xml
M hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/IsomorphismUtilities.java
M hyracks-fullstack/hyracks/hyracks-storage-am-lsm-invertedindex/src/main/java/org/apache/hyracks/storage/am/lsm/invertedindex/tokenizers/NGramUTF8StringBinaryTokenizer.java
153 files changed, 20,963 insertions(+), 2,144 deletions(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/76/1076/16
-- 
To view, visit https://asterix-gerrit.ics.uci.edu/1076
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8736f104905eeda763d39709e002c2b9629278cc
Gerrit-PatchSet: 16
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <lw...@yahoo.com>
Gerrit-Reviewer: Chen Li <ch...@gmail.com>
Gerrit-Reviewer: Jenkins <je...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Taewoo Kim <wa...@yahoo.com>
Gerrit-Reviewer: Wenhai Li <lw...@yahoo.com>