Posted to notifications@rya.apache.org by meiercaleb <gi...@git.apache.org> on 2016/05/16 22:47:05 UTC

[GitHub] incubator-rya pull request: Added OPTIONAL support for Precomputed...

GitHub user meiercaleb opened a pull request:

    https://github.com/apache/incubator-rya/pull/42

    Added OPTIONAL support for Precomputed-Joins. 

    Added capability to match multiple PCJs containing OPTIONALs to sub-queries of a given RYA query.  Enhanced how the AccumuloIndexSet PCJ node evaluates BindingSets and performs joins to accommodate PCJs with OPTIONALs and to make PCJ evaluation more robust and efficient in general.  
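
As a rough illustration of what OPTIONAL support implies for the join step, here is a minimal sketch of a left outer hash join over binding sets. It is not Rya's implementation: binding sets are modeled as plain maps from variable name to value, and `OptionalJoinSketch`, `joinKey`, `leftJoin`, and `row` are hypothetical names chosen for this example. It also simplifies SPARQL semantics by joining only on an explicit list of variables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: binding sets are plain maps, not OpenRDF BindingSets.
final class OptionalJoinSketch {

    // Convenience builder for a row: row("x", "a", "y", "1") -> {x=a, y=1}.
    static Map<String, String> row(String... kv) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    // Builds a hash key from the join variables; null if any of them is unbound.
    static String joinKey(Map<String, String> bindings, List<String> joinVars) {
        StringBuilder key = new StringBuilder();
        for (String v : joinVars) {
            String value = bindings.get(v);
            if (value == null) {
                return null;
            }
            key.append(value).append('\u0000');
        }
        return key.toString();
    }

    // Left outer join: every left row survives; matching right rows extend it.
    static List<Map<String, String>> leftJoin(List<Map<String, String>> left,
                                              List<Map<String, String>> right,
                                              List<String> joinVars) {
        // Index the right side by its join key.
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> r : right) {
            String key = joinKey(r, joinVars);
            if (key != null) {
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(r);
            }
        }
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> l : left) {
            String key = joinKey(l, joinVars);
            List<Map<String, String>> matches = (key == null) ? null : index.get(key);
            if (matches == null) {
                // No match: keep the left row, leaving the optional variables unbound.
                out.add(new LinkedHashMap<>(l));
                continue;
            }
            for (Map<String, String> r : matches) {
                Map<String, String> merged = new LinkedHashMap<>(l);
                merged.putAll(r);
                out.add(merged);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, String>> left = Arrays.asList(row("x", "a"), row("x", "b"));
        List<Map<String, String>> right = Arrays.asList(row("x", "a", "y", "1"));
        System.out.println(leftJoin(left, right, Arrays.asList("x")));
    }
}
```

In the `main` example, both left rows are kept and only the row with `x=a` is extended with a value for `y`, which is the behavior an inner join would not give.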

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/meiercaleb/incubator-rya RYA-62-OPTIONAL-Support

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-rya/pull/42.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #42
    
----
commit 96dd55ec505c3e207451702d99a7d296374ff850
Author: Caleb Meier <me...@gmail.com>
Date:   2016-04-11T20:23:04Z

    Added OPTIONAL support for Precomputed-Joins, including support for matching
    PCJs with OPTIONALs and evaluation of query plans containing PCJs with OPTIONALs.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-rya issue #42: Added OPTIONAL support for Precomputed-Joins.

Posted by pujav65 <gi...@git.apache.org>.
GitHub user pujav65 commented on the issue:

    https://github.com/apache/incubator-rya/pull/42
  
    Looks good.  My only concern is context support, which is outside the scope of this PR.  Going forward, please include JIRA numbers in the commit messages.



[GitHub] incubator-rya pull request #42: Added OPTIONAL support for Precomputed-Joins...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/incubator-rya/pull/42



[GitHub] incubator-rya pull request #42: Added OPTIONAL support for Precomputed-Joins...

Posted by pujav65 <gi...@git.apache.org>.
GitHub user pujav65 commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/42#discussion_r65615898
  
    --- Diff: extras/indexing/src/main/java/mvm/rya/indexing/accumulo/ConfigUtils.java ---
    @@ -157,7 +157,7 @@ public static boolean createTableIfNotExists(final Configuration conf, final Str
             return false;
         }
     
    -    private static String getIndexTableName(final Configuration conf, final String indexTableNameConf, final String altSuffix){
    --- End diff --
    
    Try to avoid changes like this that don't change logic or intent.  The finals are annoying, but touching them makes it hard to parse what actually changed.



[GitHub] incubator-rya issue #42: Added OPTIONAL support for Precomputed-Joins.

Posted by pujav65 <gi...@git.apache.org>.
GitHub user pujav65 commented on the issue:

    https://github.com/apache/incubator-rya/pull/42
  
    Merged.  Can you close the pull request?



[GitHub] incubator-rya pull request #42: Added OPTIONAL support for Precomputed-Joins...

Posted by pujav65 <gi...@git.apache.org>.
GitHub user pujav65 commented on a diff in the pull request:

    https://github.com/apache/incubator-rya/pull/42#discussion_r65616727
  
    --- Diff: extras/indexing/src/main/java/mvm/rya/indexing/external/tupleSet/AccumuloIndexSet.java ---
    @@ -225,173 +273,336 @@ private void setLocalityGroups(final String tableName, final Connector conn, fin
     
     	}
     
    +	@Override
    +	public CloseableIteration<BindingSet, QueryEvaluationException> evaluate(
    +			BindingSet bindingset) throws QueryEvaluationException {
    +		return this.evaluate(Collections.singleton(bindingset));
    +	}
     
    +	/**
    +	 * Core evaluation method used during query evaluation - given a collection
    +	 * of binding set constraints, this method finds common binding labels
    +	 * between the constraints and table, uses those to build a prefix scan of
    +	 * the Accumulo table, and creates a solution binding set by iterating of
    +	 * the scan results.
    +	 * @param bindingset - collection of {@link BindingSet}s to be joined with PCJ
    +	 * @return - CloseableIteration over joined results
    +	 */
    +	@Override
    +	public CloseableIteration<BindingSet, QueryEvaluationException> evaluate(
    +			final Collection<BindingSet> bindingset)
    +			throws QueryEvaluationException {
    +
    +		if (bindingset.isEmpty()) {
    +			return new IteratorWrapper<BindingSet, QueryEvaluationException>(
    +					new HashSet<BindingSet>().iterator());
    +		}
     
    -    @Override
    -    public CloseableIteration<BindingSet,QueryEvaluationException> evaluate(final BindingSet bindingset) throws QueryEvaluationException {
    -        return this.evaluate(Collections.singleton(bindingset));
    -    }
    -
    -    /**
    -     * Core evaluation method used during query evaluation - given a collection of binding set constraints, this
    -     * method finds common binding labels between the constraints and table, uses those to build a prefix scan
    -     * of the Accumulo table, and creates a solution binding set by iterating of the scan results.
    -     */
    -    @Override
    -    public CloseableIteration<BindingSet,QueryEvaluationException> evaluate(final Collection<BindingSet> bindingset) throws QueryEvaluationException {
    -        String localityGroup = "";
    -        final Set<String> commonVars = Sets.newHashSet();
    -        // if bindingset is empty, there are no results, so return empty iterator
    -        if (bindingset.isEmpty()) {
    -        	return new IteratorWrapper<BindingSet, QueryEvaluationException>(new HashSet<BindingSet>().iterator());
    -        }
    -      //to build range prefix, find common vars of bindingset and PCJ bindings
    -        else {
    -        	final BindingSet bs = bindingset.iterator().next();
    -            for (final String b : this.getTupleExpr().getAssuredBindingNames()) {
    -                final Binding v = bs.getBinding(b);
    -                if (v != null) {
    -                    commonVars.add(b);
    -                }
    -            }
    -        }
    -        //add any constant constraints to common vars to be used in range prefix
    -        commonVars.addAll(getConstantConstraints());
    -        PcjQuery apq = null;
    -        List<String> fullVarOrder =  null;
    -        String commonVarOrder = null;
    -        try {
    -            if (commonVars.size() > 0) {
    -                commonVarOrder = getVarOrder(commonVars);
    -                if(commonVarOrder == null) {
    -                    throw new IllegalStateException("Index does not support binding set!");
    -                }
    -                fullVarOrder = Lists.newArrayList(prefixToOrder(commonVarOrder).split(VAR_ORDER_DELIM));
    -                //use varOrder and tableVarMap to set correct scan column
    -                localityGroup = orderToLocGroup(fullVarOrder);
    -            } else {
    -                localityGroup = varOrder.get(0);
    -            }
    -            apq = new AccumuloPcjQuery(accCon, tablename);
    -            final ValueMapVisitor vmv = new ValueMapVisitor();
    -            this.getTupleExpr().visit(vmv);
    -
    -            List<String> commonVarOrderList = null;
    -            if(commonVarOrder != null) {
    -            	commonVarOrderList = Lists.newArrayList(commonVarOrder.split(VAR_ORDER_DELIM));
    -            } else {
    -            	commonVarOrderList = new ArrayList<>();
    -            }
    -
    -            return apq.queryPrecompJoin(commonVarOrderList, localityGroup, vmv.getValMap(),
    -            		HashBiMap.create(this.getTableVarMap()).inverse(), bindingset);
    -        } catch(final TableNotFoundException e) {
    -            throw new QueryEvaluationException(e);
    -        }
    -    }
    -
    -    /**
    -     *
    -     * @param order - variable order as indicated by query
    -     * @return - locality group or column family used in scan - this
    -     * is just the variable order expressed in terms of the variables stored
    -     * in the table
    -     */
    -    private String orderToLocGroup(final List<String> order) {
    -        String localityGroup = "";
    -        for (final String s : order) {
    -            if (localityGroup.length() == 0) {
    -                localityGroup = this.getTableVarMap().get(s);
    -            } else {
    -                localityGroup = localityGroup + VAR_ORDER_DELIM + this.getTableVarMap().get(s);
    -            }
    -        }
    -        return localityGroup;
    -    }
    -
    -    /**
    -     *
    -     * @param order - prefix of a full variable order
    -     * @return - full variable order that includes all variables whose values
    -     * are stored in the table - used to obtain the locality group
    -     */
    -    //given partial order of query vars, convert to PCJ vars and determine
    -    //if converted partial order is a substring of a full var order of PCJ variables.
    -    //if converted partial order is a prefix, convert corresponding full PCJ var order to query vars
    -    private String prefixToOrder(String order) {
    -        final Map<String, String> invMap = HashBiMap.create(this.getTableVarMap()).inverse();
    -        String[] temp = order.split(VAR_ORDER_DELIM);
    -        //get order in terms of PCJ variables
    -        for (int i = 0; i < temp.length; i++) {
    -            temp[i] = this.getTableVarMap().get(temp[i]);
    -        }
    -        order = Joiner.on(VAR_ORDER_DELIM).join(temp);
    -        for (final String s : varOrder) {
    -        	//verify that partial order is prefix of a PCJ varOrder
    -            if (s.startsWith(order)) {
    -                temp = s.split(VAR_ORDER_DELIM);
    -                //convert full PCJ varOrder back to query varOrder
    -                for (int i = 0; i < temp.length; i++) {
    -                    temp[i] = invMap.get(temp[i]);
    -                }
    -                return Joiner.on(VAR_ORDER_DELIM).join(temp);
    -            }
    -        }
    -        throw new NoSuchElementException("Order is not a prefix of any locality group value!");
    -    }
    -
    -    /**
    -     *
    -     * @param variables
    -     * @return - string representation of the Set variables, in an order that is in the
    -     * table
    -     */
    -    private String getVarOrder(final Set<String> variables) {
    -        final Map<String, Set<String>> varOrderMap = this.getSupportedVariableOrders();
    -        final Set<Map.Entry<String, Set<String>>> entries = varOrderMap.entrySet();
    -        for (final Map.Entry<String, Set<String>> e : entries) {
    -            if (e.getValue().equals(variables)) {
    -                return e.getKey();
    -            }
    -        }
    -        return null;
    -    }
    -
    -    /**
    -     * @return - all constraints which correspond to variables
    -     * in {@link AccumuloIndexSet#getTupleExpr()} which are set
    -     * equal to a constant, but are non-constant in Accumulo table
    -     */
    -    private Set<String> getConstantConstraints() {
    -        final Map<String, String> tableMap = this.getTableVarMap();
    -        final Set<String> keys = tableMap.keySet();
    -        final Set<String> constants = Sets.newHashSet();
    -        for (final String s : keys) {
    -            if (s.startsWith("-const-")) {
    -                constants.add(s);
    -            }
    -        }
    -        return constants;
    -    }
    -
    -    /**
    -     *
    -     * Extracts the values associated with constant labels in the query
    -     * Used to create binding sets from range scan
    -     */
    -    public class ValueMapVisitor extends QueryModelVisitorBase<RuntimeException> {
    -        Map<String, org.openrdf.model.Value> valMap = Maps.newHashMap();
    -        public Map<String, org.openrdf.model.Value> getValMap() {
    -            return valMap;
    -        }
    -        @Override
    -        public void meet(final Var node) {
    -            if (node.getName().startsWith("-const-")) {
    -                valMap.put(node.getName(), node.getValue());
    -            }
    -        }
    -    }
    +		List<BindingSet> crossProductBs = new ArrayList<>();
    +		Map<String, org.openrdf.model.Value> constantConstraints = new HashMap<>();
    +		Set<Range> hashJoinRanges = new HashSet<>();
    +		final Range EMPTY_RANGE = new Range("", true, "~", false);
    +		Range crossProductRange = EMPTY_RANGE;
    +		String localityGroupOrder = varOrder.get(0);
    +		int maxPrefixLen = Integer.MIN_VALUE;
    +		int prefixLen = 0;
    +		int oldPrefixLen = 0;
    +		Multimap<String, BindingSet> bindingSetHashMap = HashMultimap.create();
    +		HashJoinType joinType = HashJoinType.CONSTANT_JOIN_VAR;
    +		Set<String> unAssuredVariables = Sets.difference(getTupleExpr().getBindingNames(), getTupleExpr().getAssuredBindingNames());
    +		boolean useColumnScan = false;
    +		boolean isCrossProd = false;
    +		boolean containsConstantConstraints = false;
    +		BindingSet constants = getConstantConstraints();
    +		containsConstantConstraints = constants.size() > 0;
     
    -}
    +		try {
    +			for (BindingSet bs : bindingset) {
    +				if (bindingset.size() == 1 && bs.size() == 0) {
    +					// in this case, only single, empty bindingset, pcj node is
    +					// first node in query plan - use full Range scan with
    +					// column
    +					// family set
    +					useColumnScan = true;
    +				}
    +				// get common vars for PCJ - only use variables associated
    +				// with assured Bindings
    +				QueryBindingSet commonVars = new QueryBindingSet();
    +				for (String b : getTupleExpr().getAssuredBindingNames()) {
    +					Binding v = bs.getBinding(b);
    +					if (v != null) {
    +						commonVars.addBinding(v);
    +					}
    +				}
    +				// no common vars implies cross product
    +				if (commonVars.size() == 0 && bs.size() != 0) {
    +					crossProductBs.add(bs);
    +					isCrossProd = true;
    +				}
    +				//get a varOrder from orders in PCJ table - use at least
    +				//one common variable
    +				BindingSetVariableOrder varOrder = getVarOrder(
    +						commonVars.getBindingNames(),
    +						constants.getBindingNames());
    +
    +				// update constant constraints not used in varOrder and
    +				// update Bindings used to form range by removing unused
    +				// variables
    +				commonVars.addAll(constants);
    +				if (commonVars.size() > varOrder.varOrderLen) {
    +					Map<String, Value> valMap = getConstantValueMap();
    +					for (String s : new HashSet<String>(varOrder.unusedVars)) {
    +						if (valMap.containsKey(s)
    +								&& !constantConstraints.containsKey(s)) {
    +							constantConstraints.put(s, valMap.get(s));
    +						}
    +						commonVars.removeBinding(s);
    +					}
    +				}
    +
    +				if (containsConstantConstraints
    +						&& (useColumnScan || isCrossProd)) {
    +					// only one range required in event of a cross product or
    +					// empty BindingSet
    +					// Range will either be full table Range or determined by
    +					// constant constraints
    +					if (crossProductRange == EMPTY_RANGE) {
    +						crossProductRange = getRange(varOrder.varOrder,
    +								commonVars);
    +						localityGroupOrder = prefixToOrder(varOrder.varOrder);
    +					}
    +				} else if (!useColumnScan && !isCrossProd) {
    +					// update ranges and add BindingSet to HashJoinMap if not a
    +					// cross product
    +					hashJoinRanges.add(getRange(varOrder.varOrder, commonVars));
    +
    +					prefixLen = varOrder.varOrderLen;
    +					// check if common Variable Orders are changing between
    +					// BindingSets (happens in case
    +					// of Optional). If common variable set length changes from
    +					// BindingSet to BindingSet
    +					// update the HashJoinType to be VARIABLE_JOIN_VAR.
    +					if (oldPrefixLen == 0) {
    +						oldPrefixLen = prefixLen;
    +					} else {
    +						if (oldPrefixLen != prefixLen
    +								&& joinType == HashJoinType.CONSTANT_JOIN_VAR) {
    +							joinType = HashJoinType.VARIABLE_JOIN_VAR;
    +						}
    +						oldPrefixLen = prefixLen;
    +					}
    +					// update max prefix len
    +					if (prefixLen > maxPrefixLen) {
    +						maxPrefixLen = prefixLen;
    +					}
    +					String key = getHashJoinKey(varOrder.varOrder, commonVars);
    +					bindingSetHashMap.put(key, bs);
    +				}
    +
    +				isCrossProd = false;
    +			}
    +
    +			// create full Range scan iterator and set column family if empty
    +			// collection or if cross product BindingSet exists and no hash join
    +			// BindingSets
    +			if ((useColumnScan || crossProductBs.size() > 0)
    +					&& bindingSetHashMap.size() == 0) {
    +				Scanner scanner = accCon.createScanner(tablename, auths);
    +				// cross product with no cross product constraints here
    +				scanner.setRange(crossProductRange);
    +				scanner.fetchColumnFamily(new Text(localityGroupOrder));
    +				return new PCJKeyToCrossProductBindingSetIterator(scanner,
    --- End diff --
    
    Dumb question, but how are we supporting named graphs (context) in PCJs?  It doesn't look like we are limiting to a specific context here.
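
For readers following the diff rather than the context question, the prefix-length bookkeeping in the new `evaluate` method can be isolated into a few lines. This is a sketch under assumptions, not the code from the PR: `JoinTypeSketch` and `classify` are hypothetical names, and the list of integers stands in for the `varOrderLen` value computed for each BindingSet. Only the `HashJoinType` constant names are taken from the diff.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the join-type check described in the diff comments.
final class JoinTypeSketch {

    enum HashJoinType { CONSTANT_JOIN_VAR, VARIABLE_JOIN_VAR }

    // Switches to VARIABLE_JOIN_VAR once two consecutive prefix lengths differ,
    // which the diff comments say happens when OPTIONAL leaves variables unbound.
    static HashJoinType classify(List<Integer> prefixLens) {
        HashJoinType joinType = HashJoinType.CONSTANT_JOIN_VAR;
        int oldPrefixLen = 0; // 0 means "no BindingSet seen yet", as in the diff
        for (int prefixLen : prefixLens) {
            if (oldPrefixLen != 0 && oldPrefixLen != prefixLen) {
                joinType = HashJoinType.VARIABLE_JOIN_VAR;
            }
            oldPrefixLen = prefixLen;
        }
        return joinType;
    }

    public static void main(String[] args) {
        // Same number of common variables in every BindingSet.
        System.out.println(classify(Arrays.asList(2, 2, 2)));
        // One BindingSet binds fewer common variables than the others.
        System.out.println(classify(Arrays.asList(2, 1, 2)));
    }
}
```

The point of the check is that a single fixed-length hash key only works when every incoming BindingSet shares the same number of common variables with the PCJ; once that number varies, the join has to account for keys of different lengths.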

