Posted to issues@flink.apache.org by "Gabor Gevay (JIRA)" <ji...@apache.org> on 2015/08/18 13:27:45 UTC

[jira] [Created] (FLINK-2542) It should be documented that it is required from a join key to override hashCode(), when it is not a POJO

Gabor Gevay created FLINK-2542:
----------------------------------

             Summary: It should be documented that it is required from a join key to override hashCode(), when it is not a POJO
                 Key: FLINK-2542
                 URL: https://issues.apache.org/jira/browse/FLINK-2542
             Project: Flink
          Issue Type: Bug
          Components: Gelly, Java API
            Reporter: Gabor Gevay
            Priority: Minor
             Fix For: 0.10, 0.9.1


If the join key is not a POJO and does not override hashCode(), the join silently fails: it produces empty output instead of reporting an error. I don't see this documented anywhere.

The Gelly documentation should also have this info separately, because it does joins internally on the vertex IDs, but the user might not know this, or might not look at the join documentation when using Gelly.

Here is example code that reproduces this:

{noformat}
public static class ID implements Comparable<ID> {
	public long foo;

	// No default constructor, so this is not a POJO

	public ID(long foo) {
		this.foo = foo;
	}

	@Override
	public int compareTo(ID o) {
		return Long.compare(foo, o.foo);
	}

	@Override
	public boolean equals(Object o0) {
		if(o0 instanceof ID) {
			ID o = (ID)o0;
			return foo == o.foo;
		} else {
			return false;
		}
	}

	@Override
	public int hashCode() {
		return 42;
	}
}


public static void main(String[] args) throws Exception {
	ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

	DataSet<Tuple2<ID, Long>> inDegrees = env.fromElements(Tuple2.of(new ID(123L), 4L));
	DataSet<Tuple2<ID, Long>> outDegrees = env.fromElements(Tuple2.of(new ID(123L), 5L));

	DataSet<Tuple3<ID, Long, Long>> degrees = inDegrees.join(outDegrees, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0).equalTo(0)
			.with(new FlatJoinFunction<Tuple2<ID, Long>, Tuple2<ID, Long>, Tuple3<ID, Long, Long>>() {
				@Override
				public void join(Tuple2<ID, Long> first, Tuple2<ID, Long> second, Collector<Tuple3<ID, Long, Long>> out) {
					out.collect(new Tuple3<ID, Long, Long>(first.f0, first.f1, second.f1));
				}
			}).withForwardedFieldsFirst("f0;f1").withForwardedFieldsSecond("f1->f2");

	System.out.println("degrees count: " + degrees.count());
}
{noformat}


This prints "degrees count: 1", but if I comment out the hashCode() override, it prints "degrees count: 0".
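
To illustrate the likely mechanism without Flink, here is a minimal plain-Java sketch. It assumes that the hash-based join ends up calling hashCode() on the key objects, and it uses a java.util.HashMap as a stand-in for the join's hash table; that stand-in is my simplification, not Flink's actual implementation. The class names HashKeySketch, NoHashId and HashedId and the helper probeMatches are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class HashKeySketch {

	// Key that overrides equals() but not hashCode(), so it inherits
	// Object.hashCode(), which is identity-based.
	static final class NoHashId {
		final long foo;

		NoHashId(long foo) {
			this.foo = foo;
		}

		@Override
		public boolean equals(Object o) {
			return o instanceof NoHashId && ((NoHashId) o).foo == foo;
		}
	}

	// Same key, but hashCode() is derived from the field that equals() compares.
	static final class HashedId {
		final long foo;

		HashedId(long foo) {
			this.foo = foo;
		}

		@Override
		public boolean equals(Object o) {
			return o instanceof HashedId && ((HashedId) o).foo == foo;
		}

		@Override
		public int hashCode() {
			return Long.hashCode(foo);
		}
	}

	// Puts buildKey into a one-entry hash table and probes it with probeKey.
	// The HashMap stands in for a join hash table (an assumption of this sketch).
	static boolean probeMatches(Object buildKey, Object probeKey) {
		Map<Object, Long> table = new HashMap<>();
		table.put(buildKey, 4L);
		return table.containsKey(probeKey);
	}

	public static void main(String[] args) {
		// Two distinct but equal instances, as on the two sides of the join.
		System.out.println("without hashCode(): "
				+ probeMatches(new NoHashId(123L), new NoHashId(123L)));
		System.out.println("with hashCode():    "
				+ probeMatches(new HashedId(123L), new HashedId(123L)));
	}
}
```

With the field-based hashCode() the probe always finds the entry. Without it, the two equal instances will almost always get different identity hash codes, so the lookup misses even though equals() would return true. The same reasoning suggests that a hashCode() derived from the compared fields is a better choice than the constant 42 in the ID class above, which is only there to keep the reproduction short.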


