Posted to issues@flink.apache.org by "Ufuk Celebi (JIRA)" <ji...@apache.org> on 2015/08/25 16:56:45 UTC

[jira] [Resolved] (FLINK-2542) It should be documented that it is required from a join key to override hashCode(), when it is not a POJO

     [ https://issues.apache.org/jira/browse/FLINK-2542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ufuk Celebi resolved FLINK-2542.
--------------------------------
    Resolution: Won't Fix

I think it's OK to assume that people follow the general Object contract, as stated in the Javadoc of Object.equals():
{code}
Note that it is generally necessary to override the {@code hashCode} method whenever this method is overridden, so as to maintain the general contract for the {@code hashCode} method, which states that equal objects must have equal hash codes.
{code}
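
A minimal sketch of what that contract means for a key type, in case it helps. It uses java.util.HashMap as a stand-in for any hash-based lookup and is not Flink's join implementation. The class names JoinKeyContract, Id and BrokenId are hypothetical; Id is loosely modeled on the ID class from the report.

```java
import java.util.HashMap;
import java.util.Map;

public class JoinKeyContract {

    // Hypothetical key type: equals() and hashCode() are both derived from foo.
    public static final class Id {
        public final long foo;

        public Id(long foo) {
            this.foo = foo;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Id && ((Id) other).foo == foo;
        }

        @Override
        public int hashCode() {
            // Equal foo values always produce equal hash codes.
            return Long.hashCode(foo);
        }
    }

    // Same key, but hashCode() is left at the identity-based Object default,
    // so two equal instances are not guaranteed to hash alike.
    public static final class BrokenId {
        public final long foo;

        public BrokenId(long foo) {
            this.foo = foo;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof BrokenId && ((BrokenId) other).foo == foo;
        }
    }

    public static void main(String[] args) {
        Map<Id, Long> good = new HashMap<>();
        good.put(new Id(123L), 4L);
        // A second, equal instance finds the entry.
        System.out.println("Id lookup: " + good.get(new Id(123L)));

        Map<BrokenId, Long> broken = new HashMap<>();
        broken.put(new BrokenId(123L), 4L);
        // Almost always null: the two instances get unrelated identity hash codes.
        System.out.println("BrokenId lookup: " + broken.get(new BrokenId(123L)));
    }
}
```

The second lookup misses even though equals() returns true for the two keys, which is the same kind of silent mismatch the report describes for hash-based joins.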

If more people run into this, we can revisit this issue.

> It should be documented that it is required from a join key to override hashCode(), when it is not a POJO
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-2542
>                 URL: https://issues.apache.org/jira/browse/FLINK-2542
>             Project: Flink
>          Issue Type: Bug
>          Components: Gelly, Java API
>            Reporter: Gabor Gevay
>            Priority: Minor
>             Fix For: 0.10, 0.9.1
>
>
> If the join key is not a POJO, and does not override hashCode, then the join silently fails (produces empty output). I don't see this documented anywhere.
> The Gelly documentation should also state this separately, because Gelly performs joins internally on the vertex IDs, and users might not know this or might not consult the join documentation when using Gelly.
> Here is some example code:
> {noformat}
> public static class ID implements Comparable<ID> {
> 	public long foo;
> 	//no default ctor --> not a POJO
> 	public ID(long foo) {
> 		this.foo = foo;
> 	}
> 	@Override
> 	public int compareTo(ID o) {
> 		return ((Long)foo).compareTo(o.foo);
> 	}
> 	@Override
> 	public boolean equals(Object o0) {
> 		if(o0 instanceof ID) {
> 			ID o = (ID)o0;
> 			return foo == o.foo;
> 		} else {
> 			return false;
> 		}
> 	}
> 	@Override
> 	public int hashCode() {
> 		return 42;
> 	}
> }
> public static void main(String[] args) throws Exception {
> 	ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
> 	DataSet<Tuple2<ID, Long>> inDegrees = env.fromElements(Tuple2.of(new ID(123L), 4L));
> 	DataSet<Tuple2<ID, Long>> outDegrees = env.fromElements(Tuple2.of(new ID(123L), 5L));
> 	DataSet<Tuple3<ID, Long, Long>> degrees = inDegrees.join(outDegrees, JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST).where(0).equalTo(0)
> 			.with(new FlatJoinFunction<Tuple2<ID, Long>, Tuple2<ID, Long>, Tuple3<ID, Long, Long>>() {
> 				@Override
> 				public void join(Tuple2<ID, Long> first, Tuple2<ID, Long> second, Collector<Tuple3<ID, Long, Long>> out) {
> 					out.collect(new Tuple3<ID, Long, Long>(first.f0, first.f1, second.f1));
> 				}
> 			}).withForwardedFieldsFirst("f0;f1").withForwardedFieldsSecond("f1");
> 	System.out.println("degrees count: " + degrees.count());
> }
> {noformat}
> This prints 1, but if I comment out the hashCode() override, it prints 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)