You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by "Fabian Hueske (JIRA)" <ji...@apache.org> on 2015/02/04 16:04:35 UTC

[jira] [Created] (FLINK-1474) Cross produces wrong results for non-tiny inputs

Fabian Hueske created FLINK-1474:
------------------------------------

             Summary: Cross produces wrong results for non-tiny inputs
                 Key: FLINK-1474
                 URL: https://issues.apache.org/jira/browse/FLINK-1474
             Project: Flink
          Issue Type: Bug
          Components: Local Runtime
    Affects Versions: 0.8
            Reporter: Fabian Hueske
            Assignee: Fabian Hueske
            Priority: Critical
             Fix For: 0.9, 0.8.1


The cross operator produces wrong results for larger inputs.
It appears that the number of emitted records is correct, but the values are not.
There seems to be some cut of at 128 elements.

To reproduce add the following test to {CrossITCase} and execute it:

{code}
	@Test
	public void testCorrectnessOfCrossOfLargeInputs() throws Exception {
		/*
		 * check correctness of cross with larger inputs
		 */

		final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

		DataSet<Long> ds = env.generateSequence(0, 1000);
		DataSet<Long> ds2 = env.generateSequence(0, 1000);
		DataSet<Tuple> crossDs = ds.cross(ds2)
				.filter(new FilterFunction<Tuple2<Long, Long>>() {
					@Override
					public boolean filter(Tuple2<Long, Long> value) throws Exception {
						return value.f0 == value.f1;
					}
				})
				.project(0).sum(0);

		crossDs.writeAsCsv(resultPath);
		env.execute();

		expected = "(500500)\n"; // 500500 = 0+1+2+...+999+1000
	}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)