Posted to mapreduce-issues@hadoop.apache.org by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org> on 2009/10/06 11:48:31 UTC
[jira] Created: (MAPREDUCE-1066) Add a unit test to test all the
apis in mapreduce.lib.join
Add a unit test to test all the apis in mapreduce.lib.join
------------------------------------------------------------
Key: MAPREDUCE-1066
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1066
Project: Hadoop Map/Reduce
Issue Type: Test
Components: test
Affects Versions: 0.21.0
Reporter: Amareshwari Sriramadasu
Fix For: 0.21.0
Add a unit test to test all the api/features in mapreduce.lib.join
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (MAPREDUCE-1066) Add a unit test to test all the
apis in mapreduce.lib.join
Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762571#action_12762571 ]
Amareshwari Sriramadasu commented on MAPREDUCE-1066:
----------------------------------------------------
Chris Douglas suggested the following unit test for join.
Given three sorted, equally partitioned datasets { A, B, C }:
* For each source, let every value be a single prime p that is unique to that source.
* Define an operator "count" derived from MultiFilterRecordReader.
* The correct output of count(A,B,C) should be [ p0^i, p1^j, p2^k ], where p0, p1, p2 are the primes for A, B, C and i, j, k are the number of values in each source iterator. The output for a given position in the tuple is the product of all the values it receives (or 1 if that source does not contain the key).
e.g. Given count(A,B,C) where every record in A, B, and C has the value
2, 3, and 5, respectively, and the sources contain the following keys:
A = k0, k0, k0, k1, k2
B = k0, k1
C = k0, k0, k2, k2
The output would be 600 = (2^3 * 3^1 * 5^2), 6 = (2^1 * 3^1 * 5^0),
and 50 = (2^1 * 3^0 * 5^2) from the following trace:
(k0, [ 2, 3, 5 ])
(k0, [ 2, 3, 25 ])
(k0, [ 4, 3, 5 ])
(k0, [ 4, 3, 25 ])
(k0, [ 8, 3, 5 ])
(k0, [ 8, 3, 25 ])
(k1, [ 2, 3, 1 ])
(k2, [ 2, 1, 5 ])
(k2, [ 2, 1, 25 ])
Run a job with an identity map and a combiner/reducer that computes the product of the values. Verify that its output matches the output of count(A, B, C).
Alternatively, add a unary operator mult(A) that computes the product of its values for a given key and verify that it matches outer(mult(A), mult(B), mult(C)).
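The expected values in the example above can be checked without a cluster. What follows is only a minimal sketch in plain Java, not the test proposed for this issue: the class ExpectedJoinProducts and its methods are hypothetical names, it does not use MultiFilterRecordReader or any other Hadoop class, and it assumes each source is given as an in-memory list of keys. It computes, for each key, the tuple of per-source products (1 where a source lacks the key) and the overall product.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hadoop: computes the per-key tuple of
// products that a "count"-style join over prime-valued sources should yield.
class ExpectedJoinProducts {

    // sources.get(s) holds the keys of source s; every record in source s
    // is assumed to carry the value primes[s].
    static Map<String, long[]> expected(List<List<String>> sources, long[] primes) {
        Map<String, long[]> out = new TreeMap<>();
        for (int s = 0; s < sources.size(); s++) {
            for (String key : sources.get(s)) {
                long[] tuple = out.computeIfAbsent(key, k -> {
                    long[] t = new long[primes.length];
                    Arrays.fill(t, 1L); // 1 means "this source has no such key"
                    return t;
                });
                tuple[s] *= primes[s]; // one factor per occurrence of the key
            }
        }
        return out;
    }

    // Product of all positions in a tuple, i.e. p0^i * p1^j * p2^k.
    static long product(long[] tuple) {
        long p = 1L;
        for (long v : tuple) {
            p *= v;
        }
        return p;
    }

    public static void main(String[] args) {
        List<List<String>> sources = List.of(
            List.of("k0", "k0", "k0", "k1", "k2"), // A, value 2
            List.of("k0", "k1"),                   // B, value 3
            List.of("k0", "k0", "k2", "k2"));      // C, value 5
        Map<String, long[]> result = expected(sources, new long[] {2, 3, 5});
        for (Map.Entry<String, long[]> e : result.entrySet()) {
            System.out.println(e.getKey() + " " + Arrays.toString(e.getValue())
                + " = " + product(e.getValue()));
        }
        // prints:
        // k0 [8, 3, 25] = 600
        // k1 [2, 3, 1] = 6
        // k2 [2, 1, 25] = 50
    }
}
```

A real test would compare these reference values against the output of a join job; because every source uses a distinct prime, the exponents i, j, k can be recovered from the product alone.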