Posted to mapreduce-issues@hadoop.apache.org by "Ravi Prakash (JIRA)" <ji...@apache.org> on 2012/09/07 21:50:07 UTC

[jira] [Updated] (MAPREDUCE-4645) Providing a random seed to Slive should make the sequence of filenames completely deterministic

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Prakash updated MAPREDUCE-4645:
------------------------------------

    Attachment: MAPREDUCE-4645.branch-0.23.patch

This patch changes the dummy key for the SliveMapper to a "splitID" and seeds the Random number generator with that splitID + the user-specified seed. The PathFinder, which generates the paths, is also given its own separate instance of Random. As a result, if you run the same Slive command twice, all ops will succeed the first time and fail the second time (because the files would already have been created / deleted the first time).
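
The seeding idea described above can be illustrated with a minimal, standalone sketch. It is not the code from the attached patch: the class SeedSketch, the methods splitSeed and paths, and the path layout are hypothetical, and giving the path generator the same derived seed as the op generator is an assumption. It only shows that a per-split seed plus a dedicated Random for path generation yields the same filenames on every run, no matter how many draws the operations themselves make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; these names do not come from the Slive source.
final class SeedSketch {

    // Derive a per-split seed from the user-specified seed and the split ID.
    static long splitSeed(long userSeed, int splitId) {
        return userSeed + splitId;
    }

    // Generate `count` paths for one split. `opDrawsPerPath` simulates how many
    // random numbers the operations consume between two path requests.
    static List<String> paths(long userSeed, int splitId, int count, int opDrawsPerPath) {
        long seed = splitSeed(userSeed, splitId);
        Random opRng = new Random(seed);   // used by the operations
        Random pathRng = new Random(seed); // separate instance, used only for paths (assumed seed)
        List<String> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            for (int d = 0; d < opDrawsPerPath; d++) {
                opRng.nextInt(); // op-side draws never touch pathRng
            }
            out.add("/slive/dir_" + pathRng.nextInt(4) + "/file_" + pathRng.nextInt(1000));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> first = paths(42L, 0, 5, 0);
        List<String> second = paths(42L, 0, 5, 3);
        System.out.println(first);
        System.out.println("identical across runs: " + first.equals(second));
    }
}
```

With a single shared Random, any change in the number of op-side draws would shift every later filename. Keeping the path generator on its own instance removes that coupling.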
                
> Providing a random seed to Slive should make the sequence of filenames completely deterministic
> -----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4645
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4645
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: performance, test
>    Affects Versions: 0.23.1, 2.0.0-alpha
>            Reporter: Ravi Prakash
>            Assignee: Ravi Prakash
>              Labels: performance, test
>         Attachments: MAPREDUCE-4645.branch-0.23.patch
>
>
> Using the -random seed option still doesn't produce a deterministic sequence of filenames, so there's no way to replicate the performance test. If I'm providing a seed, it's obvious that I want the test to be reproducible.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira