
Posted to mapreduce-issues@hadoop.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2014/07/23 20:35:38 UTC

[jira] [Updated] (MAPREDUCE-6000) native-task: Simplify ByteBufferDataReader/Writer

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated MAPREDUCE-6000:
-----------------------------------

    Attachment: mapreduce-6000.txt

Attached patch removes about 250 lines of code and the tests still pass.

Reader changes:
- throw UnsupportedOperationException from Reader.readLine() (it has no use cases)
- implement readUTF() using Java's built-in decoding support instead of a manual implementation (performance should be similar; I'm not sure why the decoding was done explicitly here)
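
To illustrate the Reader changes above, here is a minimal sketch, not the code in the attached patch. SketchBufferReader and its javaReader field are hypothetical names. The sketch assumes the built-in decoding is reached by wrapping the reader in a java.io.DataInputStream, whose readUTF() handles the length prefix and the modified UTF-8 decoding.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch only; not the ByteBufferDataReader from the patch. */
class SketchBufferReader extends InputStream {
  private final ByteBuffer buffer;
  // The JDK's DataInputStream provides readUTF() on top of our read() methods.
  private final DataInputStream javaReader;

  SketchBufferReader(ByteBuffer buffer) {
    this.buffer = buffer;
    this.javaReader = new DataInputStream(this);
  }

  @Override
  public int read() {
    // Next byte as an unsigned value, or -1 once the buffer is exhausted.
    return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
  }

  @Override
  public int read(byte[] dst, int off, int len) {
    if (len == 0) {
      return 0;
    }
    if (!buffer.hasRemaining()) {
      return -1;
    }
    int n = Math.min(len, buffer.remaining());
    buffer.get(dst, off, n);
    return n;
  }

  /** Delegates decoding to the JDK instead of decoding by hand. */
  String readUTF() throws IOException {
    return javaReader.readUTF();
  }

  /** No known callers, so fail loudly instead of carrying a complex implementation. */
  String readLine() {
    throw new UnsupportedOperationException("readLine is not supported");
  }
}
```

Delegating this way trades a little indirection for not having to maintain a hand-written decoder.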

Writer changes:
- add better javadoc
- move constants to top of class (per hadoop style)
- assume that 'handler' is always non-null (in practice, it is). Added a precondition check for it.
- remove the implementation of writeBytes(String), since it is a bizarre artifact of the Java DataOutput interface (it writes only the low-order byte of each character and discards the high-order byte). Unlikely to be useful.
- remove the implementation of writeChars(String), since it's unused and, again, users should probably be using the UTF-8 methods
- implement writeUTF() using Java support for encoding, instead of manual encoding
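
The Writer changes above could look roughly like the following sketch. It is not the patch itself: SketchBufferWriter, FullBufferHandler, and jdkWriteBytes are hypothetical names, Objects.requireNonNull stands in for whatever precondition check the patch uses, and throwing UnsupportedOperationException from the removed methods is an assumption. The buffer is assumed to have non-zero capacity.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.Objects;

/** Hypothetical sketch only; not the ByteBufferDataWriter from the patch. */
class SketchBufferWriter extends OutputStream {
  /** Hypothetical stand-in for the 'handler' that receives buffered data. */
  interface FullBufferHandler {
    void consume(ByteBuffer data);
  }

  private final ByteBuffer buffer;
  private final FullBufferHandler handler;
  // The JDK's DataOutputStream provides writeUTF() on top of our write(int).
  private final DataOutputStream javaWriter;

  SketchBufferWriter(ByteBuffer buffer, FullBufferHandler handler) {
    this.buffer = Objects.requireNonNull(buffer, "buffer");
    // Precondition: the handler is assumed to always be non-null.
    this.handler = Objects.requireNonNull(handler, "handler");
    this.javaWriter = new DataOutputStream(this);
  }

  @Override
  public void write(int b) {
    if (!buffer.hasRemaining()) {
      flush();
    }
    buffer.put((byte) b);
  }

  @Override
  public void flush() {
    // Hand the buffered bytes to the handler, then reuse the buffer.
    buffer.flip();
    handler.consume(buffer);
    buffer.clear();
  }

  /** Delegates encoding to the JDK instead of encoding by hand. */
  void writeUTF(String s) throws IOException {
    javaWriter.writeUTF(s);
  }

  void writeBytes(String s) {
    throw new UnsupportedOperationException("writeBytes is not supported");
  }

  void writeChars(String s) {
    throw new UnsupportedOperationException("writeChars is not supported");
  }

  /** Shows the JDK's writeBytes behavior: one byte per char, high byte dropped. */
  static byte[] jdkWriteBytes(String s) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    new DataOutputStream(out).writeBytes(s);
    return out.toByteArray();
  }
}
```

For example, jdkWriteBytes("\u20ac") yields the single byte 0xAC, so the euro sign cannot be recovered from the output. That lossy behavior is why writeBytes(String) is a poor fit for string data.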

Tests:
- use Mockito spy instead of the new 'Counter' and 'Flag' classes.

> native-task: Simplify ByteBufferDataReader/Writer
> -------------------------------------------------
>
>                 Key: MAPREDUCE-6000
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6000
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: task
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>         Attachments: mapreduce-6000.txt
>
>
> The ByteBufferDataReader and ByteBufferDataWriter classes are more complex than necessary:
> - several methods related to reading/writing strings and char arrays are implemented but never used by the native task code. Given that the use case for these classes is limited to serializing binary data to/from the native code, it seems unlikely people will want to use these methods in any performance-critical space. So, let's do simpler implementations that are less likely to be buggy, even if they're slightly less performant.
> - methods like readLine() are even less likely to be used. Since it's a complex implementation, let's just throw UnsupportedOperationException.
> - in the test case, we can use Mockito to reduce the amount of new code



--
This message was sent by Atlassian JIRA
(v6.2#6252)