Posted to dev@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2017/05/17 05:39:04 UTC

[jira] [Created] (TEZ-3732) Reduce Object size of InputAttemptIdentifier and MapOutput for large jobs

Jonathan Eagles created TEZ-3732:
------------------------------------

             Summary: Reduce Object size of InputAttemptIdentifier and MapOutput for large jobs
                 Key: TEZ-3732
                 URL: https://issues.apache.org/jira/browse/TEZ-3732
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jonathan Eagles
            Assignee: Jonathan Eagles


Objects in 64-bit Java take 12 bytes plus the size of their members, rounded up to a multiple of 8 bytes.

InputAttemptIdentifier -> 33 bytes, which gets aligned up to 40 bytes
This class is just one byte over the 32-byte boundary, so reducing the object size by one byte saves 8 bytes per object.
That is ~8 MB saved for 1,000,000 inputs and ~80 MB saved for tasks with 10,000,000 inputs to fetch (yes, this is a real job).
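
The arithmetic above can be checked with a minimal sketch. It assumes only the figures quoted in this message (8-byte alignment, 33 bytes today, 32 bytes after trimming one byte); the class and method names are hypothetical and are not part of Tez.

```java
// Hypothetical helper, not part of Tez: reproduces the alignment arithmetic above.
class ObjectSizeSketch {

    // Round size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Bytes saved across objectCount objects when the unaligned size
    // shrinks from sizeBefore to sizeAfter.
    static long totalSavings(long sizeBefore, long sizeAfter, long objectCount) {
        return (alignUp(sizeBefore, 8) - alignUp(sizeAfter, 8)) * objectCount;
    }

    public static void main(String[] args) {
        System.out.println("33 bytes aligned: " + alignUp(33, 8));
        System.out.println("32 bytes aligned: " + alignUp(32, 8));
        System.out.println("saved for 1,000,000 inputs: "
                + totalSavings(33, 32, 1_000_000L) + " bytes");
        System.out.println("saved for 10,000,000 inputs: "
                + totalSavings(33, 32, 10_000_000L) + " bytes");
    }
}
```

To measure real layouts rather than estimate them, a tool such as OpenJDK's JOL can report the actual object size on a given JVM.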

MapOutput -> 45 bytes, which gets aligned up to 48 bytes
This class can be split into sub-classes so that each sub-class no longer pays the object size cost of fields used only by the other sub-classes:
Wait, InMemory and DiskDirect -> 32 bytes
Disk -> 40 bytes
The total savings are harder to account for, but they are larger than in the case above.
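
The sub-classing idea can be sketched as follows. This is only an illustration of the pattern, not the actual Tez change: every class and field name below is hypothetical, and the field sets are chosen for the example rather than taken from MapOutput.

```java
// Hypothetical sketch: none of these names or fields come from the Tez source.
// The base class keeps only the state that every kind of output needs.
abstract class OutputSketch {
    private final int id;
    private final long size;

    OutputSketch(int id, long size) {
        this.id = id;
        this.size = size;
    }

    int getId() { return id; }
    long getSize() { return size; }
}

// In-memory variant: carries a buffer reference, but no file fields.
final class InMemoryOutputSketch extends OutputSketch {
    private final byte[] buffer;

    InMemoryOutputSketch(int id, byte[] buffer) {
        super(id, buffer.length);
        this.buffer = buffer;
    }

    byte[] getBuffer() { return buffer; }
}

// On-disk variant: carries file location fields, but no buffer.
final class DiskOutputSketch extends OutputSketch {
    private final String path;
    private final long offset;

    DiskOutputSketch(int id, long size, String path, long offset) {
        super(id, size);
        this.path = path;
        this.offset = offset;
    }

    String getPath() { return path; }
    long getOffset() { return offset; }
}
```

With a single class, every instance carries the fields of all variants; with the split, each instance carries only the fields its own variant uses, which is where the per-object savings come from.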




