Posted to dev@arrow.apache.org by "Anthony Abate (Jira)" <ji...@apache.org> on 2020/06/04 15:45:00 UTC

[jira] [Created] (ARROW-9035) 8 vs 64 byte alignment

Anthony Abate created ARROW-9035:
------------------------------------

             Summary: 8 vs 64 byte alignment
                 Key: ARROW-9035
                 URL: https://issues.apache.org/jira/browse/ARROW-9035
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Documentation
    Affects Versions: 0.17.0
            Reporter: Anthony Abate


I used the C++ library to create a very small Arrow file (one field of five int32 values) and was surprised that the buffers are not aligned to 64 bytes, as described in the documentation section "Buffer Alignment and Padding". Based on the examples there, the 20 bytes of int32 data should be padded to 64 bytes, but the body is only 24 bytes (see below).

Extracted message metadata (note bodyLength = 24):
{code:java}
{
  version: "V4",
  header_type: "RecordBatch",
  header: {
    nodes: [
      {
        length: 5,
        null_count: 0
      }
    ],
    buffers: [
      {
        offset: 0,
        length: 0
      },
      {
        offset: 0,
        length: 20
      }
    ]
  },
  bodyLength: 24
}
{code}
Further down, the documentation section "Encapsulated message format" says that serialization should use 8-byte alignment.
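
For reference, both figures follow from ordinary round-up arithmetic. The sketch below is only an illustration: `PaddedLength` is a hypothetical helper written for this report, not a function from the Arrow C++ API, and it assumes the alignment is a power of two. It shows that 20 bytes pads to 24 under 8-byte alignment (matching the observed bodyLength) and to 64 under 64-byte alignment (matching the "Buffer Alignment and Padding" examples).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Arrow): round nbytes up to the next
// multiple of alignment. Assumes alignment is a positive power of two.
inline int64_t PaddedLength(int64_t nbytes, int64_t alignment) {
  assert(alignment > 0 && (alignment & (alignment - 1)) == 0);
  return (nbytes + alignment - 1) & ~(alignment - 1);
}

// Size in bytes of the data buffer from the report: five int32 values.
inline int64_t ExampleDataBytes() {
  return 5 * static_cast<int64_t>(sizeof(int32_t));  // 20 bytes
}
```

Calling `PaddedLength(ExampleDataBytes(), 8)` gives 24 and `PaddedLength(ExampleDataBytes(), 64)` gives 64, so the file appears to have been written with 8-byte padding.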

These two statements seem to be at odds with each other, and some clarification is needed.

Is the documentation wrong?

Or should 8-byte alignment be used for File and 64-byte alignment for IPC?


